
Conversion of Andromeda Table to `data.frame` to `pandas.DataFrame`

Hi All!

At IKNL we are working on a Python wrapper for several OHDSI tools (DatabaseConnector, FeatureExtraction, etc.). Currently I am working on the FeatureExtraction module, specifically on the get_db_covariate_data function. When finished, this function should return a pandas DataFrame.

Converting an R data.frame to a pandas DataFrame is easy. However, I found that the getDbCovariateData function outputs an Andromeda table (an S4 class). I am nowhere near an R expert, but I was able to create a regular data.frame from it by simply calling data.frame(andromeda_obj). Is this the way? Or is there a better way to do this?

I found how to construct an Andromeda Table here: Using Andromeda • Andromeda (ohdsi.github.io) so conversion the other way around should be simple.

An Andromeda object is currently a SQLite database, so probably in Python you could access it through something like PySQLite?
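To make that idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It only shows how to inspect which tables and columns an SQLite database contains. The helper name `list_tables` is made up for illustration, and the demo table (`covariates` with `rowId`, `covariateId`, `covariateValue`) is an assumption about what a FeatureExtraction result might look like, not something read from a real Andromeda file.

```python
import sqlite3


def list_tables(con):
    """Return {table_name: [column names]} for an open SQLite connection."""
    names = [
        row[0]
        for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    tables = {}
    for name in names:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        info = con.execute(f'PRAGMA table_info("{name}")').fetchall()
        tables[name] = [col[1] for col in info]
    return tables


# Demo on an in-memory database standing in for the Andromeda file.
# With a real file you would call sqlite3.connect(path_to_sqlite_file) instead.
demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE covariates (rowId INTEGER, covariateId INTEGER, covariateValue REAL)"
)
demo_tables = list_tables(demo_con)
print(demo_tables)
```

How to get at the underlying SQLite file from the R session is not covered here, so treat the connection step as the part you would need to adapt.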

You could just convert an Andromeda table to an R data frame, and convert it to a pandas DataFrame from there. But the idea of Andromeda is that it can hold tables too large to fit in memory, so that approach would fail for large tables (and FeatureExtraction tables can be very large).
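One way around the memory problem on the Python side is to read the table in chunks rather than all at once. The following is a rough sketch under the assumption that the data is reachable as an SQLite database. The names `iter_chunks` and `chunk_to_dataframe` are hypothetical helpers, and the demo table and its columns are illustrative assumptions.

```python
import sqlite3


def iter_chunks(con, table, chunk_size=100_000):
    """Yield (columns, rows) pairs holding at most chunk_size rows each."""
    cur = con.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        yield columns, rows


def chunk_to_dataframe(columns, rows):
    """Turn one chunk into a pandas DataFrame (requires pandas to be installed)."""
    import pandas as pd  # third-party; imported lazily so the rest runs without it

    return pd.DataFrame.from_records(rows, columns=columns)


# Demo on an in-memory database standing in for a large covariates table.
demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE covariates (rowId INTEGER, covariateId INTEGER, covariateValue REAL)"
)
demo_con.executemany(
    "INSERT INTO covariates VALUES (?, ?, ?)",
    [(i, 1000 + i, 1.0) for i in range(5)],
)

# Only one chunk is held in memory at a time.
chunk_sizes = [len(rows) for _, rows in iter_chunks(demo_con, "covariates", chunk_size=2)]
print(chunk_sizes)
```

If pandas is available, `pandas.read_sql_query(sql, con, chunksize=...)` gives a similar iterator of DataFrames directly. Either way, each chunk can be processed or forwarded before the next one is loaded.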

Note that we’re exploring Arrow as an alternative backend for Andromeda, but that work is currently stalled due to this issue.

Thanks! Now the Andromeda object makes much more sense. The end goal for us is to send the covariates table through an API to another application. It might indeed be more appropriate to send the entire SQLite database rather than an unpacked pandas data frame.

Thanks again for the insight.
