
Problem with cdm schema and data import


(Vlasios Dimitriadis) #1

Hello everybody,

I am not sure whether my previous posts were made in the proper category (Implementers). Since I have been trying to set up the OHDSI software stack and have come to a dead end, I am making this new post in the current (Developers) section. I hope it will not be annoying or considered spam; I apologise if it is.

My previous posts (for reference) are:

and:

In short, what I have done and tried so far:

I have installed Broadsea and configured it (using https://github.com/OHDSI/Broadsea). Everything except the data import/loading into the database seems to be working fine.

As far as data import is concerned, I have tried with data that I produced myself with the Synthea scripts from:

as well as with data that I have downloaded from:

https://storage.googleapis.com/synthea-public/synthea_sample_data_csv_jan2020.zip

which I found at:

https://synthea.mitre.org/downloads

Finally, I have tried the import with data that @Ajit_Londhe produced using synthea scripts (as he tried to help me and I am really thankful for that).

In each case I have tried two ways of loading the data into the database, after creating the empty schema (the same schema for CDM and vocabulary): bulk load and the ETLSyntheaBuilder R functions.

Unfortunately all datasets and all methods have failed.

The errors differ, though, between bulk load and the ETLSyntheaBuilder R functions.

More specifically, with the R functions from ETL-Synthea, data are imported into the tables that end up empty with the bulk_load script (e.g. patients), but other errors occur, all of which appear to have the following “cause”:

Caused by: org.postgresql.util.PSQLException: ERROR: column "organization" of relation "encounters" does not exist

Could it be that I am using incompatible versions? I am using vocabulary v5.x downloaded from Athena, and the SQL filenames in the ETL script subdirectories say v5.3.1. I am not sure whether the data CSV files (produced or downloaded) follow a different version, since the errors occurring during the import with bulk_load complain about extra columns that do not exist in some tables.
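To make the suspected mismatch concrete, here is a minimal sketch of how a CSV header could be compared against the columns a target table expects. Only the "organization" column and the "encounters" table come from the error above; the function name, the other column names, and the sample row are hypothetical and only for illustration.

```python
import csv
import io


def diff_columns(csv_text, table_columns):
    """Compare a CSV header with a table's column list (case-insensitive).

    Returns (extra, missing): columns present only in the CSV, and columns
    present only in the table.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip().lower() for name in next(reader)]
    expected = [name.lower() for name in table_columns]
    extra = [c for c in header if c not in expected]
    missing = [c for c in expected if c not in header]
    return extra, missing


# Hypothetical encounters.csv content and hypothetical table definition.
sample = (
    "Id,START,STOP,PATIENT,ORGANIZATION,PROVIDER\n"
    "1,2020-01-01,2020-01-02,p1,o1,pr1\n"
)
table = ["id", "start", "stop", "patient", "provider"]

extra, missing = diff_columns(sample, table)
print("only in CSV:", extra)
print("only in table:", missing)
```

Running a check like this for each file would show whether the CSV files and the table definitions come from different versions.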

Do you think I should try Common Data Model version 6, or cut the offending columns (the ones that raise errors during bulk_load execution) out of the data CSV files?
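For the second option, this is roughly what I have in mind: a small sketch, not something I have run against the real files, that keeps only the columns the target table has. The function name and the sample data are hypothetical; only the "organization" column comes from the error message.

```python
import csv
import io


def drop_columns(csv_text, keep):
    """Return the CSV text with only the columns listed in `keep`.

    Column names are matched case-insensitively; the original header
    spelling and column order are preserved.
    """
    keep_lower = {name.lower() for name in keep}
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [f for f in reader.fieldnames if f.lower() in keep_lower]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns we are dropping.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical two-row sample with one offending column.
sample_rows = "Id,ORGANIZATION,PATIENT\n1,o1,p1\n2,o2,p2\n"
print(drop_columns(sample_rows, ["id", "patient"]))
```

This would only hide the symptom, of course; if the files really follow a different version, other columns may differ in meaning as well.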

Is there something else maybe that anyone can think of that I could try?

Thank you in advance


(Pantelis Natsiavas) #2

Hi. I work with Vlasios, who posted above.

I wonder if there is a database export which could be used out of the box! Is there a database export for synpuf or synthea so that we could compare this with the database that we end up with using the ETL scripts?

