Thank you for your effort in making this sample dataset. It is really helpful for someone like me who is new to OHDSI. I still have a few questions about setting up the cdm schema and wonder if you or someone else could help me out.
First, I executed the script OMOP CDM ddl - PostgreSQL.sql to create the tables and fields, which generated 37 tables under the cdm schema.
Next, I loaded the data into the schema using the 18 CSV files in the sample dataset.
Then I ran the script OMOP CDM constraints - PostgreSQL.sql to add the constraints. However, I encountered a lot of errors when running this script. For example, here is one of the errors:
ERROR: insert or update on table "person" violates foreign key constraint "fpk_person_gender_concept"
DETAIL: Key (gender_concept_id)=(8507) is not present in table "concept".
I think this is caused by the concept table, which is currently empty. I also found that the link mentioned above has an extra step to load the CDM vocabulary, which seems to populate the concept table.
So, my question is: where can I find the CDM vocabulary CSV files?
I think that's a script bug: gender_concept_id is a column in the "person" table, not the "concept" table. Could you post here the actual statement that creates the foreign key constraint?
OK, I'm sorry, I misread the error: it's explaining that the value 8507 in the gender_concept_id column of "person" is not found in the table "concept". That's actually a really good message, even though I misread it.
It means that your vocabulary tables aren't populated. You need to load them (via a download from Athena) before applying those key constraints. Then you'll have all the necessary concepts in the "concept" table, and the foreign keys should work.
Were there instructions in the SynPUF guide on how to load the vocabulary tables?
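For anyone who wants to see which concept IDs the foreign keys will trip over before applying the constraints, here is a rough sketch of an "orphan" check. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the two tables are cut down to the columns that matter here, so `build_demo_db` and `find_missing_concepts` are illustrative names, not part of any OHDSI script. The same LEFT JOIN should carry over to a real CDM schema.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in for the CDM: a loaded person table, an empty concept table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, gender_concept_id INTEGER)"
    )
    # Hypothetical sample rows; 8507 is the value from the error message above.
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)", [(1, 8507), (2, 8532), (3, 8507)]
    )
    return conn


def find_missing_concepts(conn):
    """Return gender_concept_id values in person that have no matching row in concept."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.gender_concept_id
        FROM person AS p
        LEFT JOIN concept AS c ON c.concept_id = p.gender_concept_id
        WHERE c.concept_id IS NULL
        ORDER BY p.gender_concept_id
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
# With an empty concept table, every referenced ID is reported as missing.
print(find_missing_concepts(conn))

# After "loading the vocabulary" (here a single row), 8507 is no longer missing.
conn.execute("INSERT INTO concept VALUES (8507, 'MALE')")
print(find_missing_concepts(conn))
```

If the query returns nothing for a given column, the corresponding foreign key should be safe to add.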
@Chris_Knoll I couldn't find direct instructions for loading the vocabulary tables. But I found a script to load the vocabulary tables at this link: GitHub - OHDSI/CommonDataModel at v5.2.2, under PostgreSQL/VocabImport.
It looks like we need the following CSV files:
DRUG_STRENGTH.csv
CONCEPT.csv
CONCEPT_RELATIONSHIP.csv
CONCEPT_ANCESTOR.csv
CONCEPT_SYNONYM.csv
VOCABULARY.csv
RELATIONSHIP.csv
CONCEPT_CLASS.csv
DOMAIN.csv
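In case it helps, here is a small sketch that checks whether a download folder actually contains all nine files listed above before running the import script. The function name `missing_vocab_files` and the folder path in the usage comment are made up for illustration; the file names are the ones from the list.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_VOCAB_FILES = [
    "DRUG_STRENGTH.csv",
    "CONCEPT.csv",
    "CONCEPT_RELATIONSHIP.csv",
    "CONCEPT_ANCESTOR.csv",
    "CONCEPT_SYNONYM.csv",
    "VOCABULARY.csv",
    "RELATIONSHIP.csv",
    "CONCEPT_CLASS.csv",
    "DOMAIN.csv",
]


def missing_vocab_files(folder):
    """Return the required file names that are not present in `folder`.

    The comparison ignores case, since an unzipped download might use
    different capitalization (an assumption, not something verified here).
    """
    present = {p.name.upper() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in REQUIRED_VOCAB_FILES if name.upper() not in present]


# Example usage (hypothetical path):
#   missing = missing_vocab_files("/data/athena_download")
#   if missing:
#       print("Still missing:", ", ".join(missing))
```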
I looked at Athena and was overwhelmed by the large amount of data on it. Could you please tell me which vocabulary data I should download? I only need some sample data to get ATLAS up and running.
Thank you very much for your help! I really appreciate it.
I think you're right @ericaVoss, I don't know why I didn't think of that: if you build a CDM, you should have the vocabulary attached to it. If you load some other version of the vocabulary into the CDM, you could get some strange results. So the SynPUF should probably have those vocabs pre-loaded.
Yup! The only difference in what is in the vocab is that we only share the free vocabularies, that is, the ones you don't need a license for. But that is plenty for SynPUF.
Hi @lee_evans,
I recently decided to switch over to CDM version 5.3.1. Unfortunately, I have not managed to find a SynPUF sample for this version. Is there a data set available, or should I ask @a_cse for the mentioned script?
Thanks in advance
Mirko
Thanks @Mirko! It's really appreciated that you provided these files. I can confirm that they load successfully into a Postgres database and that the indexes and constraints for OMOP v5.3.1 run without error. I will comment back here if I find anything wrong. Equally, if anyone wants the SQL code (a slightly modified version of CommonDataModel/PostgreSQL at v5.3.1 · OHDSI/CommonDataModel · GitHub) to import @Mirko's SynPUF files and the standard vocabulary CSVs into Postgres, feel free to @ me here.
Hello, @tystan, following up on your post from April 8th... I would be very interested in your SQL Server scripts to load the SynPUF files and vocabularies into the CDM 5.3.1. Could you please share those?
Also, many thanks to @Mirko and @tystan for pursuing the SynPUF 1K files for CDM 5.3.1. Looking forward to using this.
Hi @yradsmikham, I have messaged you my email so I can send the files to you (can't upload the zipped files even after trying to change the suffix to .jpg). Thanks, Ty
I assume it's no longer relevant to you since this is a post from three years ago, but in case someone stumbles on this thread after encountering this error, like I did:
what solved it for me was downloading ALL relevant vocabularies from https://athena.ohdsi.org/.
Originally, I followed the instructions here: https://github.com/OHDSI/ETL-CMS/tree/master/python_etl
which state:
"Download vocabulary files from http://www.ohdsi.org/web/athena/, ensuring that you select at minimum, the following vocabularies: SNOMED, ICD9CM, ICD9Proc, CPT4, HCPCS, LOINC, RxNorm, and NDC."
My mistake was downloading only the vocabularies that they list as a MINIMUM.
This caused these errors, since not all required concepts were downloaded.
I suggest instead to download ALL vocabularies that are selected by default.
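If you want to double-check what a download actually contains, here is a rough sketch that tallies the vocabulary_id column of CONCEPT.csv and reports which expected vocabularies are absent. I am assuming the Athena files are tab-delimited despite the .csv extension (adjust `delimiter` if yours differ); the function names and the path in the usage comment are hypothetical.

```python
import csv
from collections import Counter


def count_vocabularies(concept_csv_path, delimiter="\t"):
    """Count the rows per vocabulary_id in a CONCEPT.csv file.

    Assumes a header row that includes a `vocabulary_id` column and, by
    default, tab-separated fields with no quoting.
    """
    counts = Counter()
    with open(concept_csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        for row in reader:
            counts[row["vocabulary_id"]] += 1
    return counts


def missing_vocabularies(concept_csv_path, expected, delimiter="\t"):
    """Return the expected vocabulary IDs that have no rows in the file."""
    found = count_vocabularies(concept_csv_path, delimiter=delimiter)
    return sorted(v for v in expected if v not in found)


# Example usage (hypothetical path; the expected list is whatever your data needs):
#   print(count_vocabularies("/data/athena_download/CONCEPT.csv").most_common())
#   print(missing_vocabularies("/data/athena_download/CONCEPT.csv",
#                              ["SNOMED", "Gender", "RxNorm"]))
```

This only tells you which vocabularies are present, not whether every concept your data references is there, so it complements rather than replaces running the constraints.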