
Converting User-Defined Data Dictionaries into the OMOP CDM

Hi all,

I’m very new to OMOP so I apologize if this is a very basic problem that I haven’t figured out how to solve.

I’m working on a project that aims to harmonize disparate data dictionaries into one standardized model (the OMOP CDM). I’ve been looking into tools for mapping data into the CDM (White Rabbit, Rabbit in a Hat, Usagi), but I can’t figure out how to convert user-defined data dictionary variables into the standard OMOP CDM format. These data dictionaries from researchers are usually flat files (Excel), and each table in a spreadsheet can be converted into a CSV file.

Is there a way to map these (even with some restructuring of the source data file) into the CDM via a script or platform of some sort instead of having to manually create tables by hand?

If there’s a way to use White Rabbit and Rabbit in a Hat to do the mapping, I’m just not sure how the CSV files should be formatted so that they are scanned properly and a Scan Report is generated successfully.

I’d appreciate any feedback or additional resources I should look into to become more familiar with this. Thanks so much!

@lecksicon:

(Funny spelling of lexicon, that is).

It’s not quite clear what you mean by “harmonize disparate data dictionaries into one standardized model (OMOP CDM)”. There are two aspects to this: Converting the actual patient data into the OMOP CDM, and mapping the coding scheme (or verbatim data fields if the data are not coded) to the Standardized Vocabulary. This is explained roughly here.

The “rabbits” are for doing the former. The latter is not that straightforward, but there is a very useful tool called Usagi that can help you if all you have are description strings.

Let us know which problem you are trying to solve.

Thanks for replying @Christian_Reich! I’m glad you were able to clarify these tools for me. What I’m working on is the coding scheme mapping part of the problem. We have study variables from various projects that we would like to convert into the CDM. I did try Usagi, but what I’m having difficulty with is how the data dictionary files should be organized or formatted before loading them into Usagi. The guide explained that the file should include these variables: SOURCE_CODE and SOURCE_CODE_DESCRIPTION, as well as FREQUENCY (which I’m not quite sure how to calculate). Am I missing something else?

Thank you!
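
For anyone preparing a similar file, below is a minimal Python sketch of one way to build the three columns mentioned above from a data dictionary plus the data that uses it. It assumes FREQUENCY simply means the number of times each code occurs in the source data, which is worth checking against the Usagi guide. The input column names (variable, code, label), the sample values, and the function names are all hypothetical, and combining the variable name with the code to form SOURCE_CODE is just one possible convention.

```python
import csv
import io
from collections import Counter

# Hypothetical data dictionary: one row per coded value of a study variable.
DICTIONARY_CSV = """variable,code,label
smoking,1,Current smoker
smoking,2,Former smoker
smoking,3,Never smoked
"""

# Hypothetical patient-level data that uses those codes.
DATA_CSV = """patient_id,smoking
p1,1
p2,3
p3,3
p4,2
p5,3
"""

OUTPUT_COLUMNS = ["SOURCE_CODE", "SOURCE_CODE_DESCRIPTION", "FREQUENCY"]


def build_usagi_rows(dictionary_csv, data_csv):
    """Return one row per dictionary entry with an occurrence count.

    Assumption: FREQUENCY is how often the code appears in the source data.
    """
    # Count every (column name, value) pair that occurs in the data.
    counts = Counter()
    for record in csv.DictReader(io.StringIO(data_csv)):
        for column, value in record.items():
            counts[(column, value)] += 1

    rows = []
    for entry in csv.DictReader(io.StringIO(dictionary_csv)):
        key = (entry["variable"], entry["code"])
        rows.append({
            # Prefix with the variable name so codes stay unique across variables.
            "SOURCE_CODE": f'{entry["variable"]}:{entry["code"]}',
            "SOURCE_CODE_DESCRIPTION": entry["label"],
            "FREQUENCY": counts[key],  # 0 if the code never occurs
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=OUTPUT_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(build_usagi_rows(DICTIONARY_CSV, DATA_CSV)))
```

If only the dictionary is available and not the underlying data, the same file could presumably be written with a constant FREQUENCY, since the counts are there to help prioritize which codes to map first.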

I know that this question was asked long ago, but I’m new to the OMOP CDM and just found this forum, from which I have already learned a lot.
So you said:

Converting the actual patient data into the OMOP CDM

Well, we have a database of patient and clinical data. What we thought before was that the OMOP CDM is designed only for clinical data, and not patient data. Let me briefly explain our use case.

We are in charge of a patient cohort concerning intestine and bowel diseases. We send annual questionnaires to patients and their GPs, so by patient data I mean the questionnaires filled in by the patients, and by clinical data I mean the fields filled in by the GPs.

So are we getting it right? Is the OMOP CDM designed to model only the clinical data?

Thanks,

Mahtab
