
REDCap data to OMOP-CDM

(Sameep ) #1

Hi all, I am very new to the OMOP-CDM model and the OHDSI community. I work as a machine learning analyst for the University of Chicago. We are currently using the OMOP-CDM for our clinical trials data, and we are interested in storing our REDCap data in OMOP-CDM as well. I was unable to find any good ideas for this. Does anyone have ideas, or has anyone already implemented something of this sort? I would really appreciate some help.


(Kristin Kostka, MPH) #2

@sshah988, great topic! I was just talking to our friends at Montefiore today. They have developed an OMOP to REDCap pipeline.

Send me an email at Kostka(at)ohdsi.org. I’ll get you connected.

(Manlik Kwong) #3

I would think the same approach applies to REDCap as to mapping EHR data to OMOP. Basically, you export your REDCap data to one or more structured electronic documents and then write an ETL to parse, map, and load the data into your OMOP CDM.

  • MK

(Selva) #4

In our case, we mapped our REDCap data (survey data) to the OBSERVATION table. May I check with you on whether that’s how it is usually done? I feel our data, which contains survey questions and surveys, is a better fit for the OBSERVATION table.

(Manlik Kwong) #5


Not everything goes into the observation table. It greatly depends on the nature of the data in REDCap. For example, if you have data on whether the patient has congestive heart disease, that goes into the condition table. Similarly, if your REDCap data is a heart rate or blood pressure value, it goes into the measurement table. When defining the data mapping, I typically look at concept.domain_id to tell me which table the data should be mapped to and stored in.
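
A minimal sketch of that domain-based routing, assuming each mapped REDCap field already carries the domain_id of its target concept. The routing dictionary, the function name, and the fallback to the observation table are illustrative assumptions, not an official OHDSI convention.

```python
# Hypothetical routing table: concept.domain_id -> CDM table name.
DOMAIN_TO_TABLE = {
    "Condition": "condition_occurrence",
    "Measurement": "measurement",
    "Drug": "drug_exposure",
    "Procedure": "procedure_occurrence",
    "Device": "device_exposure",
    "Observation": "observation",
}


def target_table(domain_id: str) -> str:
    """Pick the CDM table for a mapped concept from its domain_id.

    Falling back to 'observation' for unknown domains is an assumption
    made for this sketch.
    """
    return DOMAIN_TO_TABLE.get(domain_id, "observation")


# Made-up REDCap fields paired with the domain of their mapped concept.
mapped_fields = [
    ("congestive_heart_disease", "Condition"),
    ("heart_rate", "Measurement"),
    ("smoking_status", "Observation"),
]

for field_name, domain_id in mapped_fields:
    print(field_name, "->", target_table(domain_id))
```

The point of keeping the routing in one dictionary is that the table choice follows from the vocabulary mapping rather than from the REDCap instrument a field happens to live in.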

  • MK

(Frank Fox) #7

Hi All,
I am also working on mapping survey data to the OMOP-CDM - on the EHDEN.eu project. I am mapping an ICHOM standard set (i.e. Survey results) to the OMOP-CDM and hope you can help me with the following questions.
• In the conventions for the OBSERVATION table it says that “Valid Concepts of the VALUE_AS_CONCEPT field are not enforced, but typically belong to the ‘Meas Value’ domain”. When using the OBSERVATION table for survey responses, these will then typically be from the ‘Observation’ domain. Do I understand this correctly?

• When using USAGI to generate mappings of question-responses to existing standard concepts, the suggested mappings vary between domains. For example, the two questions below are establishing the presence of prior heart or lung disease and these are USAGI’s best-guess.
Source Description   | Target Concept_ID | Target Description              | Vocabulary | Domain
Other heart disease  | 45879053          | Other heart disease             | LOINC      | Meas Value
Chronic lung disease | 4188164           | History of chronic lung disease | SNOMED     | Observation
My question: What criteria should be used for selecting the best match for a survey question-answer? Would you focus on description? Are Vocabulary & Domain important?

• If all the question-answers are stored in the OBSERVATION table – including those that might also be expected to be in the CONDITION_OCCURRENCE table (such as the two above), will this make the information difficult to find for others?

• A related question: How would one distinguish the source of the answer to the above two questions? e.g. a patient response, a link to an EHR record, a medical professional assessment on-the-spot.

I would be happy if you can point me to information that might help me here.
Thank you.

(Christian Reich) #8

There is a tricky point here, @FrankFox. Surveys are good for collecting medical facts. But then the question is how those facts are represented. They can be represented in the SURVEY (not OBSERVATION) table, and you can have all the question-answer pairs you need. In fact, we can just upload your survey and make them standard concepts. The problem with that is that only people familiar with the survey will look for it, or even know it’s there. Remember, we are doing remote network research; nobody can see behind the firewall. So, what needs to happen is that the survey question-answer pairs need to be mapped to the actual facts and put into the CONDITION_OCCURRENCE, DRUG_EXPOSURE, PROCEDURE_OCCURRENCE and OBSERVATION tables, respectively. So, in USAGI you need to do the following: take a question-answer pair (not separately, USAGI cannot do that) and define the Domain. In that Domain, find the right concept.

For example, let’s say your Survey has the question “What chronic diseases are you suffering from?” and the answers are “Type 2 Diabetes”, “High blood pressure” and “Depression”. You then upload these pairs (e.g. “What chronic diseases are you suffering from - Type 2 Diabetes”) and find the Standard Concept (Type 2 diabetes mellitus).
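
As a rough illustration of preparing such pairs for upload, the sketch below joins each answer to its question and writes a small CSV of source terms. The column names (source_code, source_name) and the QA1, QA2, ... codes are assumptions chosen for this example; they are not prescribed by USAGI.

```python
import csv
import io


def qa_pairs(question: str, answers: list[str]) -> list[str]:
    """Join a survey question with each of its answers into one source term."""
    return [f"{question} - {answer}" for answer in answers]


def to_source_csv(pairs: list[str]) -> str:
    """Write the pairs as CSV text with a made-up code per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source_code", "source_name"])  # hypothetical column names
    for index, pair in enumerate(pairs, start=1):
        writer.writerow([f"QA{index}", pair])
    return buffer.getvalue()


pairs = qa_pairs(
    "What chronic diseases are you suffering from?",
    ["Type 2 Diabetes", "High blood pressure", "Depression"],
)
print(to_source_csv(pairs))
```

Each row can then be mapped as a single unit, so the chosen standard concept reflects the fact stated by the question and answer together.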

The source of the answer is recorded in the Type Concept. So, e.g., the PROCEDURE_OCCURRENCE record will have in procedure_type_concept_id the value 581412 - Procedure Recorded from a Survey. We are in the process of consolidating these, so this particular Concept is going away soon. But the same principle will apply.
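
To make the Type Concept idea concrete, here is a small sketch of a PROCEDURE_OCCURRENCE row whose type concept marks it as survey-derived. Only the value 581412 comes from the post above; the helper name, the person and record identifiers, and the procedure_concept_id of 0 (an unmapped placeholder) are assumptions for illustration.

```python
from datetime import date

# Type concept quoted in the thread: "Procedure Recorded from a Survey".
SURVEY_PROCEDURE_TYPE_CONCEPT_ID = 581412


def survey_procedure_row(row_id: int, person_id: int,
                         procedure_concept_id: int, procedure_date: date) -> dict:
    """Build one PROCEDURE_OCCURRENCE row with survey provenance (sketch only)."""
    return {
        "procedure_occurrence_id": row_id,
        "person_id": person_id,
        "procedure_concept_id": procedure_concept_id,
        "procedure_date": procedure_date.isoformat(),
        "procedure_type_concept_id": SURVEY_PROCEDURE_TYPE_CONCEPT_ID,
    }


# Placeholder values; 0 stands in for a concept that still has to be mapped.
row = survey_procedure_row(1, 1001, 0, date(2019, 5, 1))
print(row)
```

A consumer of the data can then filter on procedure_type_concept_id to separate patient-reported facts from EHR or claims records.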


(Sameep ) #9

@krfeeney I have sent you an email from my UChicago email. Thank you for the help.

(Sameep ) #10

@mkwong Does REDCap have an option to directly export data into a structured electronic document format?

(Gregory Klebanov) #11

@Christian_Reich on EHDEN, and just about everywhere else, we still use OMOP CDM v5.3.1. The SURVEY table was introduced in CDM v6.

btw, found this post from way back, could be relevant for this discussion

(Andrew Williams) #12

Should there be a path for this in cases where it is justified because the measures are reliable, standardized, and widely used?

A rule of thumb that roughly defines the criteria used to support that justification might be an origin in a curated source that promotes thoroughness of input and development, together with breadth of use. That would prevent the vocabulary from becoming littered with little-used or poorly conceived concepts, but allow people who want to use widely used survey/scale measures in a standard way. ICHOM, CDEs on the NIH Portal, PROMIS, PhenX, widely used psychology/psychiatry/neurology scales, etc. might all be good examples of things that meet that rule of thumb.

Dima is working on this in the Psychiatry workgroup for psychiatry/psychology/neurology/neuropsychology scales and LOINC or SNOMED. It would probably be good to have a standard approach that goes beyond those psych use cases.

(Christian Reich) #13

Understood. We are working on a public tool for managing local vocabularies (like a survey), including mapping. Till then: request things in the Forum. Doesn’t cost you anything.

(Manlik Kwong) #14

I am told REDCap does have export capability. The last time I did anything like this was in 2017 and dumped a REDCap project out to CSV (I think). Presently, we are about to do a demonstration project involving exporting REDCap project data to a structured electronic format for me to then map to OMOP concepts and link survey findings with EHR records in veterinary medicine. I built an OMOP (v5.2) database adapted for veterinary EHR records. Regardless, this should work the same for human data via REDCap.

(Megan Branda) #15

Hello, REDCap’s capabilities are set at the institution level. If your institution has the API enabled, you get a token (or key code) specific to the REDCap project. There are a number of R packages (just google “r package redcap” and you will get a couple) that use it to export the data into a standard format for you to analyze and merge with any other data. If your institution does not have the REDCap API enabled, then you are stuck doing it the old-fashioned way: logging in, selecting export, and downloading the data. They do have standard code to go with the export to apply labels and formats.
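
For those not working in R, the same token-based export can be sketched with the Python standard library. This assumes the institution has the API enabled; the function names are made up for this example, and the API URL and token must come from your own REDCap instance. The request fields shown (token, content, format, type) are the ones the REDCap record export expects.

```python
from urllib import parse, request


def export_payload(token: str) -> dict:
    """Form fields for a flat CSV export of all records in one project."""
    return {
        "token": token,       # project-specific API token
        "content": "record",  # export records (not metadata)
        "format": "csv",
        "type": "flat",       # one row per record
    }


def export_records(api_url: str, token: str) -> str:
    """POST the export request and return the CSV text (makes a network call)."""
    body = parse.urlencode(export_payload(token)).encode("utf-8")
    with request.urlopen(request.Request(api_url, data=body)) as response:
        return response.read().decode("utf-8")


# Usage, with placeholders you would replace with real values:
# csv_text = export_records("https://<your-redcap-host>/api/", "<your-token>")
```

Keeping the token out of source control (for example, reading it from an environment variable) is advisable, since it grants access to the whole project.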

(Sameep ) #16


(Michael Gurley) #17

I think when a REDCap project operates as a longitudinal registry, collecting manually chart-abstracted data that possibly also incorporates pulls of data from EHR/claims data feeds, it should NOT be treated as a “survey” but rather thought of as a source system that follows the normal conventions of vocabulary mapping and ETL, using the type concept fields (*_type_concept_id) to designate the provenance of the data as non-EHR or non-claims. That means needing to map every REDCap instrument, question and possible choice to standardized OMOP vocabulary values, and letting OMOP standard ETL practices dictate the table destination. You will also need to make ad hoc decisions about populating dates, meaning deciding how the dates collected in the REDCap project relate to the dates populating the OMOP clinical event tables. REDCap has a data dictionary format that is exportable to CSV.
I am working on a project that maintains mappings for a REDCap data dictionary outside of the REDCap data dictionary itself. You can partially manage the mappings to standardized vocabularies within the REDCap setup itself, by using the field_annotation capabilities of REDCap to designate a standardized vocabulary and using the choice values per REDCap data point as the standardized vocabulary’s native codes. I have a Ruby script that allows for this to be curated and managed across updates of a REDCap data dictionary. It is almost ready to be shared, if there is any interest.
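
The following is not the Ruby script mentioned above, only a hedged Python sketch of the general idea: read an exported data dictionary CSV, take the vocabulary name from the Field Annotation column, and treat each choice value as a native code in that vocabulary. Storing a bare vocabulary name in the annotation is an assumed convention for this example, and CODE_A and CODE_B are placeholders, not real codes.

```python
import csv
import io

FIELD_COL = "Variable / Field Name"
CHOICES_COL = "Choices, Calculations, OR Slider Labels"
ANNOTATION_COL = "Field Annotation"


def parse_choices(raw: str) -> list[tuple[str, str]]:
    """Split REDCap's 'code, label | code, label' string into (code, label) pairs."""
    pairs = []
    for item in raw.split("|"):
        if not item.strip():
            continue
        code, _, label = item.partition(",")
        pairs.append((code.strip(), label.strip()))
    return pairs


def dictionary_mappings(csv_text: str) -> list[dict]:
    """One mapping row per choice, tagged with the vocabulary from the annotation."""
    mappings = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        vocabulary = (row.get(ANNOTATION_COL) or "").strip()
        for code, label in parse_choices(row.get(CHOICES_COL) or ""):
            mappings.append({"field": row[FIELD_COL], "vocabulary": vocabulary,
                             "code": code, "label": label})
    return mappings


# Tiny made-up dictionary with only the columns this sketch needs.
sample = (
    '"Variable / Field Name","Field Label",'
    '"Choices, Calculations, OR Slider Labels","Field Annotation"\n'
    'chronic_dx,"Chronic disease","CODE_A, Type 2 Diabetes | CODE_B, Depression","SNOMED"\n'
)
mappings = dictionary_mappings(sample)
for mapping in mappings:
    print(mapping)
```

The resulting rows can be joined to the OMOP CONCEPT table on vocabulary and code to look up concept IDs and domains, which then decide the destination table.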

(Andrew Williams) #18

Mike, we are interested. Thanks very much for your willingness to share.

(Frank Fox) #19

Thank you. That is all useful.
If I’m understanding you correctly @Christian_Reich, intimate familiarity with the source data is necessary to decide if a data-element is an OBSERVATION, MEASUREMENT, CONDITION, or PROCEDURE. The SURVEY_CONDUCT table is only a ‘master’ record for the survey instance.
I am only at the proof-of-concept stage without RWD, so I should get by with using only the SURVEY_CONDUCT and OBSERVATION tables for now.

Yes, @Andrew it would make life easier if the most common surveys were already available. Some elements exist (e.g. HAQII) but only sporadically.

(Vojtech Huser) #20

So REDCap is used for a research study. Btw, a Clinical Study WG within OHDSI is dealing with all aspects pertaining to a study.
See this post: Registry data to OMOP CDM Work Group

(Christian Reich) #21

Correct. As @mgurley said.

The question is what the use case is. If you want to do a nominal job and say “I did it, I put REDCap into OMOP” you are fine. :) But you probably want to use it for analytics. What kind of analytics? Only specialty analytics that make sense with respect to the very data asset you are converting? Then an OMOP conversion is probably not necessary. Or do you want to allow the data to play in the network with the OHDSI tool stack and methods? Then you had better do the full job and create real Conditions, Drugs, Procedures, Devices or Visits.