
Is there a tool to extract raw data and load to REDCap or csv?

One of the services that Health Data Compass at the University of Colorado Anschutz Medical Campus provides is the delivery of datasets to investigators and researchers. Once an investigator finalizes a cohort and variables of interest, we take the information and write SQL directly against Epic Caboodle or the OMOP CDM. We have delivered ~740 datasets to date, so we are looking for a tool to help us scale this process. Instead of writing SQL for every data request, is there a tool that could pull the records and variables of interest for us? Or a modification of a tool already being used? Bonus points if it is compatible with the CDM and REDCap, but we are open to all ideas and suggestions!



Hi Melanie, what kind of tool are you looking for? I have used the REDCap API to upload and download data, along with formatted Excel sheets. I have a PharmaSUG paper on doing this with SAS and R. What did you have in mind?

Yes, I would be interested in hearing more. How involved a tool? Against Epic, OMOP, or both? What is REDCap’s role in this? Are you logging the extracts for participants and studies in REDCap?

At this point, we are open to all options.

Take the cohort and their variables of interest from the database and either insert them into REDCap or show them in a UI. We need to reveal the line-level / row-level data, i.e. person_id, birthdate, gender, etc., and then all the clinical events and their attributes of interest. The investigator will do all analysis.
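As a rough illustration of what "a tool instead of hand-written SQL" could look like, here is a minimal Python sketch that generates the extract query from a request specification (one table plus a list of columns). Everything in it is an assumption for illustration: the `request_cohort` table, the `ALLOWED_COLUMNS` whitelist, and the function names are hypothetical, and an in-memory SQLite database stands in for the real OMOP CDM.

```python
import sqlite3

# Hypothetical whitelist of OMOP tables and columns a data request may select.
ALLOWED_COLUMNS = {
    "person": {"person_id", "birth_datetime", "gender_concept_id"},
    "measurement": {"person_id", "measurement_concept_id",
                    "measurement_date", "value_as_number"},
}


def build_extract_sql(table, columns):
    """Return a SELECT for one OMOP table, restricted to cohort members."""
    if table not in ALLOWED_COLUMNS:
        raise ValueError(f"table not allowed: {table}")
    bad = [c for c in columns if c not in ALLOWED_COLUMNS[table]]
    if bad:
        raise ValueError(f"columns not allowed: {bad}")
    col_list = ", ".join(f"t.{c}" for c in columns)
    # request_cohort is a hypothetical table holding the finalized cohort.
    return (
        f"SELECT {col_list} FROM {table} t "
        "JOIN request_cohort c ON c.person_id = t.person_id "
        "ORDER BY t.person_id"
    )


def run_extract(conn, table, columns):
    """Run the generated query and return row-level data as dictionaries."""
    cur = conn.execute(build_extract_sql(table, columns))
    return [dict(zip(columns, row)) for row in cur.fetchall()]


# Tiny in-memory stand-in for the CDM, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER, birth_datetime TEXT, gender_concept_id INTEGER);
CREATE TABLE request_cohort (person_id INTEGER);
INSERT INTO person VALUES (1, '1980-02-03', 8532);
INSERT INTO person VALUES (2, '1975-07-09', 8507);
INSERT INTO person VALUES (3, '1990-01-01', 8532);
INSERT INTO request_cohort VALUES (1);
INSERT INTO request_cohort VALUES (3);
""")
rows = run_extract(conn, "person", ["person_id", "birth_datetime"])
```

Table and column names are checked against the whitelist before being placed into the SQL string, since identifiers cannot be bound as query parameters. The same specification could later drive the REDCap side of the delivery.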

Yes, either or both!

We deliver PHI data via REDCap. It would be great if the tool uploaded the data to REDCap for us.
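For the upload step, a minimal Python sketch of a record import through the REDCap API might look like the following. It only builds the POST request and does not send it. The form fields (`token`, `content`, `format`, `type`, `overwriteBehavior`, `data`, `returnContent`) are the ones I recall from REDCap's "Import Records" API method, so please check them against the API documentation on your own instance. The URL, token, and function name are placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_import_request(api_url, token, records):
    """Build (but do not send) a POST request for a REDCap record import.

    records is a list of flat dictionaries, one per REDCap row.
    """
    fields = {
        "token": token,                  # project-level API token
        "content": "record",
        "format": "json",
        "type": "flat",
        "overwriteBehavior": "normal",   # blank values do not overwrite data
        "data": json.dumps(records),
        "returnContent": "count",
    }
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(api_url, data=body, method="POST")


# Placeholder URL and token; real values would come from secure configuration.
request = build_import_request(
    "https://redcap.example.org/api/",
    "PLACEHOLDER_TOKEN",
    [{"record_id": "1", "birth_datetime": "1980-02-03"}],
)
```

Sending it would be a call to `urllib.request.urlopen(request)`. Since the payload is PHI, the token should be read from a secured store rather than hard-coded, and the call should only go to an HTTPS endpoint.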

+ Ability to log on with credentials is a bonus!

Hi Melanie,
Can we have a call to discuss?

Hi @MPhilofsky, @leannica, and all.

If I understand correctly, you want to bring data from OMOP into REDCap. We have been working on the opposite conversion, which is much more complicated and requires additional metadata to be added to the REDCap metadata. Converting from OMOP to REDCap should be very straightforward. In particular, I would recommend mapping OMOP tables to REDCap forms, indicating which of the tables are repeated REDCap events (all but Person and Death), mapping each OMOP field to its respective REDCap variable, reproducing data types for date, number, and string variables, and converting concept_id variables to string variables in REDCap. If you want concept_id fields to become REDCap choice variables with a list of permissible values, you need to map the concept IDs used in each respective field to REDCap permissible value codes and literals. I know that what I am describing is not a script :slight_smile: However, from this to a script there is one degree of separation. If you can wait, we should be building this ingestion script shortly.
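To make the mapping described above a little more concrete, here is a minimal Python sketch, not the ingestion script itself. It assumes REDCap repeating instruments (the `redcap_repeat_instrument` and `redcap_repeat_instance` fields) for every table other than Person and Death, a record ID field named `record_id`, and REDCap variable names identical to the OMOP column names. The `TABLE_TO_FORM` and `CONCEPT_NAMES` lookups are hypothetical; in practice the concept names would come from the OMOP CONCEPT table.

```python
from datetime import date

# Hypothetical table-to-form mapping; only Person and Death are non-repeating.
TABLE_TO_FORM = {
    "person": {"form": "person", "repeating": False},
    "death": {"form": "death", "repeating": False},
    "measurement": {"form": "measurement", "repeating": True},
}

# Hypothetical concept_id -> name lookup (normally read from CONCEPT).
CONCEPT_NAMES = {8507: "MALE", 8532: "FEMALE"}


def to_redcap_value(column, value):
    """Convert one OMOP value to the string form stored in REDCap."""
    if value is None:
        return ""
    if column.endswith("_concept_id"):
        # concept_id fields become strings; unknown IDs fall back to the ID.
        return CONCEPT_NAMES.get(value, str(value))
    if isinstance(value, date):
        return value.isoformat()
    return str(value)


def omop_rows_to_redcap(table, rows):
    """Turn OMOP rows (dictionaries) into flat REDCap records."""
    spec = TABLE_TO_FORM[table]
    instance_counter = {}
    records = []
    for row in rows:
        person_id = row["person_id"]
        record = {"record_id": str(person_id)}
        if spec["repeating"]:
            # Number repeated rows per person, starting at 1.
            instance_counter[person_id] = instance_counter.get(person_id, 0) + 1
            record["redcap_repeat_instrument"] = spec["form"]
            record["redcap_repeat_instance"] = instance_counter[person_id]
        for column, value in row.items():
            if column != "person_id":
                record[column] = to_redcap_value(column, value)
        records.append(record)
    return records
```

Turning concept_id fields into REDCap choice variables would additionally need a per-field list of permissible value codes and labels in the REDCap data dictionary, which this sketch leaves out.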


Last question: how do you want to secure what I think would be the data request results in RC? By project or by data group?

Are requests going to originate in REDCap? (I am assuming these are IRB data requests.)

This entire thing could probably originate in RC and upon your approval fetch and populate.

Since someone just volunteered to do the work shortly, I don’t want to offer up too much, but this is probably another fun way to use REDCap.


Out of curiosity, would these only be one-time loads into REDCap, like a snapshot? Or would they possibly be updated over time? If updated over time, would you do a truncate and load into REDCap, or an incremental load? If incremental, what stable identifier in OMOP could you use to anchor to?

@dblatt and @mgurley,

I’m not the REDCap expert at UCD. @ufuoma may have insight to the questions above. Our REDCap experts are not on this forum.

Yes, correct

We are VERY interested. Thank you! :slight_smile:

This sounds like a solution we are looking for and so, I am very interested in hearing about your work on this! Thank you so much for sharing your insight!

@mgurley some of our requests are one-time loads; many others have to be updated regularly, for example once a month or once a quarter. For some reports, we load only the delta, i.e., only the new data since the last upload was done. For some others, we refresh the entire database. The chosen approach depends on the researcher’s needs.
For an incremental or delta load, we use person_id as the stable identifier.
Please let me know if you have additional questions and thank you so much for sharing your insights!


How would person_id alone be sufficient to support incremental loads? For example, say I load 3 measurements (a, b, c) for person_id=1 on January 1st and then 6 measurements (a, b, c, d, e, f) for person_id=1 on July 1st. And let’s say that something changed about c after January 1st. My incremental load would be (c, d, e, f). But what stable OMOP identifier could I leave in REDCap to know to update ‘c’ and not ‘a’ or ‘b’? Clinical events in OMOP have no stable provenance. So I don’t see how replicating data into REDCap could be anything other than a truncate/reload.
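The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `plan_incremental` is a hypothetical helper, and `source_key` stands for a stable per-event identifier carried over from the source system, which is exactly the thing that may not exist in OMOP.

```python
def plan_incremental(old_rows, new_rows, key_fields):
    """Split new_rows into inserts and updates relative to old_rows.

    Raises ValueError if key_fields does not uniquely identify old rows,
    because then there is nothing to anchor an update to.
    """
    def key(row):
        return tuple(row[f] for f in key_fields)

    old_by_key = {}
    for row in old_rows:
        k = key(row)
        if k in old_by_key:
            raise ValueError(f"{key_fields} does not uniquely identify rows")
        old_by_key[k] = row

    inserts, updates = [], []
    for row in new_rows:
        k = key(row)
        if k not in old_by_key:
            inserts.append(row)
        elif old_by_key[k] != row:
            updates.append(row)
    return inserts, updates


def m(source_key, value):
    """Build a made-up measurement row for person_id=1."""
    return {"person_id": 1, "source_key": source_key, "value": value}


january = [m("a", 1.0), m("b", 2.0), m("c", 3.0)]
july = [m("a", 1.0), m("b", 2.0), m("c", 3.5),
        m("d", 4.0), m("e", 5.0), m("f", 6.0)]

# With a stable per-event key, the delta is exactly (c, d, e, f).
inserts, updates = plan_incremental(january, july, ("person_id", "source_key"))

# With person_id alone, the three January rows are indistinguishable.
try:
    plan_incremental(january, july, ("person_id",))
    person_id_only_works = True
except ValueError:
    person_id_only_works = False
```

Without something like `source_key` stored alongside each REDCap instance, the only safe option in this sketch is to drop the person's repeating instances and reload them.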

I assume this is the paper


Hi @mgurley, I agree with you on a truncate for the reload due to the unstable provenance. I am no REDCap expert and was referring to our regular data loads from our Epic Caboodle data warehouse into REDCap. We have never tried these kinds of loads with OMOP data, but I do know we have ways of getting around doing incremental loads, especially when parts of the data (instrument_ids) have changed since the last load. I have punted these thoughts to our REDCap expert, and hopefully she will be able to come on here and continue this discussion. Thanks a lot!

@Vojtech_Huser - yes that’s the paper.
Can I schedule a phone call to discuss this? I think it’s a very interesting issue, and many centers could benefit from creative solutions. Does late Friday morning work for differing schedules?
Best wishes,

Hi @rimma, @ufuoma, @MPhilofsky, and @mgurley,

Checking in on status of REDCap ingestion or extract? We’re in the midst of finalizing the export procedures from OMOP 5.3.1 ourselves, and about to complete scripting. It occurred to me way too late to check the forums! :man_facepalming: :man_facepalming: (duplication of facepalm necessary!). We came to the same independent conclusions and steps as Rimma though, so I guess that’s something! If there hasn’t been progress, I’d be happy to share our pipeline diagram, and final scripts once developed.