
EHR data to OMOP CDM Work Group


(Selva Muthu Kumaran Sathappan) #54


How do I participate in the work group meetings? I already asked for my name to be added to the group but couldn’t join earlier. Apologies, and could you let me know how I can dial in?

(Selva Muthu Kumaran Sathappan) #55


Can you also let me know whether the WG is open to people from all backgrounds, even beginners like me? I always post my queries in the forum, but I also feel the WG can help me learn and improve. Kindly let me know.


(Maura Beaton) #56


OHDSI is an open community and all working groups are open to anyone who feels they are able to contribute. A full list detailing the aims of each OHDSI working group is available here:

And up-to-date dial-in information is available here:

(Melanie Philofsky) #57

Welcome! I just sent you the calendar invite. Our next meeting is Friday, June 14th at 10am EST.

(Melanie Philofsky) #58

Hello all!

Our next EHR WG meeting is tomorrow, Friday, June 14th at 10am EST. We do not have a speaker or topic on the agenda. So, this will be an open discussion. The connection details:

Zoom meeting details:

Join from PC, Mac, Linux, iOS or Android: https://ucdenver.zoom.us/j/4984831362

Or iPhone one-tap :
US: +16468769923,4984831362# or +16699006833,4984831362#
Or Telephone:
Dial (for higher quality, dial a number based on your current location):
US: +1 646 876 9923 or +1 669 900 6833
Meeting ID: 498 483 1362
International numbers available: https://zoom.us/u/9hXjQ

(Melanie Philofsky) #59

Meeting details have changed. Please join this Zoom instead.

(Selva Muthu Kumaran Sathappan) #60

Thank you. Joining in

(Bill Hersh) #61


I am interested in learning more about this group. I know some of you who are in the group already and look forward to meeting others.

My main interest is in moving a large extract of EHR that started from our Epic system and has been transformed into the format of our research data warehouse into a more standardized format. The purpose of putting the data into OMOP CDM is to allow more standardized methods to be developed for information retrieval tasks, such as cohort retrieval.

Is that an interest of this group? And if so, what would be the best way to get involved? I also have some interns this summer who can get involved.

Bill Hersh
Oregon Health & Science University

(George Hripcsak) #62

(Welcome, @hersh.)

(Melanie Philofsky) #63

Welcome, @hersh!

You are in the correct place! OHDSI = standardization. The EHR WG discusses the ETL process, the ambiguous and impossible data we uncover during the ETL process, and ideas on how to handle the unique situations we encounter. Quite a few of us have Epic data, so we know the trials and tribulations you may/most likely will encounter. OHDSI also has many other workgroups for all things OHDSI and we will point you in the correct direction. Or you can always post to the forum; people are very friendly here :slight_smile:

Come to our next meeting. It’s Friday, July 12th @ 7am PST. Early for those on the west coast, but this time received the most votes from interested participants. Send me the email addresses of those who are interested and I will add them to the meeting invite.

Free labor! Everyone who will work with the ETL process or the OMOP CDM should start by watching the tutorial videos on the OMOP CDM & Vocabulary and CDM ETL. Understanding the CDM & Vocabularies is essential. This is not the typical ETL where one source field goes directly to the target field. The ETL is VERY complex and time consuming for EHR data. And the converted data is amazing :slight_smile: - standardized semantic representation, standardized format, standardized software tools, ability to leverage the vocabularies for queries, gigantic community of data partners who will run the study on their data, etc.

Please come to the Symposium September 15, 2019 where you will meet and collaborate with others. It’s an amazing and very diverse group of people all working towards making lives better by utilizing data, technology and research. Here’s the link for the tutorials held on Sept. 14th & 16th. In addition to the tutorials on the CDM & Vocabularies and the ETL process, there are also tutorials for Cohort Definition/Phenotyping, Data Quality, Patient Level Prediction, and Population Level Estimation. All the courses are taught by very knowledgeable faculty who are experts in their respective fields.

(Ella) #64

Is there a sub-group of this group for sharing Cerner to OMOP mappings? Regardless, please at least add me to this group’s email distribution list - ella.young@phsa.ca - thx!

(Kristin Kostka, MPH) #65

Great question @ella! @Daniella_Meeker had created a Cerner to OMOP group back in 2018 (https://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:cerner_to_omop) which pre-dates this EHR group. Not sure how active that group is anymore, but I know @mvanzandt’s team (@QI_omop @MichaelWichers) has also done Cerner conversions. Will follow up with you offline to connect more dots across teams.

(Bill Hersh) #66

Thank you for the detailed reply, Melanie. My interns and I are getting up to speed on OHDSI, and are trying to figure out how to manage our data.

One issue concerns terms from our EHR, such as lab test names or drug names. One challenge is that the data comes to us two steps removed from its source. That is, our institution has a research data warehouse, which itself is derived from Clarity, Epic’s data warehouse, whose data is derived from the original Epic records. As a result, we only receive the text string for each item, and not its coded identifier.

I know that OHDSI is very oriented to controlled terminology, and as an informatician I understand the value of such terminology. But I don’t see us easily able to get there from the data we have, and I instead would like for us to be able to capture the strings in the proper fields. Has anyone else wanted to do this and developed solutions for it?

Since our informatics research work is focused on text processing, we would also like to be able to preserve our complete notes.

Any advice on these issues would be greatly appreciated!

Bill Hersh

(George Hripcsak) #67

Hi, @hersh. The OHDSI CDM has a single NOTE table to store all notes, with fields for specifying meta-information about the notes.

Then there is a NOTE_NLP table where we put parsed notes, with the idea that in the future, if you trust the parse enough, you can put the output into the corresponding domain table (condition, measurement, etc.).

Lab data go into the MEASUREMENT table, and the source code (which is a string in your case) goes into the measurement_source_value column of that table. Hopefully you can map source names into codes. The OMOP code for the LOINC term that they map to goes into the measurement_concept_id column. That’s the column that OHDSI studies use to identify which thing is being measured. If you don’t map at all, then you would put 0 in that column, which means that the OHDSI tools won’t easily pull those data (e.g., potassium over 3.4) and your users will need to search on things like K and potassium etc. and then figure out which ones are blood or urine or CSF.
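
As a rough illustration of the point above, here is a minimal Python sketch of how a lab name string might land in MEASUREMENT. The column names (measurement_concept_id, measurement_source_value, value_as_number) are the CDM's; everything else is an assumption for illustration: the lab name strings are made up, the lookup table is hand-built, and the concept IDs are placeholders, not real OMOP concept IDs.

```python
from dataclasses import dataclass

# Placeholder IDs for illustration only; these are NOT real OMOP concept IDs.
SERUM_POTASSIUM = 1001
URINE_POTASSIUM = 1002

# Hypothetical hand-built map from local lab name strings to concept IDs.
SOURCE_TO_CONCEPT = {
    "POTASSIUM, SERUM": SERUM_POTASSIUM,
    "K (URINE)": URINE_POTASSIUM,
}


@dataclass
class Measurement:
    measurement_concept_id: int    # 0 means "not mapped"
    measurement_source_value: str  # the original string, always kept
    value_as_number: float


def to_measurement(source_name: str, value: float) -> Measurement:
    """Keep the source string; fall back to concept 0 when no mapping exists."""
    concept_id = SOURCE_TO_CONCEPT.get(source_name, 0)
    return Measurement(concept_id, source_name, value)


rows = [
    to_measurement("POTASSIUM, SERUM", 3.9),
    to_measurement("K+ WHOLE BLOOD", 3.6),  # no mapping yet, so concept 0
    to_measurement("K (URINE)", 42.0),
]

# A concept-based query finds only the mapped rows.
by_concept = [
    r for r in rows
    if r.measurement_concept_id == SERUM_POTASSIUM and r.value_as_number > 3.4
]

# Unmapped rows can only be found by searching the strings.
unmapped = [r for r in rows if r.measurement_concept_id == 0]
```

Here the "K+ WHOLE BLOOD" row is still stored with its string intact, but a query on the concept never sees it, which is the cost of leaving the concept column at 0.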

We are going to be mapping from Epic, but I suspect each institution’s lab codes are local. We had already mapped our ancillary labs to LOINC, so we will get a direct feed from the ancillary rather than getting it from Epic and Clarity. Actually, I think we are using Cerner lab, but there, too, I think each institution has come up with its own coding scheme.

(Christian Reich) #68


You may not realize it, but this experience is pretty universal: Folks are faced with the duality of the original data (Clarity in your case) and a CDW (with often unclear business rules), and a myriad of different sources of data, and a long list of non-codified content. Welcome to the club. But the community is here to help.

Don’t become weak! :slight_smile: But you don’t have to:

  • The strings go into source_value of the record
  • Their mapping to standard concepts is a manual job at the moment. If somebody were to come up with an NLP solution and provide it to the OHDSI community - I’m all for it.
  • We are working on an online mapping tool that remembers what everybody else did. It’s not ready yet.
  • You can ask the vocabulary team to map your strings. But they have a long list of things. If you have a little money you could buy that service.

As for longer text - what @hripcsa said.

(Melanie Philofsky) #69

The EHR folks all get varying amounts of uncoded data. I’d lobby your institution to provide the code along with the string. Lobby hard because everything is much easier with some percentage of standard codes! :slight_smile:

But if you aren’t able to obtain the codes along with the strings, there are a few different options:

  1. OHDSI provides the Usagi tool to help with mapping. This is great if you have fewer than a few thousand terms; otherwise, it is very time consuming and, honestly, quite tedious. I highly suggest someone with medical terminology knowledge create the mappings because it’s not always straightforward. I use Usagi for mapping Colorado’s text strings. I also use Athena when I have 200 or fewer terms. Athena is not a mapping tool, but its easy UI allows me to explore the term connections and hierarchy for a given string. Parents & children are important when mapping.
  2. Only put the text string in the *_source_value field of the appropriate domain. You won’t be able to use any of the standardized tools or participate in network studies. And I don’t see any benefit to using the OMOP model if you don’t convert to standard concept_ids. But it is an option.
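
One way to size the manual mapping job before starting is to count how often each source string occurs and see how much of the data the existing mappings already cover. The Python sketch below is only an illustration under assumptions: the function name, the sample strings, and the idea of working through unmapped strings in order of frequency are mine, not part of Usagi or any OHDSI tool.

```python
from collections import Counter


def mapping_coverage(source_values, mapped_strings):
    """Return (fraction of records already mapped, unmapped strings by frequency)."""
    counts = Counter(source_values)
    total = sum(counts.values())
    mapped = sum(n for s, n in counts.items() if s in mapped_strings)
    # most_common() sorts by descending count, so frequent strings come first.
    todo = [(s, n) for s, n in counts.most_common() if s not in mapped_strings]
    coverage = mapped / total if total else 0.0
    return coverage, todo


# Made-up source strings standing in for one *_source_value column.
values = ["POTASSIUM, SERUM"] * 6 + ["K+ WHOLE BLOOD"] * 3 + ["GLUCOSE POC"]
coverage, todo = mapping_coverage(values, {"POTASSIUM, SERUM"})
```

With this sample data, 6 of the 10 records are covered, and "K+ WHOLE BLOOD" heads the to-do list because mapping it would cover the most remaining records.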

Per Christian:

George is correct.

Again, George is correct :slight_smile:

(Kristin Kostka, MPH) #70

@MauraBeaton – is it possible this wonderful EHR workgroup could be added to the OHDSI wiki page (https://www.ohdsi.org/web/wiki/doku.php?id=projects:overview)?

(Melanie Philofsky) #71

Thanks, @krfeeney! I didn’t know the WG wiki existed :slight_smile:

(Melanie Philofsky) #72

Hello Friends!

I am cancelling the EHR WG meeting on July 26th. Our next meeting will be Friday, August 9th.


(Melanie Philofsky) #73

Hello all!

Our next WG meeting is this Friday, August 23rd at 10am EST. We will be discussing the trials and tribulations of mapping Epic’s encounter data to the CDM Visit table.


  • The definition of an encounter is different from the definition of a Visit.

  • One Visit may contain multiple encounters.

Please come with real world data examples, so we can dig in and discuss this in detail!
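
To make the "one Visit, many encounters" point concrete, here is a small Python sketch of one possible rollup rule. The rule itself is an assumption chosen for illustration (merge encounters for the same person whose date ranges overlap); it is not Epic's definition or an OHDSI convention, and the Encounter class and sample dates are made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Encounter:
    person_id: int
    start: date
    end: date


def roll_up(encounters):
    """Merge each person's overlapping encounters into (person_id, start, end) visits."""
    visits = []
    for enc in sorted(encounters, key=lambda e: (e.person_id, e.start)):
        last = visits[-1] if visits else None
        if last and last[0] == enc.person_id and enc.start <= last[2]:
            # Overlaps the current visit: extend its end date if needed.
            visits[-1] = (last[0], last[1], max(last[2], enc.end))
        else:
            visits.append((enc.person_id, enc.start, enc.end))
    return visits


encounters = [
    Encounter(1, date(2019, 8, 1), date(2019, 8, 4)),    # hospital stay
    Encounter(1, date(2019, 8, 2), date(2019, 8, 2)),    # encounter during the stay
    Encounter(1, date(2019, 8, 20), date(2019, 8, 20)),  # separate later encounter
    Encounter(2, date(2019, 8, 2), date(2019, 8, 2)),    # different person
]
visits = roll_up(encounters)
```

The four encounters collapse into three visits because the second one falls inside the first. Real data will need more rules than date overlap, which is exactly what the examples people bring should help sort out.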