EHR data to OMOP CDM Work Group

hripcsa · July 5, 2019, 1:06pm

(Welcome, @hersh.)

MPhilofsky · July 9, 2019, 3:19pm

Welcome, @hersh!

You are in the correct place! OHDSI = standardization. The EHR WG discusses the ETL process, the ambiguous and impossible data we uncover during the ETL process, and ideas on how to handle the unique situations we encounter. Quite a few of us have Epic data, so we know the trials and tribulations you will most likely encounter. OHDSI also has many other workgroups for all things OHDSI, and we will point you in the correct direction. Or you can always post to the forum; people are very friendly here.

Come to our next meeting. It’s Friday, July 12th @ 7am PST. That is early for those on the West Coast, but this time received the most votes from interested participants. Send me the email addresses of those who are interested and I will add them to the meeting invite.

Free labor! Everyone who will work with the ETL process or the OMOP CDM should start by watching the tutorial videos on the OMOP CDM & Vocabulary and CDM ETL. Understanding the CDM & Vocabularies is absolutely necessary. This is not the typical ETL where one source field goes directly to the target field. The ETL is VERY complex and time-consuming for EHR data. And the converted data is amazing: standardized semantic representation, standardized format, standardized software tools, the ability to leverage the vocabularies for queries, a gigantic community of data partners who will run the study on their data, etc.

Please come to the Symposium September 15, 2019 where you will meet and collaborate with others. It’s an amazing and very diverse group of people all working towards making lives better by utilizing data, technology and research. Here’s the link for the tutorials held on Sept. 14th & 16th. In addition to the tutorials on the CDM & Vocabularies and the ETL process, there are also tutorials for Cohort Definition/Phenotyping, Data Quality, Patient Level Prediction, and Population Level Estimation. All the courses are taught by very knowledgeable faculty who are experts in their respective fields.

ella · July 9, 2019, 6:41pm

Is there a sub-group of this group for sharing Cerner to OMOP mappings? Regardless, please at least add me to this group’s email distribution list - ella.young@phsa.ca - thx!

krfeeney · July 9, 2019, 7:19pm

Great question @ella! @Daniella_Meeker had created a Cerner to OMOP group back in 2018 (https://www.ohdsi.org/web/wiki/doku.php?id=projects:workgroups:cerner_to_omop) which pre-dates this EHR group. Not sure how active this group is anymore but I know @mvanzandt’s team (@QI_omop @MichaelWichers) also have done Cerner conversions. Will follow-up with you offline to connect more dots across teams.

hersh · July 14, 2019, 12:12pm

Thank you for the detailed reply, Melanie. My interns and I are getting up to speed on OHDSI, and are trying to figure out how to manage our data.

One issue concerns terms from our EHR, such as lab test names or drug names. One challenge is that the data comes to us two steps removed from its source. That is, our institution has a research data warehouse, which itself is derived from Clarity, Epic’s data warehouse, whose data is derived from the original Epic records. Our source only gets the string of the data source, and not the coded identifier.

I know that OHDSI is very oriented to controlled terminology, and as an informatician I understand the value of such terminology. But I don’t see us easily getting there from the data we have, and I would instead like us to be able to capture the strings in the proper fields. Has anyone else wanted to do this and developed solutions for it?

Since our informatics research work is focused on text processing, we would also like to be able to preserve our complete notes.

Any advice on these issues would be greatly appreciated!

Bill Hersh

hripcsa · July 14, 2019, 1:37pm

Hi, @hersh. The OHDSI CDM has a single NOTE table to store all notes, with fields for specifying meta-information about the notes.

Then there is a NOTE_NLP table where we put parsed notes, with the idea that in the future, if you trust the parse enough, you can put the output into the corresponding domain table (condition, measurement, etc.).
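As a rough illustration of how a complete note and its parsed terms relate, here is a minimal Python sketch. It models only a handful of the NOTE and NOTE_NLP columns (check the names against the specification for your CDM version), and the `find_term` "parser" is a hypothetical substring matcher standing in for a real NLP system.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass
class Note:
    """A small subset of NOTE columns; the full note text is preserved."""
    note_id: int
    person_id: int
    note_date: date
    note_title: str
    note_text: str
    note_source_value: str


@dataclass
class NoteNlp:
    """A small subset of NOTE_NLP columns, one row per parsed term."""
    note_nlp_id: int
    note_id: int              # links back to Note.note_id
    snippet: str              # text surrounding the term
    lexical_variant: str      # the term exactly as written in the note
    note_nlp_concept_id: int  # 0 while the term is unmapped
    nlp_system: str


def find_term(note: Note, term: str, next_id: int = 1) -> list[NoteNlp]:
    """Hypothetical toy parser: one row per case-insensitive match of `term`."""
    rows: list[NoteNlp] = []
    haystack = note.note_text.lower()
    needle = term.lower()
    start = haystack.find(needle)
    while start != -1:
        end = start + len(needle)
        rows.append(NoteNlp(
            note_nlp_id=next_id + len(rows),
            note_id=note.note_id,
            snippet=note.note_text[max(0, start - 20):end + 20],
            lexical_variant=note.note_text[start:end],
            note_nlp_concept_id=0,
            nlp_system="toy-substring-matcher",
        ))
        start = haystack.find(needle, end)
    return rows
```

The original note stays untouched in `note_text`, and each parsed term points back to it through `note_id`, so the parse can be redone or promoted to a domain table later.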

Lab data go into the MEASUREMENT table, and the source code (which is a string in your case) goes into the measurement_source_value column of that table. Hopefully you can map source names into codes. The OMOP code for the LOINC term that they map to goes into the measurement_concept_id column. That’s the column that OHDSI studies use to identify which thing is being measured. If you don’t map at all, then you would put 0 in that column, which means that the OHDSI tools won’t easily pull those data (e.g., potassium over 3.4) and your users will need to search on things like K and potassium etc. and then figure out which ones are blood or urine or CSF.
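A minimal Python sketch of that convention, assuming a hand-built lookup from lab name strings to concept_ids: the string always goes into measurement_source_value, and measurement_concept_id falls back to 0 when no mapping exists. The dictionary, the function names, and the concept_id values are hypothetical placeholders, not real OMOP concept_ids.

```python
from __future__ import annotations

# Hypothetical local mapping from source lab names to OMOP concept_ids.
# The ids are placeholders for illustration only.
SOURCE_TO_CONCEPT = {
    "POTASSIUM, SERUM": 9000001,
    "SODIUM, SERUM": 9000002,
}


def to_measurement_row(person_id: int, lab_name: str, value: float,
                       mapping: dict[str, int] = SOURCE_TO_CONCEPT) -> dict:
    """Build a partial MEASUREMENT row; unmapped strings get concept_id 0."""
    key = lab_name.strip().upper()
    return {
        "person_id": person_id,
        "measurement_concept_id": mapping.get(key, 0),
        "measurement_source_value": lab_name,  # always keep the original string
        "value_as_number": value,
    }


def unmapped_fraction(rows: list[dict]) -> float:
    """Share of rows that standard-concept queries would not find."""
    if not rows:
        return 0.0
    unmapped = sum(1 for r in rows if r["measurement_concept_id"] == 0)
    return unmapped / len(rows)
```

Tracking the unmapped fraction gives a quick sense of how much of the lab data would be invisible to queries that filter on measurement_concept_id.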

We are going to be mapping from Epic, but I suspect each institution’s lab codes are local. We had already mapped our ancillary labs to LOINC, so we will get a direct feed from the ancillary rather than getting it from Epic and Clarity. Actually, I think we are using Cerner lab, but there, too, I think each institution has come up with its own coding scheme.

Christian_Reich · July 14, 2019, 1:39pm

@hersh:

You may not realize it, but this experience is pretty universal: Folks are faced with the duality of the original data (Clarity in your case) and a CDW (with often unclear business rules), and a myriad of different sources of data, and a long list of non-codified content. Welcome to the club. But the community is here to help.

Don’t become weak! But you don’t have to:

The strings go into source_value of the record
Their mapping to standard concepts is a manual job at the moment. If somebody were to come up with an NLP solution and provide it to the OHDSI community - I’m all for it.
We are working on an online mapping tool that remembers what everybody else did. It’s not ready yet.
You can ask the vocabulary team to map your strings. But they have a long list of things. If you have a little money you could buy that service.

As for longer text - what @hripcsa said.

MPhilofsky · July 15, 2019, 2:00pm

The EHR folks all get varying amounts of uncoded data. I’d lobby your institution to provide the code along with the string. Lobby hard because everything is much easier with some percentage of standard codes!

But if you aren’t able to obtain the codes along with the strings, there are a few different options:

OHDSI provides the Usagi tool to help with mapping. This is great if you have fewer than a few thousand terms; otherwise, it is very time-consuming and, honestly, quite tedious. I highly suggest that someone with medical terminology knowledge create the mappings because it’s not always straightforward. I use Usagi for mapping Colorado’s text strings. I also use Athena when I have 200 or fewer terms. Athena is not a mapping tool, but its easy UI allows me to explore the term connections and hierarchy for a given string. Parents & children are important when mapping.
Only put the text string in the *_source_value field of the appropriate domain. You won’t be able to use any of the standardized tools or participate in network studies. And I don’t see any benefit to use the OMOP model if you don’t convert to standard concept_ids. But it is an option.
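Since manual mapping effort grows with the number of distinct strings, one way to prepare is to rank the strings by how often they occur and map the most frequent ones first. The sketch below is only an illustration: the function names and the `source_name` / `frequency` column headers are my own assumptions, not a format required by Usagi or any other tool.

```python
import csv
import io
from collections import Counter


def source_term_frequencies(terms):
    """Count distinct non-empty strings, most frequent first (ties by name)."""
    counts = Counter(t.strip() for t in terms if t and t.strip())
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def to_csv(ranked):
    """Render the ranked (string, count) pairs as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source_name", "frequency"])
    writer.writerows(ranked)
    return buf.getvalue()
```

Matching here is case-sensitive and exact after trimming whitespace; whether to fold case or normalize punctuation before counting is a local decision.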

Per Christian:

George is correct.

Again, George is correct.

krfeeney · July 23, 2019, 12:24pm

@MauraBeaton – is it possible this wonderful EHR workgroup could be added to the OHDSI wiki page (https://www.ohdsi.org/web/wiki/doku.php?id=projects:overview)?

MPhilofsky · July 23, 2019, 3:19pm

Thanks, @krfeeney! I didn’t know the WG wiki existed.

MPhilofsky · July 26, 2019, 4:05am

Hello Friends!

I am cancelling the EHR WG meeting on July 26th. Our next meeting will be Friday, August 9th.

Melanie

MPhilofsky · August 21, 2019, 6:30pm

Hello all!

Our next WG meeting is this Friday, August 23rd at 10am EST. We will be discussing the trials and tribulations of mapping Epic’s encounter data to the CDM Visit table.

Background:

The definition of an encounter is different than the definition of a Visit.
One Visit may contain multiple encounters.

Please come with real world data examples, so we can dig in and discuss this in detail!
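To make the "one Visit may contain multiple encounters" point concrete, here is a small Python sketch that rolls encounters up into visits. The grouping rule (merge encounters for the same person whose date ranges overlap) is purely an assumption for illustration; it is not an OHDSI convention or anything this workgroup has agreed on, and the `Encounter` fields are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Encounter:
    encounter_id: str
    person_id: int
    start: date
    end: date


def group_into_visits(encounters: list[Encounter]) -> list[list[str]]:
    """Return one list of encounter ids per visit.

    Assumed rule: encounters of the same person with overlapping
    date ranges belong to the same visit.
    """
    visits: list[list[str]] = []
    current: list[str] = []
    current_person = None
    current_end = None
    ordered = sorted(encounters, key=lambda e: (e.person_id, e.start, e.end))
    for enc in ordered:
        same_visit = (
            bool(current)
            and enc.person_id == current_person
            and enc.start <= current_end
        )
        if same_visit:
            current.append(enc.encounter_id)
            current_end = max(current_end, enc.end)
        else:
            if current:
                visits.append(current)
            current = [enc.encounter_id]
            current_person = enc.person_id
            current_end = enc.end
    if current:
        visits.append(current)
    return visits
```

With this rule, a lab-draw encounter that falls inside an inpatient stay is absorbed into that stay, while a later standalone encounter becomes its own visit. Real data will need more rules (encounter types, care sites, gaps between encounters).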

OHDSI_User · August 23, 2019, 8:26pm

Hi @MPhilofsky. Are the minutes from today’s meeting being posted? Specifically the powerpoint that had the standards for visits? Thanks.

TMS · August 24, 2019, 9:23pm

Please add me, Tarun Shah - tmshah@ismnet.com.

MPhilofsky · August 28, 2019, 6:01pm

Meeting minutes are located here.

@Robert_Winter presented the powerpoint.

MPhilofsky · August 28, 2019, 6:10pm

You’ve been added!

TMS · August 30, 2019, 10:31am

I’m trying to convert EHR data to OMOP. Can anyone help me understand what condition_status_concept_id refers to? Also, how can we find these concept values in http://athena.ohdsi.org?

Thanks

rookie_crewkie · August 30, 2019, 11:06am

Hello @TMS,

Convention note #10 in the Condition Occurrence description on the wiki might be helpful.

TMS · August 30, 2019, 11:15am

@rookie_crewkie Thank you, that’s helpful !!

TMS · September 3, 2019, 12:13pm

Am I correct about the following, based on http://athena.ohdsi.org?

Lab -> Domain=Measurement, Class=Lab Test
Vitals -> Domain=Measurement, Class=Observable Entity
Radiology -> Domain=Measurement, Class=Clinical Observations

Also, what else can we map in the measurement table?
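One way to check domain and class combinations like these is to tally them from the tab-delimited CONCEPT file in an Athena vocabulary download. The sketch below runs on a few made-up rows (placeholder ids and names, only five of the file's columns); `class_counts` is a hypothetical helper, and the sample rows are not a statement about how any real concept is classified.

```python
import csv
import io
from collections import Counter

# Made-up sample rows in the tab-delimited layout of a CONCEPT file.
# Ids and names are placeholders, not real concepts.
SAMPLE_CONCEPTS = (
    "concept_id\tconcept_name\tdomain_id\tvocabulary_id\tconcept_class_id\n"
    "9000001\tExample A\tMeasurement\tLOINC\tLab Test\n"
    "9000002\tExample B\tMeasurement\tLOINC\tLab Test\n"
    "9000003\tExample C\tMeasurement\tSNOMED\tObservable Entity\n"
    "9000004\tExample D\tObservation\tSNOMED\tObservable Entity\n"
)


def class_counts(concept_file, domain_id):
    """Count concepts per concept_class_id within one domain."""
    reader = csv.DictReader(concept_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(
        row["concept_class_id"] for row in reader if row["domain_id"] == domain_id
    )


measurement_classes = class_counts(io.StringIO(SAMPLE_CONCEPTS), "Measurement")
```

Against a real download, pass an open file handle for the CONCEPT file instead of the `io.StringIO` sample; the resulting counter lists every class that actually occurs in the Measurement domain.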