[covid19] ETL help for converting your data into OMOP

Vojtech_Huser · March 30, 2020, 4:42pm

Just some notes from when I tried to find the concepts used for the results of the COVID19 RNA test.

To see the definition on the official server, you must first log in (see the upper right corner), and even after that you only see the text view of the definition.

See here:
https://atlas.ohdsi.org/#/cohortdefinition/108

To see it in the usual way, I exported the JSON from the atlas server and imported it into atlas-demo.
There you can see it with the codes:
http://atlas-demo.ohdsi.org/#/cohortdefinition/1773844

I was after the actual coded values for the result (there is "detected" via SNOMED and "detected" via LOINC).
These can be seen in the JSON here:
https://raw.githubusercontent.com/edward-burn/OHDSI-COVID19-HopitalisationsCharacterisation/master/COVID%20Cohort%20Diagnostics/inst/cohorts/COVID%20ID2%20v1.json

and via picture
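
For anyone who wants to pull the coded result values out of an exported definition without importing it into another ATLAS, here is a minimal sketch. It assumes the export keeps result concepts in lists under a `ValueAsConcept` key with `CONCEPT_ID` and `CONCEPT_NAME` fields; check that against your own export. The `sample` document is a tiny stand-in, not the real definition.

```python
import json


def find_value_concepts(node, found=None):
    """Recursively collect every concept listed under a "ValueAsConcept" key."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "ValueAsConcept" and isinstance(value, list):
                found.extend(value)
            else:
                find_value_concepts(value, found)
    elif isinstance(node, list):
        for item in node:
            find_value_concepts(item, found)
    return found


# Stand-in for an exported cohort definition (assumed layout, not the real file).
sample = json.loads("""
{"PrimaryCriteria": {"CriteriaList": [
  {"Measurement": {"CodesetId": 0,
                   "ValueAsConcept": [
                     {"CONCEPT_ID": 9191, "CONCEPT_NAME": "Positive"}
                   ]}}
]}}
""")

value_concepts = find_value_concepts(sample)
for concept in value_concepts:
    print(concept["CONCEPT_ID"], concept.get("CONCEPT_NAME"))
```

To run it on a real export, replace `sample` with `json.load(open(path))` for a locally saved copy of the definition.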

Vojtech_Huser · April 10, 2020, 4:00pm

If you are a site with COVID data, consider representing the result of the test properly. For the covid-tested-characterization study and for the prediction study, correctly detecting patients tested with negative results is important. In the measurement table, you may have this test represented

The guidance is to put the following SNOMED CT codes (for positive and negative) into value_as_concept_id

In the OHDSI concept id world that means
Athena for positive
Athena for negative

If you think this guidance is wrong, please post an opposing view and justification.

Note that the current definition (PhenoID 30) is working with MANY positive codes:
see a list here (in italic and in picture)

_PhenoID 30 on the final server is in ATLAS._
_To see all values better, use the corresponding definition on the covid19 dev server:_
http://atlas-covid19.ohdsi.org/#/cohortdefinition/451

aostropolets · April 10, 2020, 6:16pm

Vojtech, this is a good point. LOINC says how it should be, but it cannot force users to use these values in their ETL. So what we did was create a comprehensive list of positive results. It will catch whatever was used in an ETL and will not hurt anybody.

Vojtech_Huser · April 10, 2020, 10:23pm

Perhaps we can use a scenario (or patient story) and a preferred way to represent it in OMOP. Let me offer one scenario

John Doe had a dry cough starting on March 1. (at home)
He had a fever of 37.9C on March 2nd. (at home)
He had an outpatient visit on March 4th.
On March 4th, he got tested using an RNA test (nasal swab) (date of specimen collection)
On March 5th, the result came back and it was positive.
To facilitate his return to work, he was tested again (same approach, RNA test, nasal swab) on March 16th and the result was negative.

His EHR record had covid19 added to problem list as a result of positive test on March 5th.

Some notes on OMOP (open for debate)

CONDITION_OCCURRENCE

March 4, condition_concept_id http://athena.ohdsi.org/search-terms/terms/37311061

MEASUREMENT

March 4
Measurement_concept_id : https://loinc.org/94500-6/

Value_as_concept_id = http://athena.ohdsi.org/search-terms/terms/9191 or for March 16 http://athena.ohdsi.org/search-terms/terms/9189
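
To make the proposal concrete, here is a rough sketch of the rows for John Doe. The concept ids are the ones linked above; the year 2020, `person_id = 1` and the choice to keep the LOINC code in `measurement_source_value` (instead of looking up its standard concept id) are my assumptions for illustration only.

```python
from datetime import date

# Concept ids taken from the links in this post.
COVID_CONDITION = 37311061
POSITIVE, NEGATIVE = 9191, 9189
RNA_TEST_LOINC = "94500-6"  # source code only; standard concept id not looked up here

# person_id 1 and the year 2020 are assumptions for this example.
condition_occurrence = [
    {"person_id": 1,
     "condition_concept_id": COVID_CONDITION,
     "condition_start_date": date(2020, 3, 4)},
]

measurement = [
    {"person_id": 1,
     "measurement_source_value": RNA_TEST_LOINC,
     "measurement_date": date(2020, 3, 4),   # specimen collection date
     "value_as_concept_id": POSITIVE},
    {"person_id": 1,
     "measurement_source_value": RNA_TEST_LOINC,
     "measurement_date": date(2020, 3, 16),  # return-to-work retest
     "value_as_concept_id": NEGATIVE},
]

# Most recent test result for the person, e.g. for a "tested negative" cohort.
latest = max(measurement, key=lambda row: row["measurement_date"])
print(latest["measurement_date"], latest["value_as_concept_id"])
```

Note that the first measurement is dated by specimen collection (March 4), not by the day the result came back (March 5); that choice is part of what is open for debate.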

LINKS
https://github.com/OHDSI/Covid-19/issues
https://github.com/OHDSI/Covid-19/wiki

Similar approach for Lauren with endometriosis
https://github.com/OHDSI/Tutorial-ETL/tree/master/data/laurenCDM

Vojtech_Huser · April 13, 2020, 2:23pm

Thinking about this further: in the Condition_occurrence table, it would be good to distinguish an asymptomatic covid19 patient from one with severe covid19. One trial used a covid severity scale. I also found this mdcalc site (Brescia-COVID Respiratory Severity Scale (BCRSS)/Algorithm)

Per the SNOMED page (SNOMED CT Coronavirus Content - Announcements - SNOMED Confluence):

There are pre-release codes (not found in Athena) for covid-pneumonia and covid-ARDS, to be released in July 2020. Maybe we should embrace those. It looks like SNOMED has now embraced the pre-release notion, like LOINC (Yay!)

However, still no luck with asymptomatic covid19. Patient with positive PCR test but super mild or asymptomatic disease.

One option is to “record an observation about covid” and use the OBSERVATION table (Home · OHDSI/CommonDataModel Wiki · GitHub)
that offers

And term like
http://athena.ohdsi.org/search-terms/terms/4309345

How are sites doing NLP (in the OMOP model) dealing with a note indicating a fever of 39C three days prior to the outpatient visit start date? Do you populate the measurement table, with some special value in measurement_type_concept_id indicating “inferred from NLP” and a measurement_date of [visitDate-3days]?
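
To show what I mean by [visitDate-3days], a small sketch of such a row. This is not a recommendation: the visit date is borrowed from the John Doe scenario (year assumed), and `NLP_DERIVED_TYPE` is a placeholder because I do not know which type concept a site would pick.

```python
from datetime import date, timedelta


def inferred_measurement_date(visit_start_date, days_before):
    """Date of a finding that a note places N days before the visit start."""
    return visit_start_date - timedelta(days=days_before)


# Placeholder: the type concept meaning "inferred from NLP" is not settled here.
NLP_DERIVED_TYPE = None

fever_row = {
    "measurement_date": inferred_measurement_date(date(2020, 3, 4), 3),
    "value_as_number": 39.0,
    "unit_source_value": "C",
    "measurement_type_concept_id": NLP_DERIVED_TYPE,
}
print(fever_row["measurement_date"])
```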

Alexdavv · April 13, 2020, 8:35pm

Currently, we have a rich hierarchy of COVID forms (under this concept). It includes pneumonia, ARDS and asymptomatic. I’d not add SNOMED pre-release concepts since they may change the identifiers. Once SNOMED releases them, we will remap the temporary OMOP Extension concepts to SNOMED. Sounds good?

SNOMED UK added the following. Should that be enough?

concept_code	concept_name	semantic_tag
1300671000000104	COVID-19 severity scale	(assessment scale)
1300631000000101	COVID-19 severity score	(observable entity)
1300681000000102	Assessment using COVID-19 severity scale	(procedure)
1300591000000101	Low risk category for developing complication from COVID-19 infection	(finding)
1300571000000100	Moderate risk category for developing complication from COVID-19 infection	(finding)
1300561000000107	High risk category for developing complication from COVID-19 infection	(finding)

Vojtech_Huser · April 17, 2020, 2:31pm

There is a good question about PROCEDURE vs DEVICE_EXPOSURE for representing many of the steps in the care for covid19. Korea's notes are on the github. If there are US sites that have adopted a certain approach, posting your design choices here would help other sites.
This initiative will use OMOP (among others): https://covid.cd2h.org/N3C

krfeeney · April 18, 2020, 2:32pm

@Vojtech_Huser, yes! We know this initiative well. @hripcsa @cukarthik @Christian_Reich @clairblacketer @Andrew myself and others are all part of WGs on this initiative to make sure we take advantage of the community’s guidance and best practices.

schillil · May 4, 2020, 11:35pm

@krfeeney @cukarthik @andrew @clairblacketer @Christian_Reich Have you listed out common data elements for N3C? It doesn’t seem to be OHDSI’s typical working style, but it seems to be one of the requirements for this project. Thanks! Lisa

Vojtech_Huser · May 11, 2020, 6:23pm

In case we want to analyze use of antibody tests - I looked at the temp LOINC codes for that
See
https://rpubs.com/vojtech_huser/temp-loinc

Also - there is now SARS-COV2 Viral Load (just like HIV viral load).
See SARS coronavirus 2 RNA [Log #/volume] (viral load) in Unspecified specimen

Alexdavv · May 26, 2020, 6:05pm

We keep adjusting to the way COVID-related facts are described in the data. Addressing such things as suspected vs real Conditions, Emergency codes, Timing context, Lab tests hierarchy, pre-coordinations and many others, we’re happy to announce COVID-19 v2.0 Vocabulary Release that is already in Athena.

If you are involved in ETL or data analysis, please make sure to take a look at the changes. After the first version we updated some rules. It’s highly recommended to re-run ETL with the recent version since the interim vocabulary versions are not supported by current or initial instructions.

Instructions are available here

Alexdavv · May 26, 2020, 6:08pm

For COVID antibody tests please use 37310258 Measurement of 2019 novel coronavirus antibody with the descendants.
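
A minimal sketch of what "with the descendants" means in practice, using an in-memory SQLite database in place of a real CDM. The `concept_ancestor` join is the usual OMOP pattern; the ids 900001 and 900002 and all the rows are made up for illustration and are not real concepts.

```python
import sqlite3

ANTIBODY_TEST = 37310258  # Measurement of 2019 novel coronavirus antibody

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept_ancestor "
    "(ancestor_concept_id INTEGER, descendant_concept_id INTEGER)"
)
conn.execute(
    "CREATE TABLE measurement "
    "(measurement_id INTEGER, person_id INTEGER, measurement_concept_id INTEGER)"
)

# Toy vocabulary: the concept itself plus one made-up descendant (900001).
conn.executemany(
    "INSERT INTO concept_ancestor VALUES (?, ?)",
    [(ANTIBODY_TEST, ANTIBODY_TEST), (ANTIBODY_TEST, 900001)],
)
# Toy data: 900002 is an unrelated, made-up test and should not be picked up.
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [(1, 1, 900001), (2, 2, 900002), (3, 3, ANTIBODY_TEST)],
)

rows = conn.execute(
    """
    SELECT m.measurement_id
    FROM measurement m
    JOIN concept_ancestor ca
      ON ca.descendant_concept_id = m.measurement_concept_id
    WHERE ca.ancestor_concept_id = ?
    ORDER BY m.measurement_id
    """,
    (ANTIBODY_TEST,),
).fetchall()

antibody_measurement_ids = [row[0] for row in rows]
print(antibody_measurement_ids)
```

Because `concept_ancestor` also holds a row linking each concept to itself, the same join catches records coded with 37310258 directly.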

Vojtech_Huser · June 5, 2020, 9:45pm

CDC is recommending a nice (and complete, with result codes!) representation of covid19 tests

see excel file at
https://www.cdc.gov/csels/dls/sars-cov-2-livd-codes.html

tab LOINC mapping has:

Vojtech_Huser · September 24, 2020, 10:57pm

Good SNOMED guide for this is here

https://confluence.ihtsdotools.org/display/DOCCV19/COVID-19+Data+Coding+using+SNOMED+CT

one subpart
https://confluence.ihtsdotools.org/display/DOCCV19/2.4+Tests+and+Investigations#id-2.4TestsandInvestigations-Results

another https://confluence.ihtsdotools.org/display/DOCCV19/2.2+Patient+Demographics

Chris_Knoll · September 25, 2020, 1:26am

Sure we can, just provide one concept to represent ‘positive’ and don’t give them any other choices.

I have to disagree with this: it leaves us to do what Vojtech is showing: we have to put in every possible code that represents ‘present’. Why not just have ‘Positive’ and ‘Detected’ map over to ‘Present’? Or any other combo where there’s one ‘standard’ way to represent something is present.

Don’t think we can’t force people to do things, that’s what ‘use the standard’ means. But if we give them multiple standards to choose from, we miss the whole point of standardization.
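
A rough sketch of that idea at ETL time: every source spelling of a result collapses to one concept per meaning. Using 9191 and 9189 as the single targets follows the guidance earlier in this thread and is an assumption here; the point is only that there is one target. The source strings are hypothetical examples.

```python
# Single target concept per meaning (ids from earlier in this thread).
POSITIVE, NEGATIVE = 9191, 9189

# Hypothetical source spellings; a real site would list its own.
SOURCE_RESULT_TO_CONCEPT = {
    "positive": POSITIVE,
    "detected": POSITIVE,
    "present": POSITIVE,
    "negative": NEGATIVE,
    "not detected": NEGATIVE,
    "absent": NEGATIVE,
}


def to_value_as_concept_id(source_result):
    """Map a free-text lab result to one standard concept, or 0 if unmapped."""
    key = source_result.strip().lower()
    return SOURCE_RESULT_TO_CONCEPT.get(key, 0)


for text in ["Positive", " Detected ", "Not detected", "inconclusive"]:
    print(repr(text), "->", to_value_as_concept_id(text))
```

With this in the ETL, a cohort definition needs one positive code instead of a growing list.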

Vojtech_Huser · November 24, 2020, 10:22pm

From this paper https://jamanetwork.com/journals/jamapediatrics/fullarticle/10.1001/jamapediatrics.2020.5052

a good guidance on OMOP-PEDSnet representation for COVID is here
https://github.com/PEDSnet/Data_Models_Public/blob/master/PEDSnet/docs/COVID-19%20Cohort.md

The chief complaint concept is interesting.
Identification of healthcare workers is also very interesting https://github.com/PEDSnet/Data_Models_Public/blob/master/PEDSnet/docs/COVID-19%20Cohort.md#healthcare-workers

PEDSnet's view of type concepts seems to diverge from OHDSI's.
see