
What kind of data should we store in the Observation table?

We are stuck on the Observation table and need some help deciding what kind of data we should treat as an observation.
The ETL Conventions for the Observation table in the Book of OHDSI say:
“Records whose Source Values map to any domain besides Condition, Procedure, Drug, Measurement or Device should be stored in the Observation table.”
What does that mean?
We have the tables below:

  1. The Lab table contains lab test information (test name, test date, results, unit …)
    Some examples:

“Basic Screen( P2), Random” “2020-05-06” “137” “mmol/L”
“Urinalysis & Microscopy ( Urina Analysis)” “2020-05-06” “>=1.0” “mg/dL”
“Urinalysis & Microscopy ( Urina Analysis)” “2020-05-06” “Negative” “mg/dL”
“Urinalysis & Microscopy (Urina Analysis)” “2019-09-15” “Light Yellow”

  2. The Diagnosis table contains (code, name, voc_id, date)
    Some examples:

“E789” “Dyslipidemia” “D00006410” “1/27/2016”
“I10” “Essential hypertension” “D00009539” “1/27/2016”
“E119” “Type 2 diabetes mellitus without complication” “D00005982” “9/18/2016”
“E789” “Dyslipidemia” “D00006410” “1/27/2016”

  3. The medication table contains medicine information (name, date, type, days, quantity, …)
    Some examples:

“APIXN5” "Apixaban 5mg Film-coated Tablet (Eliquis) " “2021-01-10” “1” “Oral” “5.0” “10.0” “1.0” “mg”
“BTMVL” “Betamethasone Valerate 0.1% Scalp Lotion, 30 mL” “2020-12-15” “1” “External” “30”
“GNS1I3” “Granisetron HCl 1 mg/mL Injection, 3 mL Amp (LX)” “2021-08-03” “1” “Injectable” “3.0” “3.0” “1.0” “mg”

  4. The Surgery table contains surgery information (code, name, icd9, date, dept_code)
    Some examples:

“O00006204” “Ureteroscopic lithotripsy” “56” “2017-09-11” “UR”
“O00050148” “Repair of aneurysm in extremity” “2017-10-23” “IMN”
“O00006910” “Diagnostic hysteroscopy” “68.12” “2019-03-18” “OBGYN”

Do you see anything here that belongs in, or relates to, Observation?
An example of how to handle observations would be useful.

The tables you list will probably be sources for the 1) Measurement, 2) Condition Occurrence, 3) Drug Exposure and 4) Procedure Occurrence CDM tables. However, some of the source codes may map to observations. In general, good examples of observations are social history records, survey answers, ‘history of’ records, allergy records, etc. A reported smoking habit of 1 pack a day does not fit in tables like Condition or Procedure, but it is an important observation and should be captured somewhere. Does that make sense? Besides this, not every dataset will populate all the tables in the CDM. It may happen that you do not have data for, say, the Specimen table. Try to map the codes from tables 1-4 and look at the resulting domains.


I certainly agree with @Agnes_Wojciechowski’s assessment and use case. Smoking is the example that comes up for me as well.

In addition to the Book of OHDSI, I also constantly reference the following link. I’m pointing to v5.4 because that’s the version we’re using.

Welcome to the journey! Let us know how we can support you.

(OMOP CDM v5.4)

It is important to remember that the structure of your source data may not necessarily align with the OMOP CDM domain structure. Concepts in OMOP CDM are organized into domains by semantic role, and that may differ from how source classification systems organize them.

In your example, the Diagnosis table contains ICD10 codes directly, most of which would naturally map to a Standard concept in the Condition domain.

Your “E789” is likely either ICD10 or ICD10CM code E78.9, and in both cases you can follow the ‘Maps to’ relationship to get the Standard SNOMED concept, which indeed has the Condition domain.
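As a rough illustration of following ‘Maps to’, here is a minimal Python sketch over toy stand-ins for the CONCEPT and CONCEPT_RELATIONSHIP vocabulary tables. The column names follow the OMOP vocabulary tables, but every concept_id and the SNOMED row are made-up placeholders, and the dot-insertion helper assumes your source simply stores ICD10 codes without the dot. In a real ETL you would run the equivalent join against the vocabulary tables downloaded from Athena.

```python
# Toy stand-ins for the CONCEPT and CONCEPT_RELATIONSHIP vocabulary tables.
# All concept_id values and the SNOMED row are placeholders, NOT real Athena content.
CONCEPT = {
    1001: {"concept_code": "E78.9", "vocabulary_id": "ICD10CM",
           "domain_id": "Condition", "standard_concept": None},
    2001: {"concept_code": "placeholder-snomed-code", "vocabulary_id": "SNOMED",
           "domain_id": "Condition", "standard_concept": "S"},
}
CONCEPT_RELATIONSHIP = [
    {"concept_id_1": 1001, "concept_id_2": 2001,
     "relationship_id": "Maps to", "invalid_reason": None},
]


def add_icd_dot(raw_code):
    """Assumption: the source drops the dot, so 'E789' means 'E78.9'."""
    if "." in raw_code or len(raw_code) <= 3:
        return raw_code
    return raw_code[:3] + "." + raw_code[3:]


def find_source_concept(code, vocabulary_id):
    """Return the concept_id for a source code, or None if it is not in the vocabulary."""
    for concept_id, row in CONCEPT.items():
        if row["concept_code"] == code and row["vocabulary_id"] == vocabulary_id:
            return concept_id
    return None


def maps_to(source_concept_id):
    """Follow valid 'Maps to' relationships from a source concept to its targets."""
    return [
        rel["concept_id_2"]
        for rel in CONCEPT_RELATIONSHIP
        if rel["concept_id_1"] == source_concept_id
        and rel["relationship_id"] == "Maps to"
        and rel["invalid_reason"] is None
    ]


source_id = find_source_concept(add_icd_dot("E789"), "ICD10CM")
standard_ids = maps_to(source_id)
domains = [CONCEPT[cid]["domain_id"] for cid in standard_ids]
print(standard_ids, domains)
```

The domain you care about is the one on the Standard target concept, not the one on the source code.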

However, not all ICD10 codes are Conditions. There are a LOT of ICD10 Observations: personal and family histories, symptoms or abnormalities that are not quite diagnoses in themselves, details of care, and many other things. Your source data may still store them together in the Diagnosis table, often for good practical reasons of its own, but that does not have to carry over to OMOP.

There are many other examples of this. Source tables for lab tests will often contain records of medical imaging (which is Procedure), and medication tables will contain bandages and blood transfusions (both Device). This misalignment is unavoidable, because every system has its own definition of what a diagnosis, medical product, lab test or medical intervention is.

To understand which Domain a particular concept should have, look at what it maps to in OMOP CDM. Concepts that map to Condition domain concepts must ultimately become records in the CONDITION_OCCURRENCE table, regardless of which source table stored them. The same goes for the Observation domain and the OBSERVATION table.

In a technical sense, you will have to create additional interim lookup tables that link source concepts to their Standard Vocabulary map targets, and sort records into OBSERVATION, MEASUREMENT, DEVICE_EXPOSURE etc. according to the Standard concept’s domain_id.
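A minimal Python sketch of such an interim lookup might look like the following. The source codes, concept IDs and domain assignments are all hypothetical placeholders chosen to mirror the examples above (imaging in a lab table, a bandage in a medication table); in practice the lookup would be built by joining your mapping output to the CONCEPT table.

```python
from collections import defaultdict

# Hypothetical interim lookup: source code -> (standard concept_id, domain_id).
# Every code, ID and domain below is a placeholder for illustration only.
LOOKUP = {
    "LAB_SODIUM": (3001, "Measurement"),
    "LAB_CHEST_XRAY": (3002, "Procedure"),
    "MED_BANDAGE": (3003, "Device"),
    "DX_FAMILY_HISTORY": (3004, "Observation"),
}

# Source rows remember which source table they came from, but that table
# plays no part in deciding where they end up.
source_rows = [
    {"source_table": "lab", "source_code": "LAB_SODIUM"},
    {"source_table": "lab", "source_code": "LAB_CHEST_XRAY"},
    {"source_table": "medication", "source_code": "MED_BANDAGE"},
    {"source_table": "diagnosis", "source_code": "DX_FAMILY_HISTORY"},
    {"source_table": "surgery", "source_code": "UNKNOWN_CODE"},
]


def sort_by_domain(rows, lookup):
    """Group rows by the domain_id of their mapped Standard concept."""
    by_domain = defaultdict(list)
    unmapped = []
    for row in rows:
        hit = lookup.get(row["source_code"])
        if hit is None:
            # No mapping yet: set aside for manual review.
            unmapped.append(row)
            continue
        concept_id, domain_id = hit
        by_domain[domain_id].append({**row, "concept_id": concept_id})
    return dict(by_domain), unmapped


by_domain, unmapped = sort_by_domain(source_rows, LOOKUP)
for domain_id, rows in sorted(by_domain.items()):
    print(domain_id, [r["source_code"] for r in rows])
print("unmapped:", [r["source_code"] for r in unmapped])
```

Note that the two rows from the lab table land in different domains, which is exactly the misalignment described above.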

If you have a source concept that does not quite map to anything, you may need to consult the Book of OHDSI to assign a domain yourself. Again, it is not guaranteed to align with the source table structure, though it often will.


Thank you all for your support.
@Agnes_Wojciechowski We mapped these tables as you suggested; the Usagi job is done and the source_to_concept_map table has been created.
@Daniel_Smith @Eduard_Korchmar
The main question is: what determines which CDM table a given source record goes to?
Is it based on the concept it was mapped to in Usagi?

Suppose we have a concept in the lab table that belongs to Observation. How do we deal with this? Is it done with the Source to Standard query, or something else?
Thanks again.

Athena (the vocabularies) tells you:

See the “Domain” column? That tells you any event using this concept goes in CONDITION_OCCURRENCE. Every standard concept has a domain, and that tells you where to put it in the CDM.
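In code, that rule can be as small as a dictionary. The sketch below takes the Book of OHDSI convention quoted at the top of the thread literally: five domains get their own event table and everything else falls back to OBSERVATION. The function name is just an illustration, and a real ETL may need more entries if your data includes other domains with their own tables.

```python
# Domain of the mapped Standard concept -> CDM event table.
# Only the five domains named in the quoted convention are listed;
# anything else falls through to OBSERVATION (an assumption based on that quote).
DOMAIN_TO_TABLE = {
    "Condition": "CONDITION_OCCURRENCE",
    "Procedure": "PROCEDURE_OCCURRENCE",
    "Drug": "DRUG_EXPOSURE",
    "Measurement": "MEASUREMENT",
    "Device": "DEVICE_EXPOSURE",
}


def target_table(domain_id):
    """Pick the CDM table from the Standard concept's domain_id."""
    return DOMAIN_TO_TABLE.get(domain_id, "OBSERVATION")


# A lab-table row whose concept has the Observation domain still goes to OBSERVATION.
print(target_table("Measurement"))
print(target_table("Observation"))
```

So for a concept in your lab table that belongs to Observation, the source table does not matter: the record is written to OBSERVATION because that is what the concept’s domain says.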