I’d like to get a better understanding of what the `observation` table can “say” vs what the `conditions` and `measurements` tables can say.
My team at the University of Pennsylvania strives to make a distinction between data that has been gathered about a patient and the processes the patient participated in, like health care encounters, but also disease processes themselves.
The data I’m working with right now comes from the Synthea project, if that makes any difference. I’m using their OMOP ETL with very few modifications. I have listed my top 20 most frequent `conditions`, `measurements` and `observations` at the end of this post.
It seems like each `conditions` row could be a diagnosis, using SNOMED codes in this case (instead of ICD codes). Or am I supposed to interpret this as meaning that the patients were definitively suffering from the mentioned conditions?
It seems like many of the `measurements` are objective, continuous values that one would obtain with an instrument, like a meter stick, a sphygmomanometer, or a clinical chemistry analyzer. But I also see some more subjective, categorical (or self-reported) items in there, like pain severity and smoking status.
I see histories and allergies in `observations`… can’t many of those be expressed with SNOMED or ICD-10 Z codes, like you might find in `disorders`? I also see BMI-based obesity findings and smoking behavior, like you might find in `measurements`.
So what is the “unique selling point” of `observations`? Where can I read more about the intention or semantics of these tables?
Thanks for reading,
Mark
```sql
select
    cond.code,
    cept.concept_name,
    count(1) as condcount
from
    cdm_synthea10.conditions cond
join cdm_synthea10.concept cept on
    cond.code = cept.concept_code
group by
    cond.code,
    cept.concept_name
order by
    count(1) desc
limit 20
```
My join to the concept table is probably a little underconstrained here, since it matches on `concept_code` alone.
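A minimal sketch of one way that join could be tightened, assuming the codes in `conditions` are SNOMED codes and that the concept table carries the standard `vocabulary_id` column. It runs against an in-memory SQLite database; the toy rows, including the `OtherVocab` vocabulary and the "Made-up collision" concept, are hypothetical and only there to show how a code string shared between vocabularies would otherwise produce an extra group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table conditions (patient text, code text);
    create table concept (
        concept_id integer,
        concept_name text,
        vocabulary_id text,
        concept_code text
    );
    -- Hypothetical rows: the same code string exists in two vocabularies.
    insert into concept values
        (1, 'Viral sinusitis', 'SNOMED', '444814009'),
        (2, 'Made-up collision', 'OtherVocab', '444814009');
    insert into conditions values ('p1', '444814009'), ('p2', '444814009');
""")

# Join on concept_code alone, as in the query above.
unconstrained = """
    select cond.code, cept.concept_name, count(1) as condcount
    from conditions cond
    join concept cept on cond.code = cept.concept_code
    group by cond.code, cept.concept_name
    order by count(1) desc, cept.concept_name
"""

# Same join, additionally restricted to one vocabulary (an assumption).
constrained = """
    select cond.code, cept.concept_name, count(1) as condcount
    from conditions cond
    join concept cept
      on cond.code = cept.concept_code
     and cept.vocabulary_id = 'SNOMED'
    group by cond.code, cept.concept_name
    order by count(1) desc, cept.concept_name
"""

unconstrained_rows = conn.execute(unconstrained).fetchall()
constrained_rows = conn.execute(constrained).fetchall()
print(unconstrained_rows)  # one group per matching vocabulary
print(constrained_rows)    # only the SNOMED match remains
```

Whether such collisions actually occur in this Synthea load is not something the results below can confirm; the extra predicate simply rules them out.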
code | concept_name | condcount |
---|---|---|
444814009 | Viral sinusitis | 1200 |
195662009 | Acute viral pharyngitis | 721 |
72892002 | Normal pregnancy | 568 |
10509002 | Acute bronchitis | 531 |
162864005 | Body mass index 30+ - obesity | 468 |
38341003 | Hypertensive disorder | 294 |
15777000 | Prediabetes | 289 |
40055000 | Chronic sinusitis | 234 |
65363002 | Otitis media | 194 |
19169002 | Miscarriage in first trimester | 191 |
43878008 | Streptococcal sore throat | 152 |
44465007 | Sprain of ankle | 125 |
408512008 | Body mass index 40+ - severely obese | 109 |
55822004 | Hyperlipidemia | 92 |
196416002 | Impacted molars | 86 |
82423001 | Chronic pain | 83 |
55680006 | Drug overdose | 78 |
68496003 | Polyp of colon | 76 |
124171000119105 | Chronic intractable migraine without aura | 76 |
75498004 | Acute bacterial sinusitis | 74 |
```sql
select
    meas.measurement_concept_id,
    m_cept.concept_name as measurement_name,
    u_cept.concept_name as unit_name,
    count(1) as obscount
from
    cdm_synthea10.measurement meas
join cdm_synthea10.concept m_cept on
    meas.measurement_concept_id = m_cept.concept_id
join cdm_synthea10.concept u_cept on
    meas.unit_concept_id = u_cept.concept_id
group by
    meas.measurement_concept_id,
    m_cept.concept_name,
    u_cept.concept_name
order by
    count(1) desc
limit 20
```

measurement_concept_id | measurement_name | unit_name | obscount |
---|---|---|---|
43055141 | Pain severity - 0-10 verbal numeric rating [Score] - Reported | No matching concept | 12701 |
43054909 | Tobacco smoking status NHIS | No matching concept | 10484 |
3036277 | Body height | centimeter | 10344 |
3012888 | BP diastolic | millimeter mercury column | 10344 |
3004249 | BP systolic | millimeter mercury column | 10344 |
3025315 | Body weight | kilogram | 10344 |
3038553 | Body mass index | kilogram per square meter | 8997 |
3000483 | Glucose [Mass/volume] in Blood | milligram per deciliter | 4577 |
3051825 | Creatinine [Mass/volume] in Blood | milligram per deciliter | 4577 |
3032503 | Calcium [Mass/volume] in Blood | milligram per deciliter | 4577 |
3018572 | Chloride [Moles/volume] in Blood | millimole per liter | 4577 |
3014094 | Carbon dioxide, total [Moles/volume] in Blood | millimole per liter | 4577 |
3004295 | Urea nitrogen [Mass/volume] in Blood | milligram per deciliter | 4577 |
3000285 | Sodium [Moles/volume] in Blood | millimole per liter | 4577 |
3005456 | Potassium [Moles/volume] in Blood | millimole per liter | 4577 |
3022192 | Triglyceride [Mass/volume] in Serum or Plasma | milligram per deciliter | 3809 |
3027114 | Cholesterol [Mass/volume] in Serum or Plasma | milligram per deciliter | 3809 |
3009966 | Cholesterol in LDL [Mass/volume] in Serum or Plasma by Direct assay | milligram per deciliter | 3809 |
3007070 | Cholesterol in HDL [Mass/volume] in Serum or Plasma | milligram per deciliter | 3809 |
3004410 | Hemoglobin A1c (Glycated) | percent | 3395 |
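To look at the numeric versus categorical split among measurements mentioned earlier, one option is to count, per concept, how many rows carry a number and how many carry a coded answer. This is a rough sketch under stated assumptions: it relies on the OMOP `measurement` columns `value_as_number` and `value_as_concept_id`, it assumes unpopulated values are NULL rather than 0 (an ETL may do otherwise), and the rows are hypothetical toy data in an in-memory SQLite database, not Synthea output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table measurement (
        measurement_concept_id integer,
        value_as_number real,
        value_as_concept_id integer
    );
    -- Hypothetical rows; the coded answer id (1) is a placeholder.
    insert into measurement values
        (3036277, 172.5, null),   -- Body height, numeric
        (3036277, 168.0, null),   -- Body height, numeric
        (43054909, null, 1),      -- Tobacco smoking status, coded answer
        (43055141, 3.0, null);    -- Pain severity score, numeric
""")

# Count numeric and coded values separately for each measurement concept.
query = """
    select
        measurement_concept_id,
        sum(case when value_as_number is not null then 1 else 0 end) as numeric_rows,
        sum(case when value_as_concept_id is not null then 1 else 0 end) as coded_rows
    from measurement
    group by measurement_concept_id
    order by measurement_concept_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

This only separates numeric from coded values. It says nothing about whether a value was self-reported, so a pain score still counts as numeric here.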
```sql
select
    obs.observation_concept_id,
    cept.concept_name,
    count(1) as obscount
from
    cdm_synthea10.observation obs
join cdm_synthea10.concept cept on
    obs.observation_concept_id = cept.concept_id
group by
    obs.observation_concept_id,
    cept.concept_name
order by
    count(1) desc
```

observation_concept_id | concept_name | obscount |
---|---|---|
4060985 | Body mass index 30+ - obesity | 468 |
4256640 | Body mass index 40+ - severely obese | 109 |
4304110 | Allergy to mold | 86 |
439406 | Allergy to animal dander | 83 |
4302207 | Allergy to grass pollen | 73 |
4048169 | Allergy to house dust mite | 66 |
4306014 | Allergy to tree pollen | 62 |
433644 | Shellfish allergy | 50 |
4323208 | History of appendectomy | 41 |
4324181 | History of cardiac arrest | 41 |
45766064 | History of single seizure | 40 |
4174876 | Allergy to bee venom | 34 |
4240902 | Allergy to peanuts | 34 |
4219399 | Allergy to fish | 28 |
4169137 | Allergy to wheat | 26 |
438614 | Allergy to nut | 26 |
4163874 | History of myocardial infarction | 23 |
442116 | Allergy to eggs | 22 |
4102123 | Latex allergy | 21 |
4139681 | Allergy to dairy product | 19 |
42709996 | Smokes tobacco daily | 13 |
4038238 | Suspected lung cancer | 9 |
36684378 | Allergy to soy protein | 9 |
4058850 | H/O: lower limb amputation | 3 |
4168004 | Burn injury | 1 |