
Standardizing Units for Measurement


(Melanie Philofsky) #1

Use Case:

While investigating an Atlas study that returned very few Persons, I found that the Units for the specified Measurements in the inclusion criteria did not align with the Units for those Measurements in our source data. Upon further inspection of the source data, I found that a single Measurement could have 11 different unit_concept_ids. The Units for the Measurements of interest in Atlas aligned with LOINC’s example Units for Measurements, as Dmytry links in this thread. However, real-world data doesn’t always align with the suggestions.

Background:

@Vojtech_Huser completed a network study, Facilitating analysis of measurements data through stricter model conventions: Exploring units variability across sites, on the most frequent Unit used for a Measurement. He has created a CSV of the results with the preferred single units for 375 tests. He presented his work to Themis, and it was ratified a couple of years ago.

Open issues:

  1. How do we implement the “preferred” unit? Per Dima, “we need to create relationships from Measurement to preferred unit in concept_relationship table”.
  2. The original numeric result & unit_source_concept_id aren’t represented in the Measurement table (as numeric_source_value and unit_source_value). See Vojtech’s github issue found here.

Recommendations:

Standardize the Unit for the top 375 Measurements identified in Vojtech’s study. Then, as use cases arise, the community can submit issues to Github to include additional Measurements for Unit standardization.

  1. Create relationships to preferred concepts in the Concept Relationship table.
  2. Create a solution to store the original numeric results and units in the Measurement table.
  3. Provide ETL guidance on how to implement the value_as_number transformation from a unit_source_concept_id to the standardized unit_concept_id.
    a. Community sourced conversion math? Researchers currently have to convert numeric results into a standard to run studies. We should glean this information from available resources.
  4. Create DQD checks on the above.
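
As a rough illustration of recommendation 3, here is a minimal Python sketch of a lookup-based conversion at ETL time. Everything in it is an assumption for illustration: the `CONVERSIONS` table, the plain-text measurement and unit labels (a real ETL would key on concept IDs), and the `original_value` / `original_unit` fields, which stand in for wherever the source result ends up being stored. It only handles multiplicative factors; conversions that need an offset (temperature, for example) would need more than this.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup: (measurement, source unit) -> (preferred unit, multiplier).
# Plain labels are used for readability; they are not OMOP concept IDs.
CONVERSIONS = {
    ("body weight", "lb"): ("kg", 0.45359237),
    ("body weight", "g"): ("kg", 0.001),
    ("body weight", "kg"): ("kg", 1.0),
}


@dataclass
class MeasurementRow:
    measurement: str
    value_as_number: Optional[float]
    unit: Optional[str]
    # Hypothetical fields that preserve the source result.
    original_value: Optional[float] = None
    original_unit: Optional[str] = None


def standardize(row: MeasurementRow) -> MeasurementRow:
    """Convert to the preferred unit if a conversion is known, else return the row unchanged."""
    key = (row.measurement, row.unit)
    if row.value_as_number is None or key not in CONVERSIONS:
        return row
    preferred_unit, factor = CONVERSIONS[key]
    return MeasurementRow(
        measurement=row.measurement,
        value_as_number=row.value_as_number * factor,
        unit=preferred_unit,
        original_value=row.value_as_number,
        original_unit=row.unit,
    )


converted = standardize(MeasurementRow("body weight", 2000.0, "g"))
untouched = standardize(MeasurementRow("heart rate", 72.0, None))
```

Rows without a known conversion pass through unchanged rather than being dropped, so nothing is lost while the lookup table is still incomplete.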

Tagging: @Christian_Reich, @mik, @Alexdavv, @gregk


(Anna Ostropolets) #2

@MPhilofsky Thanks for bringing it up again! Indeed, measurements are a huge pain in any study. I’d imagine that a large ETL effort would be needed to adopt standardization practices. An additional issue, currently handled at the study design stage, is dealing with measurements that have no units and identifying implausibly high/low value-unit combinations.
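
To make the two checks above concrete (missing units and implausible value-unit combinations), here is a small Python sketch. The `PLAUSIBLE` table and its bounds are hypothetical placeholders, not clinically validated limits, and the labels are plain text rather than concept IDs.

```python
from typing import Optional

# Hypothetical plausible ranges keyed by (measurement, unit); illustrative numbers only.
PLAUSIBLE = {
    ("body weight", "kg"): (0.2, 650.0),
    ("heart rate", "/min"): (0.0, 400.0),
}


def check(measurement: str, value: float, unit: Optional[str]) -> str:
    """Classify one record as 'missing unit', 'no reference range', 'implausible', or 'ok'."""
    if unit is None:
        return "missing unit"
    bounds = PLAUSIBLE.get((measurement, unit))
    if bounds is None:
        return "no reference range"
    low, high = bounds
    return "ok" if low <= value <= high else "implausible"
```

A record such as `check("body weight", 70000.0, "kg")` would be flagged as implausible, which is often a hint that the value was actually recorded in a different unit.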

Another intermediate option that would be beneficial to the community is to create a library of measurement-unit pairs that can serve as a reference (for example, as a part of the PHOEBE recommender system). It seems that @Vojtech_Huser has done most of the work for that, so now it’s a matter of pushing his knowledge base to a more convenient place.


(Melanie Philofsky) #3

I’m curious about this process. Some Measurements don’t necessarily need Units because there is usually only one associated Unit (heart rate), but others need Units to be interpreted correctly (weight) and others are more ambiguous. Is there a standardized approach for this?


(Melanie Philofsky) #4

@Christian_Reich and/or @mik

Thoughts on the above?


(Vojtech Huser) #5

quick reply to your points

  1. you need a triplet. lab test, bad unit, preferred unit (and what you propose only takes a pair)
  2. this is almost solved. But there is one field missing (it has been submitted to CDM repo) (orig. units I think)
  3. there is AoU knowledge base for that. (I posted that link on the forum). N3C has also their solution to that (maybe also posted publicly somewhere)
  4. Clair knows that this is something I am super motivated to submit PRs to DQD for. It was not on recent priority shortlist. I think that this item is the best way to nudge the network little by little. Super motivated for this fourth item.

EDIT: links are
1 Converting lab results into preferred unit at ETL time
2 https://github.com/OHDSI/CommonDataModel/issues/259


(Christian Reich) #6

It’s @Vojtech_Huser’s proposal, essentially. For each test, create a standard unit. During ETL, all MEASUREMENT records would be normalized to this standard, and for that two things need to happen:

  • Determination of what unit is used in the source, if it is not clear. Which it often is not, believe it or not. In many cases, you can only guess from the distribution of values which unit it is likely to be.
  • Determination of the conversion factor.
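
One way to picture the "guess from the distribution" step is the Python sketch below: for each candidate unit, count what share of the observed values falls inside a plausible range for that unit, and pick the best-scoring candidate. This is only an illustration under stated assumptions; the `CANDIDATE_RANGES` bounds are hypothetical, and the function name is made up for this example.

```python
# Hypothetical plausible ranges for adult body weight in each candidate unit.
# Illustrative numbers only, not validated limits.
CANDIDATE_RANGES = {
    "kg": (30.0, 250.0),
    "g": (30_000.0, 250_000.0),
}


def guess_unit(values, candidate_ranges):
    """Return (unit, share): the candidate whose range covers the largest share of values.

    Returns (None, 0.0) if there are no values or no candidate covers any value.
    """
    best_unit, best_share = None, 0.0
    if not values:
        return best_unit, best_share
    for unit, (low, high) in candidate_ranges.items():
        share = sum(low <= v <= high for v in values) / len(values)
        if share > best_share:
            best_unit, best_share = unit, share
    return best_unit, best_share


unit, share = guess_unit([70500.0, 82000.0, 64000.0], CANDIDATE_RANGES)
```

This works when the candidate ranges barely overlap (kg versus g). For units whose plausible ranges overlap heavily, such as kg versus lb, the shares will be close and the guess should be treated as unreliable.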

But before we figure this out let’s plan on finishing the job of attaching standard units per LOINC concept.


(Melanie Philofsky) #7

Ok, Vojtech’s study identified standard units for the top 375 Measurements. Do we need more? Or is 375 sufficient?

Is the next step to create a Vocabulary Github issue? Or how do I move this along? It stalled long ago.


(Matt Spotnitz) #8

This is a very important issue. I agree that harmonizing units and reference intervals is a worthwhile area of research. Another harmonization challenge is that an assay for a biological or chemical entity can have more than one concept. Consequently, different sites may use different concepts for the same measurement. I recommend creating measurement concept sets to improve semantic interoperability of data from the measurement domain, and have done some preliminary research with that idea. Please let me know if you would like to collaborate on measurement concept harmonization in OHDSI. @cukarthik


(Vojtech Huser) #9

I would like to collaborate with you Matt on this


(Christian Reich) #10

We should find out from @aostropolets’ concept prevalence.

Done.

Correct. We need to de-standardize all but one Measurement concept of the same meaning. Github issue here.


(Vojtech Huser) #11

relevant DQD issue link is https://github.com/OHDSI/DataQualityDashboard/issues/176

and https://github.com/OHDSI/DataQualityDashboard/issues/112


(Melanie Philofsky) #12

As you are probably aware, PEDSnet has done some work in this area.

