
Canonical set of concept ids for common measurement values?

Our group receives data from several collaborators, all of whom have done their own, independent OMOP conversion of their EHR records. Our goal is to map all this data into our own internal database. We run into problems because different institutions use different concept ids for the same measurements.
For example:
BMI can be represented as 4245997 or 3038553
Body weight can be represented as 3025315 or 3013762
Diastolic BP can be represented by 3012888 or 4154790
PSA can be represented by 42529229 or 3013603 (not exactly the same concept, but the two are used interchangeably in EHRs)
Anti-Smith Ab can be represented by 3016921 or 3001263 (and 4 others, depending on who did the conversion).

This problem was detailed in this recent publication:

(We have a poster at the OHDSI symposium on this topic too.)

While some of the OMOP concepts that map to common measurements differ slightly, in a real-world implementation of an EHR dataset those differences are not material.

Is there a “canonical” set of concept ids that could be (or has been) implemented uniformly to avoid these issues?
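
To make the ask concrete, here is a minimal sketch of the kind of local harmonization table we mean, built only from the examples listed above. The names `EQUIVALENT_CONCEPTS`, `CANONICAL_ID`, and `harmonize` are hypothetical, and picking the first id in each group as the "canonical" one is an arbitrary assumption, not an OHDSI convention.

```python
# Hypothetical local harmonization table. Each group lists concept ids that
# arrive interchangeably from different sites; the first id in each group is
# ARBITRARILY treated as canonical (an assumption, not an OHDSI convention).
EQUIVALENT_CONCEPTS = {
    "BMI": [4245997, 3038553],
    "Body weight": [3025315, 3013762],
    "Diastolic BP": [3012888, 4154790],
    "PSA": [42529229, 3013603],
    # Only two of the anti-Smith Ab ids are listed in the post above.
    "Anti-Smith Ab": [3016921, 3001263],
}

# Reverse lookup: any member id -> the locally chosen canonical id.
CANONICAL_ID = {
    concept_id: ids[0]
    for ids in EQUIVALENT_CONCEPTS.values()
    for concept_id in ids
}


def harmonize(measurement_concept_id: int) -> int:
    """Return the locally chosen canonical id, or the input if it is unlisted."""
    return CANONICAL_ID.get(measurement_concept_id, measurement_concept_id)
```

A site-specific table like this is a stopgap; each group maintaining its own is exactly the duplication a shared canonical list would avoid.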



Thanks @mcantor2 for bringing this up. This clearly needs to be addressed.

@clairblacketer Is this something we have time to address in the CDM WG session at the global symposium? If we aren’t fixing the vocabularies, we do need to create guidance on choosing a SNOMED or LOINC concept for e.g. BMI (and an ICD10PCS or SNOMED concept for e.g. ‘US of Left Hand’). Is there an implicit convention for this already…?

More relevant threads:

Same issue, but in the Procedure domain:

@mcantor2 and @MaximMoinat ,

This is a vocabulary issue: more than one standard concept_id can represent the same idea. However, the newly revived Themis WG could give guidance on vital sign Measurements until the Vocabulary team is able to make one concept_id for an idea standard and declare the rest non-standard. Most vital sign data, from what I have seen, comes across as flowsheet data lacking any type of standard code. These all have to be custom mapped to a standard concept_id. Guidance on which standard concept_id to use would be useful for the community. I have also seen some vital sign data mixed in with the US EHR lab data, and those records have LOINC codes.

Regarding lab Measurement, Procedure or Modifier data, these usually come across as OHDSI-supported codes in the EHR and get converted to standard concept_ids via standard queries used in the ETL. For the data with OHDSI-supported codes, the Vocabulary team would need to make only one concept standard and declare the rest non-standard, if appropriate. The non-standard concepts would then be mapped to the standard one in the Concept Relationship table, allowing the data to transform automatically to the standard concept.
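
As a rough illustration of that last step, the sketch below resolves incoming concept ids through 'Maps to' rows using an in-memory SQLite database. The `concept_relationship` columns shown are a trimmed-down subset of the real table, `measurement_source` is a hypothetical staging table, and the two mapping rows are toy data: they assume, purely for illustration, that 3038553 had been made non-standard and mapped to 4245997, which is not the current state of the vocabulary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed-down CONCEPT_RELATIONSHIP (the real table has more columns).
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT
);
-- Hypothetical staging table holding incoming measurement concept ids.
CREATE TABLE measurement_source (person_id INTEGER, concept_id INTEGER);
""")

# Toy rows (assumption): pretend 3038553 was de-standardized and mapped to
# 4245997; the standard concept maps to itself.
conn.executemany(
    "INSERT INTO concept_relationship VALUES (?, ?, ?)",
    [(3038553, 4245997, "Maps to"), (4245997, 4245997, "Maps to")],
)
conn.executemany(
    "INSERT INTO measurement_source VALUES (?, ?)",
    [(1, 3038553), (2, 4245997)],
)

# Follow 'Maps to' so both people end up with the same standard concept id.
rows = conn.execute("""
    SELECT m.person_id, cr.concept_id_2 AS standard_concept_id
    FROM measurement_source AS m
    JOIN concept_relationship AS cr
      ON cr.concept_id_1 = m.concept_id
     AND cr.relationship_id = 'Maps to'
    ORDER BY m.person_id
""").fetchall()
```

With a single standard concept per idea, a join like this in the ETL would collapse the variants without any site-specific mapping.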


More relevant threads and a github issue:

Thanks for all the input and explanation. It seems like these have been open issues for a while (over a year) and, as @MPhilofsky wrote, there is a process to address them, but the process hasn’t happened.
I think the question is: how does one make this happen in an open source community?

@MaximMoinat the guidance is a good idea. I expect developers would be happy to have and use a standard list too.