
Populating source_value fields

We have source data that uses its own custom IDs to record diagnoses. The data provider supplies a custom lookup table that maps the custom IDs to ICD10 codes. We then map those ICD10 codes to target concept IDs in the OMOP CDM.

What should we store in condition_source_value? Custom IDs or ICD10 codes?

  • The custom IDs carry no meaning for the user, but they act as a link from the source data to the CDM.
  • The ICD10 codes have meaning and allow users to search by ICD10 code through the source_value field. However, storing them breaks the link between the source data and the CDM.

The description for the condition_source_value in the OMOP CDM Specification:
The source code for the condition as it appears in the source data. This code is mapped to a standard condition concept in the Standardized Vocabularies and the original code is stored here for reference.

The more opinions we have the better. Thank you!

You should distinguish between custom IDs that are simply lookups for ICD10 codes (i.e., they are just concept ids created to facilitate access to ICD10 and are otherwise meaningless) and custom IDs that are their own vocabulary. In CPRD, for example, the medcodes and prodcodes are meaningless numeric identifiers, so Read (for medcodes) and Gemscript (for prodcodes) are better choices for the source vocabulary data.

@Mark_Danese They are simply lookups for ICD10 codes.

I assume that you will be putting the OMOP codes for the ICD10-CM codes in the source concept id fields, so you are free to put the original source phrases in the source value fields, thus covering both bases.

George
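
A minimal sketch of the suggestion above, assuming CDM v5 field names for the condition table. The lookup tables, the custom ID "DX-0042", and the concept IDs are all hypothetical placeholders, not real OMOP concept IDs; in practice the two concept lookups would come from the Standardized Vocabularies (ICD10 code to its source concept, then that concept's "Maps to" relationship).

```python
# Hypothetical site lookup table: custom diagnosis ID -> ICD10 code.
CUSTOM_TO_ICD10 = {"DX-0042": "E11.9"}

# Hypothetical vocabulary extracts. The concept IDs below are placeholders,
# not real OMOP concept IDs.
ICD10_TO_SOURCE_CONCEPT = {"E11.9": 900001}       # ICD10 code -> ICD10 concept
SOURCE_TO_STANDARD_CONCEPT = {900001: 800001}     # "Maps to" -> standard concept


def build_condition_fields(custom_id):
    """Return the three condition fields for one source diagnosis record."""
    icd10 = CUSTOM_TO_ICD10.get(custom_id)
    # 0 is used when no matching concept is found.
    source_concept_id = ICD10_TO_SOURCE_CONCEPT.get(icd10, 0)
    standard_concept_id = SOURCE_TO_STANDARD_CONCEPT.get(source_concept_id, 0)
    return {
        "condition_concept_id": standard_concept_id,
        # The custom ID keeps the link back to the source data.
        "condition_source_value": custom_id,
        # The ICD10 concept keeps the record searchable by ICD10.
        "condition_source_concept_id": source_concept_id,
    }


row = build_condition_fields("DX-0042")
unmapped = build_condition_fields("DX-9999")
```

With this layout the custom ID stays in the source value field and the ICD10 meaning is carried by the source concept ID field; an unknown custom ID still yields a row, with both concept fields set to 0.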

@hripcsa This is an option, but unfortunately many of our clients still want version 4.0 and there aren’t any source_concept_id fields in that version of the CDM.

Perhaps this is good motivation to get them to adopt a current standard :)

t