Let’s prepare before the Vocabulathon by doing some work offline, so that we get a productive 4 hours in the meeting.
If you’re interested in solving this problem, please answer:
Please outline the use cases and problems with non-precise mappings.
How do they affect your phenotypes and research in general?
Which vocabularies are you working with?
If you have ideas on how to solve this problem, please share them.
How can you participate in the discussion (only online, or also in-person during the Symposium)?
By the end of the meeting on October 7th, we will know:
what kind of pain this problem causes for different organizations
possible solutions to this problem, and hopefully we can choose one to bring up to the OHDSI steering committee.
A couple of examples: Spotting in first trimester pregnancy.
When the source code is mapped to 2 concepts, one concept is chosen as the index event and the other as an inclusion criterion occurring on the same day. That’s not so bad when we look at one concept.
and we can’t do that simply by excluding all descendants of ‘disease in remission’, because not all cancer in remission is mapped to one code which has ‘disease in remission’ as a parent, so I listed the source codes instead.
We can’t easily exclude non-melanoma skin cancer, because it’s mapped uphill to malignant neoplasm.
Or we can’t exclude Migraine with cerebral infarction from the cerebral infarction phenotype, because it’s mapped to both Cerebral infarction and Migraine with aura. (Migraine with infarction is a confusing condition, and clinicians suggested removing it.)
In theory we can say ‘no migraine on the index date’, but it becomes too complicated for users, as they can’t track the resulting set of included source concepts.
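To make the migraine example concrete, here is a minimal Python sketch. The source-code labels and the `MAPS_TO` dictionary are simplified stand-ins I made up for illustration (real work would use concept_ids from the vocabulary tables), and the "strict" variant only approximates the ‘no migraine on the index date’ rule at the source-code level.

```python
# Hypothetical, simplified mapping table: source code label -> standard concepts.
MAPS_TO = {
    "ICD10CM: Migraine with cerebral infarction": {"Cerebral infarction", "Migraine with aura"},
    "ICD10CM: Cerebral infarction, unspecified": {"Cerebral infarction"},
    "ICD10CM: Migraine with aura": {"Migraine with aura"},
}


def source_codes_for(included, excluded=frozenset()):
    """Source codes with a mapping into `included` and no mapping into `excluded`."""
    included, excluded = set(included), set(excluded)
    return sorted(
        src for src, targets in MAPS_TO.items()
        if targets & included and not targets & excluded
    )


# A plain "Cerebral infarction" concept set silently pulls in the migraine code.
naive = source_codes_for({"Cerebral infarction"})
# Approximation of "no migraine on the index date": drop codes that also map to migraine.
strict = source_codes_for({"Cerebral infarction"}, excluded={"Migraine with aura"})

print(naive)
print(strict)
```

The point is that the user only sees the standard concepts in the concept set, while the list of source codes that actually flow in (`naive` vs. `strict`) has to be reconstructed separately.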
How to solve it
The idea I like the most: make ICD10CM concepts standard if they don’t have a SNOMED equivalent and if they represent a distinct clinical case, which means that ‘other’ and ‘unspecified’ terms will not be standard.
Then these concepts will get an Is_a relationship to the concepts they currently map to via Maps_to.
All source concepts that are mapped to several concepts and have a distinct meaning will become standard, as will concepts that are mapped uphill. We can detect concepts with uphill mappings using an LLM.
Then, the other ICD ontologies can be mapped to ICD10CM or SNOMED.
Note: I mostly work with US data, so I might be biased, and I’m open to other candidates for the standard terminologies.
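A rough Python sketch of the rule described above, just to show the mechanics. The `SourceConcept` class, the `promote` function, and the two flags are hypothetical names of mine, and the flag values are hand-set for illustration; in practice deciding "has a SNOMED equivalent" and "distinct clinical case" is the hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceConcept:
    name: str
    maps_to: tuple          # current Maps_to targets (standard concepts)
    has_equivalent: bool    # True if an equivalent SNOMED concept exists
    is_distinct: bool       # False for "other"/"unspecified" terms


def promote(concepts):
    """Apply the proposed rule and return (standard, is_a, maps_to)."""
    standard, is_a, maps_to = set(), [], []
    for c in concepts:
        if not c.has_equivalent and c.is_distinct:
            standard.add(c.name)
            # Former Maps_to targets become Is_a parents.
            is_a.extend((c.name, parent) for parent in c.maps_to)
            # Assumption: a promoted concept maps to itself, like other standard concepts.
            maps_to.append((c.name, c.name))
        else:
            # Everything else keeps its existing mappings.
            maps_to.extend((c.name, target) for target in c.maps_to)
    return standard, is_a, maps_to


concepts = [
    SourceConcept("Migraine with cerebral infarction",
                  ("Cerebral infarction", "Migraine with aura"), False, True),
    SourceConcept("Cerebral infarction, unspecified",
                  ("Cerebral infarction",), True, False),
]
standard, is_a, maps_to = promote(concepts)
print(standard)
print(is_a)
```

With the hierarchy links in place, a ‘Cerebral infarction and descendants’ concept set would still pick up the promoted concept, but it could now be excluded by name.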
Spotting in first trimester pregnancy: Cohorts need to be built with two separate criteria at any rate, because the data could contain spotting and pregnancy separately. Not sure we need this combo concept at all.
Malignancy except non-melanoma skin cancer: This is a combination concept that isn’t even properly defined, as “non-melanoma skin cancer” is not a thing. It is the same problem as “NOS”. Why can’t we build a cohort with “malignant neoplasm” and descendants plus excluding “basal cell carcinoma of skin” and descendants? Like in 1, we have to do that anyway.
Remission is an Episode. It should not be used as an attribute of a disease concept, because, as you said, there is no way we will ever have all cancers pre-coordinated with “in remission” or “in progression”. These are so-called “Disease Dynamic” Episodes.
Migraine with infarction: Again, the separate concepts need to be included and excluded anyway, because the data might contain them separately.
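For the malignancy point, a small Python sketch of "include with descendants, exclude with descendants", and of where an uphill mapping slips through. The three-concept hierarchy and both source labels are hypothetical toy data, not actual vocabulary content.

```python
# Toy hierarchy (parent -> children); hypothetical and heavily simplified.
CHILDREN = {
    "Malignant neoplasm": ["Basal cell carcinoma of skin", "Malignant melanoma of skin"],
    "Basal cell carcinoma of skin": [],
    "Malignant melanoma of skin": [],
}


def descendants(concept):
    """Return the concept plus everything below it in the hierarchy."""
    found, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(CHILDREN.get(current, []))
    return found


def resolve(include, exclude):
    """Resolve a concept set: included subtrees minus excluded subtrees."""
    included = set().union(*(descendants(c) for c in include))
    excluded = set().union(*(descendants(c) for c in exclude))
    return included - excluded


# Hypothetical source codes: one mapped precisely, one mapped uphill to the root.
MAPS_TO = {
    "source: basal cell carcinoma of nose": "Basal cell carcinoma of skin",
    "source: non-melanoma skin cancer (uphill)": "Malignant neoplasm",
}

concept_set = resolve(["Malignant neoplasm"], ["Basal cell carcinoma of skin"])
captured = sorted(src for src, target in MAPS_TO.items() if target in concept_set)
print(captured)
```

The exclusion works for the precisely mapped code, while the uphill-mapped code still lands in the cohort because its only target is the root concept.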
Bottom line: You are providing several categories of problems with mapping of complex concepts:
AND-combos (spotting and pregnancy): just split them up and create separate inclusion criteria.
AND NOT-combos (non-melanoma skin cancer): do they actually exist as concepts? If they do, no mapping will fix that.
Combination of attributes that live in different domains: These have a problem if there is no way to link them (which in cancer we put in place). But if they don’t have a link mechanism, the only solution I see is OMOP (or SNOMED) Extension.
Thanks, @Christian_Reich.
I fixed the link.
Cancer excluding non-melanoma skin cancer wasn’t a concept but a phenotype we had.
I probably need to find better examples, where a concept representing one clinical idea is mapped to several concepts, or to one concept with a loss of significant information.
Thanks for starting this thread, @Dymshyts ! Count me in.
My use case (shared by my team and Boehringer): I want to create a concept set using standard concepts for a given condition. However, there are source concepts mapped directly to the “root” standard concept for that condition which I do not want to include in my concept set. This often occurs when specific ICD10-CM codes are mapped to a less-specific standard concept. In this case we are forced to create a source concept set for that condition. I can compile a list of specific examples ahead of the Symposium, if it’d be useful.
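As a sketch of what that workaround looks like in practice, assuming the mappings are available as simple (source code, standard concept) pairs. All names here ("Condition X", the `SRC.*` codes, both helper functions) are placeholders I made up, not real vocabulary content.

```python
# Hypothetical (source_code, standard_concept) pairs for one condition.
MAPPINGS = [
    ("SRC.1 specific subtype A", "Condition X"),        # specific code mapped to the root
    ("SRC.2 unrelated variant B", "Condition X"),       # specific code mapped to the root
    ("SRC.3 condition X, typical", "Condition X, typical form"),
]


def root_mapped_sources(root, mappings):
    """List source codes that map straight to the root concept, for manual review."""
    return sorted(src for src, target in mappings if target == root)


def source_concept_set(targets, mappings, unwanted):
    """Fallback source concept set: every code mapped into `targets`, minus `unwanted`."""
    return sorted(
        src for src, target in mappings
        if target in targets and src not in unwanted
    )


to_review = root_mapped_sources("Condition X", MAPPINGS)
final = source_concept_set(
    {"Condition X", "Condition X, typical form"},
    MAPPINGS,
    unwanted={"SRC.2 unrelated variant B"},
)
print(to_review)
print(final)
```

The review list is where a human (or an LLM-assisted check) decides which root-mapped codes do not belong; the resulting source concept set then has to be maintained by hand.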
Solution ideas: I like the idea of using LLMs to evaluate existing mappings and propose corrections and/or new extension concepts as applicable. The most recent GenAI Workgroup meeting discussed this use case: