Condition mapping improvement using SNOMED Extension proposal

@Christian_Reich, @ericaVoss, @aostropolets, @abedtash_hamed, @Eldar, @Chris_Knoll, all

The proposal:
Add SNOMED-like concepts, with SNOMED-like relationships and hierarchy,
that allow us to make accurate condition mappings.

The rationale is below.

Mappings from ICD9CM and ICD10CM are generally good and equivalent.

But sometimes we run into problems caused by mismatches between the ontologies.

Here are examples of mappings where we can’t find an exact match.
The cases below are grouped by the mapping approach used:

Case 1. Mapping to a more general concept

M99.03 “Segmental and somatic dysfunction of lumbar region” Maps to 203708004 “Segmental and somatic dysfunction”
M99.00 “Segmental and somatic dysfunction of head region” Maps to 203708004 “Segmental and somatic dysfunction”
G96.11 “Dural tear” Maps to 15758002 “Disorder of meninges”
I77.812 “Thoracoabdominal aortic ectasia” Maps to 26660001 “Dilatation of aorta”

Advantage:
Although we don’t have a corresponding concept, we can still put the condition into the CDM. This works well when the cohort definition uses general criteria (e.g. Segmental and somatic dysfunction regardless of region, Dilatation of aorta regardless of region).
Disadvantage:
It does not work at all if somebody wants to find patients with “Segmental and somatic dysfunction of head region”. Instead, the cohort will include all patients with Segmental and somatic dysfunction of any region, which gives a much bigger count.
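
To make the disadvantage concrete, here is a minimal Python sketch. The person IDs and the small in-memory tables are hypothetical stand-ins for the CDM; only the ICD10CM and SNOMED codes come from the examples above.

```python
# "Maps to" table: ICD10CM source code -> standard SNOMED code (from Case 1).
MAPS_TO = {
    "M99.00": "203708004",  # head region   -> Segmental and somatic dysfunction
    "M99.03": "203708004",  # lumbar region -> Segmental and somatic dysfunction
}

# Hypothetical source records: (person_id, ICD10CM code).
source_records = [(1, "M99.00"), (2, "M99.03"), (3, "M99.03")]

# Simplified ETL: only the standard code is kept for cohort building.
condition_occurrence = [(pid, MAPS_TO[code]) for pid, code in source_records]


def cohort(standard_code):
    """Return the persons who have the given standard concept."""
    return sorted({pid for pid, code in condition_occurrence if code == standard_code})


# Someone who wants only "head region" cannot select anything narrower than
# the general concept, so persons 2 and 3 (lumbar region) come along too.
print(cohort("203708004"))
```

Only person 1 was coded with the head-region code, yet the standard concept cannot tell the three persons apart.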

Case 2. Mapping to several concepts describing the condition and a co-occurring condition

E10.321 “Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema” Maps to 138881000119106 “Mild nonproliferative retinopathy due to type 1 diabetes mellitus”
E10.321 “Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema” Maps to 312912001 “Diabetic macular edema”

G43.001 “Migraine without aura, not intractable, with status migrainosus” Maps to 230467008 “Status migrainosus”
G43.001 “Migraine without aura, not intractable, with status migrainosus” Maps to 56097005 “Migraine without aura”

Advantage: although we don’t have a corresponding concept, we can fully represent the patient’s condition in the CDM; both conditions will have the same visit_occurrence_id (if given) and the same visit_start_date and visit_end_date.
So if the cohort definition targets exactly the patients with “Migraine without aura, not intractable, with status migrainosus”, you need to find patients with descendant concepts of both “Migraine without aura” and “Status migrainosus” within the same visit_occurrence_id (if given) and the same visit_start_date and visit_end_date.
Disadvantage: Grouping by visit_occurrence_id (if given), visit_start_date and visit_end_date makes adding patients to cohorts more complicated and less transparent, and it also requires modifying Atlas queries.
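
As a rough illustration of the extra grouping step, here is a Python sketch. The rows, person IDs and visit IDs are hypothetical, and the descendant sets are reduced to a single code each; in a real CDM they would come from concept_ancestor.

```python
from collections import defaultdict

# Simplified descendant sets (one SNOMED code each, taken from Case 2).
MIGRAINE_WITHOUT_AURA = {"56097005"}
STATUS_MIGRAINOSUS = {"230467008"}

# Hypothetical rows: (person_id, visit_occurrence_id, SNOMED code).
condition_occurrence = [
    (1, 10, "56097005"), (1, 10, "230467008"),  # both in the same visit
    (2, 20, "56097005"), (2, 21, "230467008"),  # same person, different visits
    (3, 30, "56097005"),                        # migraine only
]


def persons_with_both(rows, set_a, set_b):
    """Persons who have a concept from each set within a single visit."""
    by_visit = defaultdict(set)
    for person_id, visit_id, code in rows:
        by_visit[(person_id, visit_id)].add(code)
    return sorted({
        person_id
        for (person_id, _), codes in by_visit.items()
        if codes & set_a and codes & set_b
    })


print(persons_with_both(condition_occurrence, MIGRAINE_WITHOUT_AURA, STATUS_MIGRAINOSUS))
```

Person 2 has both conditions but in different visits, so only person 1 qualifies. This is the per-visit grouping that a plain "concept plus descendants" criterion does not do by itself.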

Case 3. Mapping to a same-meaning concept but with a different hierarchy (ICD10CM and SNOMED sometimes use different principles for hierarchy)

E10.21 “Type 1 diabetes mellitus with diabetic nephropathy” Maps to 421893009 “Renal disorder associated with type 1 diabetes mellitus”
421893009 “Renal disorder associated with type 1 diabetes mellitus” doesn’t have “Type 1 diabetes mellitus” as an ancestor; it actually belongs to the Renal disease branch.
So a cohort of Type 1 diabetes mellitus patients built from the descendant concepts of “Type 1 diabetes mellitus” misses the patients with “Renal disorder associated with type 1 diabetes mellitus”. Thus the user needs to build a complex cohort definition that handles diabetic complications separately.

F11.1 “Opioid abuse” Maps to 5602001 “Opioid abuse” 438130
while F11 “Opioid related disorders” Maps to 14784000 “Opioid-induced organic mental disorder”
And again, 5602001 “Opioid abuse” isn’t a child of 14784000 “Opioid-induced organic mental disorder”. So to find all the patients who used opioids, the user needs to build a complex cohort definition rather than just take all the descendants of the standard concept that F11 “Opioid related disorders” maps to.

As you can see, an ontology mismatch can lead to a wrong cohort definition.
Such cases are not very frequent, but the user still needs to be aware of them, so the mappings have to be checked every time a cohort is created.

Obviously, a decision needs to be made.

From the vocabulary user’s perspective, the best decision is to add SNOMED-like concepts (a SNOMED Extension) that exactly match the given source vocabulary.
In this case the vocabulary user doesn’t need any additional queries or knowledge of mapping peculiarities; they just use the usual concept_relationship to go from the non-standard concept to a standard one.

These SNOMED-like concepts will have hierarchical relationships, as SNOMED concepts do. Thus, using concept_ancestor, the user will get an accurate cohort definition.
Let’s consider this for the cases listed above:

Case 1
now:
M99.03 “Segmental and somatic dysfunction of lumbar region” Maps to 203708004 “Segmental and somatic dysfunction”
SNOMED Extension way
M99.03 “Segmental and somatic dysfunction of lumbar region” Maps to SnExt1 “Segmental and somatic dysfunction of lumbar region”
this concept has 203708004 “Segmental and somatic dysfunction” as a parent concept

Case 2
Now:
E10.321 “Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema” Maps to 138881000119106 “Mild nonproliferative retinopathy due to type 1 diabetes mellitus”
E10.321 “Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema” Maps to 312912001 “Diabetic macular edema”
SNOMED Extension way
E10.321 “Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema” Maps to SnExt2 “Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy with macular edema”
SnExt2 has 138881000119106 and 312912001 as parent concepts.

Case 3
E10.21 “Type 1 diabetes mellitus with diabetic nephropathy” Maps to SnExt3 “Type 1 diabetes mellitus with diabetic nephropathy”, which has parents from both hierarchies:
“Diabetes type 1” and “Disorder of Kidney”
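
A small Python sketch of how a multi-parent extension concept would show up under both branches once the ancestor closure is built. This is only an illustration: "SnExt3" is the placeholder from above, the "Diabetes mellitus" parent link is an assumed example, and `build_concept_ancestor` is a hypothetical helper, not the actual OMOP vocabulary build process.

```python
from collections import defaultdict

# Parent links (child -> parents). "SnExt3" is the placeholder extension
# concept from Case 3; the "Diabetes mellitus" link is assumed for illustration.
PARENTS = {
    "SnExt3": ["Diabetes type 1", "Disorder of Kidney"],
    "Diabetes type 1": ["Diabetes mellitus"],
}


def build_concept_ancestor(parents):
    """Return {ancestor: set of descendants}, each concept included in its own set."""
    concepts = set(parents) | {p for ps in parents.values() for p in ps}
    descendants = defaultdict(set)
    for concept in concepts:
        # Walk upward from the concept and register it under every ancestor.
        stack, seen = [concept], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            descendants[node].add(concept)
            stack.extend(parents.get(node, []))
    return descendants


concept_ancestor = build_concept_ancestor(PARENTS)
print(sorted(concept_ancestor["Diabetes type 1"]))
print(sorted(concept_ancestor["Disorder of Kidney"]))
```

With both parents in place, a cohort built from the descendants of either “Diabetes type 1” or “Disorder of Kidney” picks up SnExt3 without any special handling.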

@abedtash_hamed case

C34 “Malignant neoplasm of bronchus and lung” is now mapped to “Malignant neoplasm of respiratory system” (not specific),
so we’ll map it to a SNOMED Extension equivalent, which will have the children
93734005 “Primary malignant neoplasm of bronchus”
and 93880001 “Primary malignant neoplasm of lung”

The first step is to build a hierarchical structure;
the second step is to allow post-coordination, which cannot be implemented in the CDM itself.
For example, now:
C90.00 “Multiple myeloma not having achieved remission” Maps to 109989006 “Multiple myeloma”
so we lose the information about remission status.
If we create a new concept “Multiple myeloma not having achieved remission”, it will have the parent concept 109989006 “Multiple myeloma” and a link to the qualifier 03336008 “No mention of remission”.

This approach looks quite time-consuming, but it will solve all the condition-mapping problems.
Step 1, building the hierarchy, is relatively easy because we already have mappings to multiple concepts;
we can also identify the cases mapped to a more general concept and build the hierarchy from them.

Please let me know your thoughts and suggestions.

Great idea! We love the concept of Extensions :smile:
Sure, it will be not only time-consuming but also rather complicated in terms of building a proper hierarchy. Still, it seems to be a step we must take. For instance, ICD10CM R34 ‘Anuria and oliguria’ is currently mapped to ‘Finding of urine output’ because we don’t have a decent classification concept in SNOMED. The latter has ‘Polyuria’ as a child, which may lead to over-recruitment.
Another thing we need to pay attention to is identifying the concepts that are missing in SNOMED while taking into account syntactic heterogeneity across SNOMED and the ICDs. In other words, we need to make sure that we don’t already have ‘Multiple myeloma not having achieved remission’ in SNOMED written in another way.

Thank you. I like the idea of extensions too - as this will enhance precision.

Instead of creating SNOMED extensions, why don’t we make the non-standard concepts that don’t have an equal mapping to SNOMED standard concepts? Then we map the SNOMED code with the closest meaning as either an ancestor or a descendant.

This same approach would work for NDC to RxNorm

We tried that in RxNorm. And suffered such a debacle, that we came around, jumped over our shadow, and decided that finally we will create our own OMOP vocabularies. Before that, it was a strong dogmatic no-no.

The problem is that we control neither the content of these Source Concepts, nor their life cycle. So, good idea, but didn’t work.

Thanks. Just wanting to learn from your experience… how did the lifecycle of these external codes create problems?

@Dymshyts:

Even though I am VERY careful with adding new vocabularies, my heart is slowly warming up with the SNOMED Extension. As long as we don’t really invent anything, but just combine existing SNOMED concepts for those situations where we lose information during mapping. From your examples, I can see how it can help some of the cases:

  • Your Case 1 really is a case where the combination of attributes of one disease is not pre-coordinated in SNOMED. Makes sense.
  • Your Case 2 really is a case where two conditions are not pre-coordinated in SNOMED. Not sure why mapping to two different SNOMEDs is not sufficient.
  • Your Case 3 really is a case where the mapping is fine, except the hierarchies don’t match, and therefore the Concepts are missing from the hierarchy. Isn’t that usually really the same as Case 2?

What about “Disease of the lumbar region” as another parent?

They do whatever they want: Come and go, change, have unintelligible concept_names, are ambiguous (two of them exist with the same meaning). Only a few vocabularies are really clean: RxNorm, SNOMED, LOINC. The rest is messy.

Hi @Dymshyts, excellent topic, I have a lot of thoughts on this, but I’ll try to keep them brief and focused:

I’m hoping we can abolish this case. I can count about a dozen times where we wanted to do a study using the codes above, for example ‘segmental and somatic dysfunction of head’, and when we did the code review with clinicians they asked why we were covering codes related to the lumbar region, and we had to make a workaround for the vocabulary (usually using source codes, which is a nightmare). My understanding of what makes a concept ‘standard’ is that it is the unique idea across many vocabularies that was chosen as the primary avatar of the particular medical concept. If we are faced with a situation of ‘we don’t have a specific concept for this, but we have another, broader concept we can map it to’, that means we don’t have a ‘unique concept that can serve as the primary avatar of the particular medical concept’, and we should find one, pick it, mark it standard, and map to it. I am sorry I sound so heavy-handed on this, and I understand that could entail a ton of work, but this sounds like exactly what SNOMED ext is doing, so we should absolutely fold it in.

This is exactly why OMOP needs an OMOP vocabulary that governs all unique medical concepts that exist. I understand that’s a Herculean effort and we don’t have the resources for that, but as long as we lean on someone else’s work that has no consideration for the current standards of the CDM, we’re always going to be at risk of some kind of failure when the source vocabularies make some seemingly arbitrary change. But this is another topic entirely, and I digress.

I’m with @Christian_Reich here: combo codes get 2 maps and 2 records in the CDM. From your example, you could be looking for Type 1 Diabetes mellitus, or you could be looking for macular edema. You should not have to look for both together to find the person who was coded this way. However, if you do want to find both together, you’d write your query to say you want to find those conditions together (either same day or same visit). Is there any difference between a person who went into one room where the clinician said ‘you have type 1!’ and was then shuttled to another department where they said ‘You have macular edema!’, and a person who went into an office and had someone tell them ‘You have T1DM with macular edema!’? The first case was 2 visits, the second case was 1 visit, but don’t they both have both? That’s why I wouldn’t conflate a complicated cohort definition with organizing the vocabulary. Let the vocabulary identify the ideas, and let the researcher decide if the coincidence of these things is important to their scientific question.

I’ve run into this before, and I’ve come to appreciate this element of SNOMED: they focus on one thing at a time (usually)… The renal disorder related to T2DM is not a diagnosis of T2DM (stay with me on this, I’ll use another example that might be clearer). Of course once you have T2DM you are stuck with it, so it makes sense that if a person has ‘Renal disorder resulting from the T2DM condition’ you can say this is a T2DM person. Consider another case: a disorder that remains after the causing disease is gone. Let’s consider the example of a diagnosis of ‘Cirrhosis from HepC’. Does this person have HepC as of the finding of the liver damage? Not necessarily. The person could be HepC clean but only much later identify the damage to the liver. But the SNOMED concept of ‘liver damage from HepC’ isn’t about identifying HepC conditions, it’s about identifying the liver damage. So, Case 3 is a non-case. It’s up to the researcher to decide if liver damage from HepC means ‘this person is currently suffering from HepC in the body’. And if that’s what the researcher wants, they should look for a direct HepC diagnosis and any disorders from HepC. If they only want the direct diagnosis of HepC, just look for that parent concept of HepC diagnosis. Again, I would avoid concerns with a complicated cohort definition when determining Vocabulary structure. Let the vocabulary just specify the standard ‘ideas’ that we can capture in the CDM. Thinking about the analytical use cases leads to some very messy situations.

-Chris

I should add: I think the above mapping is in error (if it only maps to the Renal Disorder associated with T1DM). I think it’s clear that E10.21 is saying you both have T1DM and have diabetic nephropathy. I feel this is a combo case and should be covered under case #2.

Makes sense, and so we call them “dirty”. But this proposal is for ontologies where there is no equal mapping from a dirty to a clean vocabulary, especially the Case 1 example above, where we lose information with the current mapping conventions. The solution to Case 1 offered above is to copy the concept name of an ICD diagnosis that does not have an equal SNOMED concept and create a SNOMED extension concept with that name. Both are fine; I thought retaining the unmapped ICD concept, making it standard, and mapping SNOMED as its hierarchy would be easier.

Thanks @Chris_Knoll for a great review. It’s a pleasure when somebody shares your thoughts.
I want to clarify:
Case 2. You say

and

Does that mean we need to handle this case as a SNOMED Extension (while Christian says we shouldn’t)?

Case 3.
Probably we can change the SNOMED hierarchy: SNOMED has this “Due to” relationship, so we can use it as a hierarchical relationship. But we need to be aware of the ‘Cirrhosis from HepC’ case, so we need to check.

@Christian_Reich,
regarding Case 2,

what about a situation like this:
S82.225K Nondisplaced transverse fracture of shaft of left tibia, subsequent encounter for closed fracture with nonunion Maps to 28012007 Closed fracture of shaft of tibia 436252
S82.225K Nondisplaced transverse fracture of shaft of left tibia, subsequent encounter for closed fracture with nonunion Maps to 302941001 Nonunion of fracture 73574

If the person had multiple fractures and they were revisited within the same visit, we never know which fracture had Nonunion.

The diabetes case is fine, if the users or Atlas modify their queries.
But if we are already implementing a SNOMED Extension, it would be much easier to do this at the vocabulary level rather than the CDM level.

Here is another good example where a SNOMED Extension is needed:
ICD9CM concepts
678.0 Fetal hematologic conditions
678.00 Fetal hematologic conditions, unspecified as to episode of care or not applicable
678.01 Fetal hematologic conditions, delivered, with or without mention of antepartum condition
678.03 Fetal hematologic conditions, antepartum condition or complication

These are all currently mapped to SNOMED’s “Fetal anemia”. However, American Haemotology Society states that the term hematologic conditions includes “fetal anemia, fetal bleeding disorders …, fetal blood clots, and fetal blood cancers”, all four of which are present in SNOMED in one form or another, but do not have a common parent that we can reasonably use by itself or in conjunction with 70591005 “Fetal disorder”.

So we can create a SNOMED Extension concept “Fetal hematologic conditions” that will have the existing SNOMED concepts for “fetal anemia”, “fetal bleeding disorder”, “fetal blood clots” and “fetal blood cancers” as child concepts,
and “fetal disorders” will be its ancestor concept.

@Christian_Reich, thus SNOMED Extension solves the chronic problem with “AND”, “OR” concepts.

Another idea that can help with hierarchy mismatch, i.e. Case 3,
is to make the “Has due to” and “Finding asso with” SNOMED relationship_ids hierarchical.
SNOMED itself doesn’t build hierarchy from these pairs, but the pairs look hierarchical:

“Diabetic oculopathy” Has due to “Diabetes mellitus”

Sometimes it doesn’t work, though:

there are consequences of a Burn, while the Burn itself could have happened long before.

Well, we may define a list of chronic conditions (diabetes, congenital states, AIDS, etc.) and build such hierarchical relationships for them only.

The decision is kinda questionable: modifying the official vocabulary based on a manually filtered list of concepts.
@Christian_Reich, looks like we need to request those changes in SNOMED first.

Hi, @Dymshyts, this is excellent research. I was going to comment on the problems with concepts that identify a diagnosis but also include some causal event that the condition is attributed to. In the prior example you gave, where you have a diagnosis of X as a result of Y (i.e. Nondisplaced transverse fracture of shaft of left tibia, subsequent encounter for closed fracture with nonunion Maps to 28012007 Closed fracture of shaft of tibia), the event being identified is X (the nondisplaced transverse fracture), and it’s from a prior closed fracture with nonunion (I’m assuming that’s what ‘subsequent encounter for…’ means).

In this case, I think it’s an error to have a mapping to both the fracture of tibia and a closed fracture, because the prior closed fracture happened in the past. Just record that there’s a fracture of tibia.

However, if we can create our own hierarchy inside a SNOMED extension, I’d say that ‘Nondisplaced transverse fracture of shaft of left tibia, subsequent encounter for closed fracture with nonunion’ falls somewhere below ‘fracture of tibia’ (something like ‘fracture of tibia’ -> ‘fracture of shaft of left tibia’), but you could also find the concept under ‘fractures resulting from closed fracture’ -> ‘fracture of shaft of left tibia, subsequent encounter for closed fracture’, etc.

But would this explode the concept hierarchy with all these possible multi-parent relationships and all of these X due to Y concepts? I do feel strongly that X due to Y shouldn’t result in both a record of X and a record of Y on the same day, simply because we don’t have the information that Y occurred simultaneously. I would expect that somewhere else in the medical history you’d see Y as the primary event.

The main use case for putting these types of X because of Y in a hierarchy is that you’d commonly use the vocabulary to find things like ‘skin blisters’ but not ‘skin blisters from burns’, so you’d make a concept set expression saying ‘give me skin blisters and descendants, excluding skin blisters from burns’.
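
A rough Python sketch of that kind of concept set expression, assuming a toy hierarchy. The child concepts ‘friction blister’ and ‘skin blisters from second-degree burn’ are made up for illustration, and `resolve` is a hypothetical helper, not the Atlas implementation.

```python
# Toy child links (parent -> children); names are illustrative only.
CHILDREN = {
    "skin blisters": ["skin blisters from burns", "friction blister"],
    "skin blisters from burns": ["skin blisters from second-degree burn"],
}


def with_descendants(concept):
    """Return the concept plus everything below it in the hierarchy."""
    found, stack = set(), [concept]
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(CHILDREN.get(node, []))
    return found


def resolve(include, exclude):
    """Resolve a concept set: included branches minus excluded branches."""
    included = set().union(*(with_descendants(c) for c in include))
    excluded = set().union(*(with_descendants(c) for c in exclude))
    return sorted(included - excluded)


print(resolve(["skin blisters"], ["skin blisters from burns"]))
```

The exclusion only works because ‘skin blisters from burns’ sits below ‘skin blisters’ in the hierarchy; without that link there would be nothing to subtract.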

-Chris

Well, this is still a fracture, even if the trauma happened a long time ago.

The same approach applies to another group of concepts, “with delayed healing”, like this one:
“Nondisplaced transverse fracture of shaft of left tibia, subsequent encounter for closed fracture with delayed healing”

I like the idea of hierarchy - it’s actually what I’m suggesting.

Another example where we might need SNOMED Extension
ICD9CM concept:
E871.4 Foreign object left in body during endoscopic examination
It is now mapped in this way (we are doing a manual review of mappings now):
Maps to 74402000 Foreign body accidentally left during a procedure (domain_id = ‘Condition’)
Maps to value 363071007 Diagnostic endoscopy

This will not work, because we don’t have a value_as_concept_id column in the condition_occurrence table, so during CDM conversion we lose the information about 363071007 Diagnostic endoscopy. For now we keep this Maps to value only as a vocabulary reference.
Of course, instead of Maps to value we could add a Maps to a history of Endoscopy, but then we lose the connection between Diagnostic endoscopy and Foreign body.

Thus we need to make a SNOMED Extension concept “Foreign object left in body during endoscopic examination”
that will have at least two relationships:
Is a “Foreign body accidentally left during a procedure”
Has Due to “Endoscopic examination”.

This is a case where it’s a combo: the foreign body was left at the same time as the exam, so 2 records can be produced ‘foreign body left in body’ and the ‘diagnostic endoscopy’. I’d argue that the foreign body was left at the time of the diagnostic endoscopy, and not when it was found (which is probably the date that you’ll get when you see the E871.4 code).

Making a hierarchy solves this problem that could otherwise be solved by creating fact-relationships that relates the foreign object left in body with the diagnostic endoscopy. Is there any concern with creating a hierarchy of all possible causes, or would this hierarchy be curated by the prevalence of these linked-concepts we find in real world evidence?

Who knows, maybe they left something and found out only the next day.

Hm. We can define a rule: if the concept has several Maps to records, a fact_relationship is created.
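
A minimal Python sketch of such a rule, under stated assumptions: the `etl` function, the row layout and the ID numbering are hypothetical, and a real fact_relationship row also carries domain and relationship concept IDs, which are left out here.

```python
from itertools import combinations

# "Maps to" targets per source code, taken from the examples in this thread.
MAPS_TO = {
    "E10.321": ["138881000119106", "312912001"],
    "M99.03": ["203708004"],
}


def etl(person_id, source_code, next_id):
    """Create one condition row per target and link rows from the same code."""
    rows = [
        {"condition_occurrence_id": next_id + i, "person_id": person_id, "code": target}
        for i, target in enumerate(MAPS_TO[source_code])
    ]
    # Hypothetical rule: several "Maps to" targets produce pairwise links
    # (simplified stand-ins for fact_relationship rows).
    links = [
        (a["condition_occurrence_id"], b["condition_occurrence_id"])
        for a, b in combinations(rows, 2)
    ]
    return rows, links


rows, links = etl(person_id=1, source_code="E10.321", next_id=100)
print(links)
```

A source code with a single Maps to target, such as M99.03, yields no links, so the rule only adds rows for combo codes.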

Thanks @Dymshyts for starting this discussion. I’m totally with you on having a SNOMED Extension. Otherwise, you’ll miss a significant number of records (ultimately patients) in the cohort and analysis if standard concepts (SNOMED) are the starting point for preparing code lists. I’m seeing the effect in a few of our studies.
