
Proposed changes in SNOMED domains

This post follows up on the recent discussion of the SNOMED overhaul in the Vocabulary WG last Tuesday; slides, recording, and meeting notes are available in Teams.

Given multiple debates over Condition vs Observation vs Measurement in the community, the Vocabulary Team has proposed to use SNOMED tags to assign domains in these gray areas.

Here’s a snapshot of the presentation.
For example, both Hemoglobin low and Anemia were Conditions. In the proposal, we use tags to assign them different concept_class_id values, so Anemia stays in the Condition domain and Hemoglobin low moves to the Measurement domain (and is post-coordinated).

The attached snomed_domain_delta.xlsx (1.2 MB) is the table with the full delta (old domains and new proposed domains). Please check it out and comment here, or come to the Vocab WG next Tuesday for further discussion.

Tagging some of the folks participating in the conversation on the last call:
@MPhilofsky, @Andy_Kanter, @DaveraG, @hspence. Please share with others! :slight_smile:


@aostropolets Great work!

I came across a topical discovery. A lot of Observations from the ‘Clinical history/examination observable’ SNOMED branch hold the potential to be classified as Measurements. Prior to this, I was particularly focused on the child concept ‘Near visual acuity’ for mapping purposes. Would it be possible for you to check if your automated approach can also effectively cover this hierarchical branch?


Thanks so much!! We will check it out :slight_smile:

So to be clear, my concern was that differentiating between findings and diseases must not break subsumption queries. If clinical staff or a prior ETL assume things to be synonyms, they would expect subsumption queries to return both. This might be a bad example, but take a finding “positive gonorrhea test”: from a phenotype perspective this might mean the patient has gonorrhea. The current parents of this concept are positive test, with no relationship to gonorrhea. If this change doesn’t alter existing hierarchies, then perhaps there is no problem. But if low hemoglobin would previously have been captured when people were looking for anemia, then formally separating the hierarchies could cause problems. @aostropolets, you implied that subsumption won’t change.

Right. This change will not affect SNOMED’s hierarchy, but in some cases you’d need to query more clinical event tables.

This is a good question, actually. On one hand, changing the SNOMED hierarchy would be suicidal. On the other hand, if hierarchies cross domain boundaries, then they violate the idea of a hierarchy, in which descendants inherit all the properties of the parent. The problem we are causing is that our domain is an attribute SNOMED is oblivious of. Now what?

Exactly here. I don’t know what @Andy_Kanter has in mind, but let’s say gonorrhea is the parent of the positive test. Then a query of gonorrhea and all its children will correctly bring back the test. However, the test is in a different domain, so the cohort will not miraculously contain both Condition and Measurement records. In other words, preserving the hierarchy even when it crosses domain boundaries doesn’t help us in practice.
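To make the point concrete, here is a minimal sketch in Python with an in-memory SQLite database. The table and column names follow the OMOP CDM (concept_ancestor, condition_occurrence, measurement), but the concept ids, the person ids, and the parent-child link between gonorrhea and the positive test are all made up for illustration; as noted above, that link does not exist in SNOMED today.

```python
import sqlite3


def build_demo_db():
    """Create a toy CDM fragment. All ids are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE concept_ancestor (
            ancestor_concept_id INTEGER,
            descendant_concept_id INTEGER
        );
        CREATE TABLE condition_occurrence (
            person_id INTEGER,
            condition_concept_id INTEGER
        );
        CREATE TABLE measurement (
            person_id INTEGER,
            measurement_concept_id INTEGER
        );
        -- 1 = gonorrhea (Condition), 2 = positive gonorrhea test (Measurement).
        -- Assumption for this sketch only: 2 is a descendant of 1.
        INSERT INTO concept_ancestor VALUES (1, 1), (1, 2);
        INSERT INTO condition_occurrence VALUES (100, 1);
        INSERT INTO measurement VALUES (200, 2);
        """
    )
    return conn


def persons_condition_only(conn, ancestor_id):
    """Subsumption query that looks in condition_occurrence only."""
    rows = conn.execute(
        """
        SELECT DISTINCT co.person_id
        FROM condition_occurrence co
        JOIN concept_ancestor ca
          ON ca.descendant_concept_id = co.condition_concept_id
        WHERE ca.ancestor_concept_id = ?
        """,
        (ancestor_id,),
    ).fetchall()
    return sorted(r[0] for r in rows)


def persons_any_domain(conn, ancestor_id):
    """Same descendants, but searched in both event tables."""
    rows = conn.execute(
        """
        SELECT co.person_id
        FROM condition_occurrence co
        JOIN concept_ancestor ca
          ON ca.descendant_concept_id = co.condition_concept_id
        WHERE ca.ancestor_concept_id = ?
        UNION
        SELECT m.person_id
        FROM measurement m
        JOIN concept_ancestor ca
          ON ca.descendant_concept_id = m.measurement_concept_id
        WHERE ca.ancestor_concept_id = ?
        """,
        (ancestor_id, ancestor_id),
    ).fetchall()
    return sorted(r[0] for r in rows)
```

With this toy data, `persons_condition_only(build_demo_db(), 1)` returns `[100]`, while `persons_any_domain(build_demo_db(), 1)` returns `[100, 200]`: the hierarchy finds the test concept either way, but the person only shows up if the query also reads the Measurement table.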

I don’t have a good idea. An idea that doesn’t sound so good would be to break the hierarchy at the domain, and stitch the individual pieces together. Urgh.

@Polina_Talapova, hello!

Thank you for your comment.
Indeed, the Observable Entity concept class is a sophisticated mixture of measurements and observations. We have covered the mentioned branch with hierarchical peaks and assigned the Measurement domain where needed. Specifically, Visual acuity and its descendants will be moved to the Measurement domain.


Greetings! Appreciate your excellent work. I’ve reviewed the list of concepts with updated domains and would like to share our findings.
As a new user on this platform, I’m unable to upload files. Could you guide me to a location where I can upload the table with results?

Many thanks for looking at it! :slight_smile: File restrictions are annoying. Could you try sending it in a message? Or, alternatively, to my email, which is ostropolets@ohdsi.org.

Let’s play a game, let’s try to guess the domain without looking in Athena:
SNOMED code | name
39579001 | Anaphylaxis
13791008 | Asthenia
271737000 | Anemia
718938003 | Somatic dysfunction of thoracic region
718936004 | Somatic dysfunction of sacral region
53854005 | Chorioretinal scar
40930008 | Hypothyroidism

Anyway, I’ll go straight to the point: please see the attached table, which has the 20 most frequent concepts in the JnJ network for each combination of concept_class (Clinical Finding or Disorder) and domain_id (Condition or Observation), resulting in 4 columns.
domain distribution Condition vs Observation.xlsx (10.2 KB)

When I heard about using SNOMED tags (reflected in concept_classes in OHDSI vocabularies), I thought that Clinical Finding becomes Observation, and Disorder becomes Condition.

But as far as I understand, there are some historical decisions, such as Pain being a Condition (while a lot of other symptoms are Observations), or scars being Observations (while a scar is a concrete change in the body), and so on, as you can see in the table.

So, my ultimate proposal is to put all these terms under the ‘Condition’ domain.
This will close this (10-year-long?) discussion and give users some peace of mind :)


We should discuss this. Obviously, it is not that simple, as many terms are clearly not conditions (Postoperative state, Patient encounter status etc.), but the damage might be less than trying to do the smart picking. The third alternative is to fix SNOMED, but that is slow and burdensome.

Something for the Vocab Workgroup?


Hahaha. Nice move!

Ok… Let me tell you the correct ones…
Sorry, can’t do that. The concept names are not self-sufficient.


I would say that domain assignment is the trickiest process in SNOMED support. We would happily rely on semantic tags or large hierarchical branches, but we cannot.

The Clinical Finding concept class is the most sophisticated mixture of various semantics (pre-coordinated measurements, conditions, observations). For example, findings of body mass index are definitely not Conditions (but obesity is); I would say they’re rather Measurements. It also contains concepts that represent, e.g., bacterial colony surface appearance, various administrative statuses (Requires vaccination and Item held as scanned document among them), findings of demographic history, and many more concepts that could not be Conditions anyway.

We have separated the Disorders and assume all of them should be Conditions. However, concepts that carry the semantics of allergic reactions should be Observations, as there should be an option to store causative agents as values for them.
Despite the hierarchical branch of scars being assigned the Disorder semantic tag and, respectively, concept class, scarring itself is a natural consequence of wound healing, and it’s not a Condition, it’s an Observation.

We must admit that there’s no simple solution to this problem. We are constantly improving our vocabulary logic to achieve greater precision in the domain assignment, and we’ve done a lot. In my opinion, there’s no need to make revolutionary decisions here.


I have no expertise to speak on this topic, but it’s very interesting:

I think in this case I’d expect (as a user of the vocab) that scarring is a state of a body, so I’d find it in the condition table. There are some diagnoses (like pulmonary fibrosis) that are described as scarring of the lung… Is the scarring in this case a condition because of the context (from healing vs. from an environmental toxin), or because this specific type of scarring has a clinical name (the fibrosis)?

I’ve rarely used Observation period when determining the health status of a person. The observation table is described as:

The OBSERVATION table captures clinical facts about a Person obtained in the context of examination, questioning or a procedure. Any data that cannot be represented by any other domains, such as social and lifestyle facts, medical history, family history, etc. are recorded here.

This leaves the observation table as a sort of ‘catch all’ where some fact doesn’t fit one of the other domains…

condition_occurrence is:

This table contains records of Events of a Person suggesting the presence of a disease or medical condition stated as a diagnosis, a sign, or a symptom, which is either observed by a Provider or reported by the patient.

So if ‘scarring’ is considered a medical condition that is diagnosed, or a symptom (is scarring a sign or symptom of healing?)… I’d put it in the condition table. Just my 2 cents.


Thank you for the comment.
I agree that in many cases reactive formation of connective tissue may indicate a pathologic process and, in OMOP sense, these events should be stored in the Condition table.
Fortunately, SNOMED refers these concepts to the specific organ disorders, e.g. pulmonary fibrosis belongs to the hierarchy of lung disorders, hepatic fibrosis belongs to the hierarchy of liver disorders and so on. At the moment all of them belong to the Condition domain.
On the other hand, the hierarchy of scars includes the concepts that specifically represent the result of wound healing. And these concepts are Observations.

The main idea here is that we strongly need to avoid subjective reasoning.
Scars are a very good example. For me, scars are Conditions. And I can argue with @m-khitrun, and both of us will find valid arguments based on our clinical knowledge.
The problem occurs when I need patients with particular symptoms or disorders having these symptoms. Then I always need to look in both tables.
The proposal is to put everything in Condition, with some exceptions.
If we get false-positive Conditions like Item held as scanned document or findings of demographic history, nobody cares.

The problem will only exist when you need to store values.
In this case I would list all concepts of this kind (allergic reaction to, requires vaccination of, etc. - we can discuss this list), so users can see it instead of guessing.
And use a similar approach with potential measurements, where you have already started postcoordination. I’m not sure what to do with the Obesity example, as it has values as intervals, such as Child body mass index 92nd-97th centile.

So there will be a simple rule:
if it’s a symptom or state, look in the Condition domain
if it looks like postcoordination, check the list of postcoordinated concepts and look in Measurement or Observation.
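
As a rough sketch of how that rule could look to a user, in Python: everything defaults to Condition unless the concept is on a published exception list. The list below is purely hypothetical (the actual list is still to be discussed), and the function name `domain_to_search` is made up for illustration.

```python
# Hypothetical exception list: concepts that need a value and are therefore
# post-coordinated. The real list is not defined yet; these entries only
# echo examples mentioned earlier in the thread.
POSTCOORDINATED = {
    "Allergic reaction to substance": "Observation",
    "Requires vaccination": "Observation",
    "Hemoglobin low": "Measurement",
}


def domain_to_search(concept_name: str) -> str:
    """Return the domain a user should look in under the proposed rule."""
    # Listed exceptions win; everything else is treated as a Condition.
    return POSTCOORDINATED.get(concept_name, "Condition")
```

Under these assumptions, `domain_to_search("Anemia")` gives `"Condition"` and `domain_to_search("Hemoglobin low")` gives `"Measurement"`. The point of the design is that the user consults one explicit list instead of guessing how the domain was picked.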

We are not under any obligation to align OMOP Vocab domains to SNOMED’s hierarchy tags. Tags are important to provide reference back to the source SNOMED structure and context, but there is no strict rule that would guarantee that everything SNOMED tags “disorder” or “clinical finding” should correspond to “Condition” and “Observation” in our Vocabs. At the very least, SNOMED tags are hierarchical: if something is a disorder, it is also always guaranteed to be a clinical finding; in OMOP, Conditions and Observations are semantic siblings.

The problem is internal: there is no single methodology for domain classification. The Book of OHDSI claims that domain assignment should be heuristic, and even refers back to the vocabularies code base:

Domain assignments are an OMOP-specific feature done during vocabulary ingestion using a heuristic laid out in Pallas.

IMHO, leading users to look for answers in code rather than design documentation is bad design. This creates potential for circular reasoning, e.g. our code classifies scars as Observations because the Book of OHDSI says to look in the code for answers, and the code made them Observations.

Of course, domain assignment is a very difficult task. The span of things humans do, interact with and conceptualize in clinical practice is immense, and there never will be a concise and logical algorithm to categorize concepts. Sure, most times a diagnosis is made, it is unambiguously a Condition, and when a substance is injected, it is likely a Drug. But there always will be ambiguity around symptoms and syndromes vs. diseases, biological vs. pharmaceutical drugs, measurements vs. observations, and many other edge cases. Nevertheless, having an at least somewhat formalized Domain definition page on the Vocabulary wiki would lay down important groundwork; and there are precedent systems that enable pin-point documentation of precise edge cases.

As the nearest example, SNOMED has short template pages defining how new concepts in particular domains should be modelled; we could have the same for domain assignment in edge cases. So any time there is an argument about scar tissue formations being a Condition or an Observation, or about IV contrast being a Drug or a Device, we could have a THEMIS-like convention for reference. There is nothing wrong with subjective reasoning (we are operating with made-up high-level abstractions over physical processes and ephemeral emerging properties of biological systems), nor with justifying decisions by precedent. But as the circle of Vocabulary authors grows, a documentation system to support domain assignment decisions is required for consistency (not the codebase, which operates on mutable data and is changed arbitrarily, and not a search through GitHub issues or this forum).

And thanks to SNOMED’s multiaxial hierarchy, these conventions would also be magically easy to implement for most domains, as there can be an arbitrary number of rule definitions in the OMOP SNOMED representation.
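
To illustrate what such rule definitions might look like, here is a small Python sketch in which each documented convention is a "peak" concept with a domain, and a concept takes the domain of its nearest peak ancestor. This is an assumption about one possible implementation, not the Vocabulary Team's actual code; the toy is-a graph, the peak rules, and the tie-breaking (first peak found in breadth-first order) are all illustrative.

```python
from collections import deque

# Toy is-a graph (child -> list of parents). Names are illustrative and
# do not reproduce real SNOMED content.
PARENTS = {
    "Clinical finding": [],
    "Disorder": ["Clinical finding"],
    "Scar": ["Disorder"],
    "Scar of skin": ["Scar"],
    "Lung disorder": ["Disorder"],
    "Pulmonary fibrosis": ["Lung disorder"],
}

# Hypothetical documented conventions: peak concept -> domain.
PEAKS = {
    "Clinical finding": "Observation",
    "Disorder": "Condition",
    "Scar": "Observation",
}


def assign_domain(concept, parents=PARENTS, peaks=PEAKS, default="Observation"):
    """Walk up the hierarchy breadth-first; the nearest peak decides."""
    queue = deque([concept])
    seen = {concept}
    while queue:
        node = queue.popleft()
        if node in peaks:
            return peaks[node]
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    # No peak above this concept: fall back to the assumed default domain.
    return default
```

Here `assign_domain("Scar of skin")` yields `"Observation"` because the Scar peak is closer than Disorder, while `assign_domain("Pulmonary fibrosis")` yields `"Condition"`. Adding or changing a convention is then one entry in `PEAKS`, which could be mirrored one-to-one by a documentation page.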

Tagging @MPhilofsky as this looks very much in line with the ongoing April Olympiad.


I’ve been watching the conversation. And as @Eduard_Korchmar points out, arguments from both sides are convincing. And as @Christian_Reich and @m-khitrun state, there isn’t a simple solution. And I agree this is something for the Vocab WG to discuss. HOWEVER, before a change is made, I believe it is very important to gather feedback from the community. The CPT4 change continues to have lasting effects. Community members are NOT updating their Vocabularies because they are worried it will break their ETL and records will be lost. The OHDSI community is very diverse and large. A holistic approach is necessary. And as the solutions are debated, we need to keep in mind many/most community members are not on the forums or GitHub.

And whatever the decision, it must be well documented:

Yes. I don’t think we need anything more than this: SNOMED’s structuring of conditions vs. observations is probably the tip of the iceberg of logic and reasoning that lends itself nicely to the other design choices around SNOMED. But we’re not SNOMED, we’re the OMOP CDM, so we make our own statements about what Conditions vs. Observations are (and it is documented, see the citations from the CDM documentation above). Overlaying SNOMED reasoning on top of that, where we don’t follow any of the other reasoning that SNOMED has, would be a round peg, square hole situation. One example of that other reasoning is how SNOMED coordinates different concept-atoms into a single clinical concept: if the internal code is “First Consultation of Gynecology”, this would result in 3 SNOMED codes: First (qualifier value) / Consultation (procedure) / Gynecology (qualifier value). I’m just not convinced that the CDM notion of an ‘Observation’ overlays what SNOMED thinks an ‘Observation’ is.