We are not under any obligation to align OMOP Vocab domains to SNOMED’s hierarchy tags. Tags are important to provide reference back to the source SNOMED structure and context, but there is no strict rule that would guarantee that everything SNOMED tags “disorder” or “clinical finding” should correspond to “Condition” and “Observation” in our Vocabs. At the very least, SNOMED tags are hierarchical: if something is a disorder, it is also always guaranteed to be a clinical finding; in OMOP, Conditions and Observations are semantic siblings.
The problem is internal: there is no single methodology for domain classification. The Book of OHDSI claims that domain assignment should be heuristic, and even refers back to the vocabularies code base:
Domain assignments are an OMOP-specific feature done during vocabulary ingestion using a heuristic laid out in Pallas.
IMHO, leading users to look for answers in code rather than in design documentation is bad design. It creates potential for circular reasoning: scars are Observations because the Book of OHDSI says to look in the code for answers, and the code made them Observations.
Of course, domain assignment is a very difficult task. The span of things humans do, interact with and conceptualize in clinical practice is immense, and there will never be a concise and logical algorithm to categorize all concepts. Sure, most times a diagnosis is made, it is unambiguously a Condition, and when a substance is injected, it is likely a Drug. But there will always be ambiguity around symptoms and syndromes vs. diseases, biological vs. pharmaceutical drugs, measurements vs. observations, and many other edge cases. Nevertheless, having an at least somewhat formalized Domain definition page on the Vocabulary wiki would lay down important groundwork, and there are precedent systems that enable pinpoint documentation of specific edge cases.
As the nearest example, SNOMED has short template pages defining how new concepts in particular domains should be modelled; we could have the same for domain assignment in edge cases. Then any time there is an argument about scar tissue formations being a Condition or an Observation, or about IV contrast being a Drug or a Device, we would have a THEMIS-like convention for reference. There is nothing wrong with subjective reasoning (we are operating with made-up high-level abstractions over physical processes and ephemeral emergent properties of biological systems), nor with justifying decisions by precedent. But as the circle of Vocabulary authors grows, consistency requires a documentation system to support domain assignment decisions. That system cannot be the codebase, which operates on mutable data and is changed arbitrarily, nor a search through GitHub issues or this forum.
And thanks to SNOMED’s multiaxial hierarchy, these conventions would also be remarkably easy to implement for most domains: a concept can have many ancestors, so the OMOP SNOMED representation can carry an arbitrary number of rule definitions anchored at different points of the hierarchy.
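To make that less abstract, here is a minimal sketch of how documented conventions could be applied as ancestor-based rules. Everything in it is hypothetical: the toy hierarchy is not actual SNOMED content, `DomainRule`, `assign_domain` and the `DOC-00x` convention ids are made-up names, the domains chosen for each rule are only illustrative, and "highest priority wins" is just one possible way to resolve overlapping rules. It is not how the current vocabulary code works.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DomainRule:
    ancestor: str    # concept at which the rule is anchored
    domain: str      # OMOP domain assigned to its descendants
    priority: int    # higher number wins when several rules match
    convention: str  # id of the documented convention (hypothetical)

def ancestors(concept, parents):
    """Collect all ancestors of a concept in a multi-parent is-a graph."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def assign_domain(concept, parents, rules, default="Unassigned"):
    """Return (domain, convention id) from the highest-priority matching rule."""
    reachable = ancestors(concept, parents) | {concept}
    matching = [rule for rule in rules if rule.ancestor in reachable]
    if not matching:
        return default, None
    best = max(matching, key=lambda rule: rule.priority)
    return best.domain, best.convention

# Toy hierarchy for illustration only: child -> set of parents.
PARENTS = {
    "Scar of skin": {"Scar", "Disorder of skin"},
    "Scar": {"Clinical finding"},
    "Disorder of skin": {"Disorder"},
    "Disorder": {"Clinical finding"},
    "Pneumonia": {"Disorder"},
}

# Each rule points back to the documented convention that justifies it.
RULES = [
    DomainRule("Clinical finding", "Observation", 1, "DOC-001"),
    DomainRule("Disorder", "Condition", 2, "DOC-002"),
    DomainRule("Scar", "Observation", 3, "DOC-003"),
]

print(assign_domain("Scar of skin", PARENTS, RULES))
print(assign_domain("Pneumonia", PARENTS, RULES))
```

The point of the sketch is that each assignment comes back with the id of the convention that produced it, so a disputed edge case can be traced to a documented decision instead of to the code itself.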
Tagging @MPhilofsky as this looks very much in line with the ongoing April Olympiad.