There’s a lot about OHDSI vocabularies I don’t understand yet, so it’s hard for me to know if any given reservation or question about my proposal is because 1) I haven’t explained myself clearly enough, 2) I’ve misunderstood something and need to tweak the proposal, or 3) I’ve misunderstood something big and some basic aspect of the proposal is impossible or infeasible.
I’ve just had an offline conversation with @Christian_Reich and it’s still not clear to me whether I’m not getting the real complexities and need for the current structures or he’s not getting how my proposal could cut through a lot of those.
First, regarding @hripcsa’s concern: @Christian_Reich confirmed that most of the intra-vocabulary relationship table records are derived directly from the vocabularies, so it might be possible to include intra-vocabulary relationships from non-standard vocabularies without a huge burden.
So, the biggest and most dubious implication of my proposal is that the concept_ancestor table would not be used in the navigation UI and would become unnecessary. @Christian_Reich says we can’t leave users to navigate the single-step relationships in the relationship table because they’ll never be able to find their way around; the concept_ancestor table is necessary in order to navigate the real, cross-vocabulary relationships that are meaningful to users; only people intimately familiar with the vocabularies would be able to make sense of a navigation topology consisting only of direct parent-child or sibling relationships.
(TL;DR – you can skip reading the indented section here if time is short; the more important points are below.)
He brought up some particularly troublesome cases, like (I think) VA Product to ATC Class, which would require three lateral jumps, making it basically an unnavigable path for a non-informaticist user.
We also talked about the path from SPL to RxNorm Drug Product where the downward path through ingredient can lead to many Drug Products that are not actually related to the SPL, so the two-step link supplied in concept_ancestor is necessary.
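To make the single-step versus inferred distinction concrete, here is a rough sketch of how rows shaped like concept_ancestor (ancestor, descendant, minimum levels of separation) could be flattened out of direct parent links. The concept names and edges are invented placeholders, and this is only an assumption about the general idea, not a description of how the OHDSI vocabulary team actually builds the table.

```python
from collections import deque

# Toy single-step "is a" edges (child -> parents). All names are invented
# placeholders, not real OHDSI concepts or relationships.
PARENTS = {
    "product_x": ["ingredient_x"],
    "ingredient_x": ["class_low"],
    "class_low": ["class_high"],
    "class_high": [],
}


def ancestors_with_levels(start, parents):
    """Breadth-first walk upward; returns {ancestor: minimum hops from start}."""
    levels = {}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in levels:  # first visit in BFS is the shortest path
                levels[parent] = depth + 1
                queue.append((parent, depth + 1))
    return levels


def build_ancestor_rows(parents):
    """Flatten every (ancestor, descendant, min_levels) pair into sorted rows."""
    rows = []
    for concept in parents:
        for ancestor, hops in ancestors_with_levels(concept, parents).items():
            rows.append((ancestor, concept, hops))
    return sorted(rows)


if __name__ == "__main__":
    for row in build_ancestor_rows(PARENTS):
        print(row)
```

In this toy graph a user clicking through direct links needs three steps to get from `product_x` to `class_high`, while the flattened rows expose that pair directly. The flattened rows no longer say which intermediate concepts the path went through.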
My contentions, I realize, will be a hard sell to people who have much more experience with all this than I do and who have solved many of the inter-vocabulary navigation problems using the inferred relationships generated for the concept_ancestor table – but I will lay them out.
I don’t know if this captures the issue @Christian_Reich brought up with VA Product, but I’ll use the example of VA Product EMPAGLIFLOZIN 10MG TAB: http://www.ohdsi.org/web/atlas/#/concept/45777555. My first objection to the current setup with this example is that there is no information on the Hierarchy tab because this is not a standard concept. The results on the Related Concepts tab, though, strike me as particularly weird and not all that useful:
The first item, ORAL HYPOGLYCEMIC AGENTS,ORAL is a VA Class, clearly an ancestor of some sort, but there’s no indication here of what the path is between the drug we’re looking at and this class.
All of the other items have 0 for Record Count and Descendant Record Count. Why is that? Is this a drug that doesn’t appear in the data? Why are there no ATC classes on this list? Why are these particular concepts showing up on the Related Concepts list and not a bunch of others that could be related in one way or another?
The only path available to me from here that seems like it would lead to actual records in the data is going up to ORAL HYPOGLYCEMIC AGENTS,ORAL, but that ends up confusing me more. Now I see the first page of 13,605 entries. Whatever the 4,456 DRC was referring to seems to have nothing to do with this higher-level page… Ok, I’m going to stop with the blow-by-blow as I stumble around trying to find a path from a VA Product to the most relevant ATC class and to the most relevant standard concepts actually tied to records in the database.
What I’d like to see is:
- Any actual VA drug hierarchy and how its levels are tied to the concept in question. (This might not be currently possible, though, because VA Product and VA Class are not just different classes, they’re different vocabularies–which makes me think (especially after searching around the web and finding this) that the VA stuff in OMOP is not from a coherent vocabulary but was maybe cobbled together from references in FDB or something.)
- Where this concept ties to concepts in other vocabularies and the local neighborhoods–insular to each vocabulary–of those concepts.
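As a rough illustration of the two items above, here is a minimal sketch that splits one concept's direct relationships into its own-vocabulary neighborhood and its links out to other vocabularies. The field names `vocabulary_id` and `relationship_id` follow the OMOP concept and concept_relationship tables, but every ID, vocabulary name, and relationship name below is made up, and `split_neighborhood` is a hypothetical helper, not anything in Atlas.

```python
# Toy concepts keyed by an invented concept_id.
CONCEPTS = {
    1: {"name": "toy product", "vocabulary_id": "VocabA"},
    2: {"name": "toy parent class", "vocabulary_id": "VocabA"},
    3: {"name": "toy sibling product", "vocabulary_id": "VocabA"},
    4: {"name": "toy standard drug", "vocabulary_id": "VocabB"},
    5: {"name": "toy drug class", "vocabulary_id": "VocabC"},
}

# Toy single-step relationships: (concept_id_1, relationship_id, concept_id_2).
RELATIONSHIPS = [
    (1, "is a", 2),
    (1, "sibling of", 3),
    (1, "maps to", 4),
    (4, "is a", 5),
]


def split_neighborhood(concept_id, concepts, relationships):
    """Return (local, cross) direct links for one concept.

    local: list of (relationship_id, concept_id) within the same vocabulary.
    cross: dict of vocabulary_id -> list of (relationship_id, concept_id).
    """
    home_vocab = concepts[concept_id]["vocabulary_id"]
    local = []
    cross = {}
    for source_id, relationship_id, target_id in relationships:
        if source_id != concept_id:
            continue  # only direct, outgoing links of the concept in question
        target_vocab = concepts[target_id]["vocabulary_id"]
        if target_vocab == home_vocab:
            local.append((relationship_id, target_id))
        else:
            cross.setdefault(target_vocab, []).append((relationship_id, target_id))
    return local, cross


if __name__ == "__main__":
    local_links, cross_links = split_neighborhood(1, CONCEPTS, RELATIONSHIPS)
    print("same vocabulary:", local_links)
    print("other vocabularies:", cross_links)
```

The same function could then be called on each cross-vocabulary target (concept 4 here) to show that concept's own local neighborhood, which is the kind of vocabulary-by-vocabulary view I have in mind.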
I’ve started doing some thinking about @Christian_Reich’s proposal for “an overview topological view where we are (like a dot on the US map)”, and think it can maybe be done in a good way–the VA concept in question and the other-vocabulary concepts it links to could all be highlighted on this topological map so the user would have a better sense of what links are worth following in order to get to a desired neighborhood.
Anyway, I know I have a lot more convincing to do before anyone will believe me (and I don’t know yet if I’m right), but my conjecture is that the inferred relationships in the concept_ancestor table make it harder, not easier, to build a clear, intuitive vocabulary navigation UI and that the effort to discourage use of poor vocabularies by only including their relationships to standard concepts sacrifices information important to users. If a user is looking at concepts in some bad vocabulary (e.g., ICD9), I suspect:
- They have source data using that vocabulary.
- That vocabulary may be bad, but it probably has some internal logic.
- The user may have a better understanding of that vocabulary than of standard vocabularies.
To be clearer about the implications of what I’m proposing, I think it would involve adopting rules like:
- Represent the topology of each source vocabulary as accurately and completely as possible in the relationship table.
- Mappings from source concepts to standard concepts should, as much as possible:
  a. use external resources like UMLS or FDB
  b. add or maintain OHDSI-originated mappings only where necessary
  c. map only to synonymous or sibling concepts except when cross-granularity or cross-semantic mappings are the only way to capture important information
- (Probably) don’t include links from one non-standard concept to another in a different vocabulary unless that’s the only way to establish a path from that concept to a standard concept.
In the case of SPL to RxNorm Ingredient and SPL to RxNorm Drug Product, both should be captured in the relationship table. An SPL represents both a set of ingredients and a set of products; but it doesn’t represent all products that contain that set of ingredients. So, from the point of view of RxNorm, product and ingredients can have a child-parent relationship, but RxNorm product is not a grandchild of SPL.
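Here is a small sketch of why both direct links matter in that case. The SPL, ingredient, and product identifiers are invented, and the three dictionaries are hypothetical stand-ins for rows in the relationship table. Walking from the SPL up to its ingredients and back down to every product containing them overshoots, while the direct SPL-to-product link does not.

```python
# Toy direct links; every identifier here is an invented placeholder.
SPL_TO_INGREDIENT = {"spl_1": {"ing_a"}}
SPL_TO_PRODUCT = {"spl_1": {"prod_a_10mg"}}
PRODUCT_TO_INGREDIENT = {
    "prod_a_10mg": {"ing_a"},
    "prod_a_25mg": {"ing_a"},
    "prod_ab_combo": {"ing_a", "ing_b"},
}


def products_direct(spl_id):
    """Products reached through the direct SPL -> product link only."""
    return set(SPL_TO_PRODUCT.get(spl_id, set()))


def products_via_ingredient(spl_id):
    """Products reached by going SPL -> ingredient, then ingredient -> product."""
    ingredients = SPL_TO_INGREDIENT.get(spl_id, set())
    return {
        product
        for product, product_ingredients in PRODUCT_TO_INGREDIENT.items()
        if product_ingredients & ingredients  # shares at least one ingredient
    }


if __name__ == "__main__":
    direct = products_direct("spl_1")
    inferred = products_via_ingredient("spl_1")
    print("direct:", sorted(direct))
    print("via ingredient:", sorted(inferred))
    print("not actually tied to the SPL:", sorted(inferred - direct))
```

With the direct link stored, there is no need to treat the product as a grandchild of the SPL in order to find it.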
I’m sure that’s more than enough for now. Sorry to go on for so long.