
Vocabulary hierarchy exploration via Atlas

Hi,

This topic could go in a few forum categories but I’m placing it under Vocabulary Users since I think it is the most applicable.

When viewing a concept in Atlas (for example, metformin: http://www.ohdsi.org/web/atlas/#/concept/1503297), you have the ability to select the “hierarchy” tab which will then show parents and children of the selected concept. Currently, this is done through configuration code in Atlas so that the hierarchy is only produced for those vocabularies and classes that are explicitly configured.

I had received a request (https://github.com/OHDSI/Atlas/issues/215) to add in the parent/child relationships for ATC categories. In the course of having this change reviewed, we decided that this was something that should be discussed in this forum to get feedback on the approach in general.

The “hierarchy” tab is attempting to satisfy two use cases, which I would like to separate.

The first use case is to show first-order ancestor (parent) and first-order descendant (child) concepts as modeled in the vocabulary. For standard concepts, these relationships are modeled in the vocabulary table concept_ancestor. When these ancestral relationships are exposed through Atlas, they are labeled “Has ancestor of” and “Has descendant of” respectively, and you can explore them via the “Related Concepts” tab on a selected concept. Since these relationships are already modeled in the vocabulary, I’d propose that the “hierarchy” tab for all concepts show the first-order (min_levels_of_separation == 1) ancestors and descendants. From the Atlas side, this would simplify the approach that is currently implemented and standardize it across all concepts. Are there any problems or concerns with this approach?

Other thoughts on this use case and how it might manifest in Atlas: for non-standard concepts, this section would be empty, so we might consider hiding it altogether. Another alternative: we could provide the parents/children as an additional filter section on the “Related Concepts” screen and potentially remove the hierarchy tab.
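A minimal sketch of the first use case, using Python’s built-in sqlite3 module with a few made-up rows so that it runs standalone. The table and column names follow concept and concept_ancestor as described above; the concept IDs, concept names, and the function name first_order_hierarchy are hypothetical, and this is not how Atlas or WebAPI currently builds the tab.

```python
import sqlite3


def first_order_hierarchy(conn, concept_id):
    """Return (parents, children) exactly one level away in concept_ancestor."""
    parents = conn.execute(
        """
        SELECT c.concept_id, c.concept_name, c.vocabulary_id
        FROM concept_ancestor ca
        JOIN concept c ON c.concept_id = ca.ancestor_concept_id
        WHERE ca.descendant_concept_id = ?
          AND ca.min_levels_of_separation = 1
        ORDER BY c.concept_id
        """,
        (concept_id,),
    ).fetchall()
    children = conn.execute(
        """
        SELECT c.concept_id, c.concept_name, c.vocabulary_id
        FROM concept_ancestor ca
        JOIN concept c ON c.concept_id = ca.descendant_concept_id
        WHERE ca.ancestor_concept_id = ?
          AND ca.min_levels_of_separation = 1
        ORDER BY c.concept_id
        """,
        (concept_id,),
    ).fetchall()
    return parents, children


# Toy data (hypothetical IDs and names), just enough to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY, concept_name TEXT, vocabulary_id TEXT);
    CREATE TABLE concept_ancestor (
        ancestor_concept_id INTEGER, descendant_concept_id INTEGER,
        min_levels_of_separation INTEGER, max_levels_of_separation INTEGER);
    INSERT INTO concept VALUES
        (1, 'Toy drug class', 'ATC'),
        (2, 'Toy ingredient', 'RxNorm'),
        (3, 'Toy clinical drug', 'RxNorm');
    INSERT INTO concept_ancestor VALUES
        (1, 1, 0, 0), (2, 2, 0, 0), (3, 3, 0, 0),
        (1, 2, 1, 1), (2, 3, 1, 1),
        (1, 3, 2, 2);
    """
)
parents, children = first_order_hierarchy(conn, 2)
print(parents)
print(children)
```

The self-referencing rows (distance 0) and the grandparent row (distance 2) are both excluded by the min_levels_of_separation = 1 condition, so only the immediate neighbors come back.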

The second use case builds on some of the discussion from this thread: Lack of ICD9 Hierarchy. That is: at times, users may want to explore the “insular” hierarchy for a given concept. For example, if I were exploring the ICD9 code 410.21 (http://www.ohdsi.org/web/atlas/#/concept/44819700), it would be useful to see the relationships within ICD9 for that concept. That being:

• 410 - Acute myocardial infarction
	○ 410.2  Acute myocardial infarction, of inferolateral wall
		§ 410.21 Acute myocardial infarction of inferolateral wall, initial episode of care

Again, there is support for that use case in our shop since some feel more comfortable initially exploring these concepts in a source vocabulary before moving over to standard concepts like SNOMED. The good news is that Atlas already handles this use case in its current configuration approach by leveraging the entries in the concept_relationship table to model an “insular hierarchy”. The bad news is that the configuration is inside the bowels of Atlas instead of in the vocabulary itself. How do others feel about this use case? Could these be considered for their own concept_relationship entries to provide the “insular hierarchy”? If we were to model this in Atlas, I would want to make sure we clearly delineate between the insular hierarchy and the standard hierarchy, since the former would be for exploring source codes while the latter should be used for creating analytics.

Appreciate the community’s thoughts on this. And of course, please do correct anything that I may have misrepresented here :slight_smile:

@anthonysena

Have you detected any issues with using CONCEPT_ANCESTOR and min_levels_of_separation = 1? There might still be some deviations in the drug classes, or in other cases where the hierarchy switches vocabulary.

You could use the ‘Is a’ (or ‘Subsumes’) relationship for that, and it will just work:

-- List the valid 'Is a' relationships of the concept with code 410.21
select c1.concept_id as c1_id, c1.concept_name as c1_name, c1.vocabulary_id as c1_vocab,
  c1.domain_id as c1_domain, c1.concept_class_id as c1_class, c1.concept_code as c1_code,
  r.relationship_id as rel, r.invalid_reason as r_ir,
  c2.concept_id as c2_id, c2.concept_name as c2_name, c2.vocabulary_id as c2_vocab,
  c2.domain_id as c2_domain, c2.concept_class_id as c2_class, c2.concept_code as c2_code
from concept c1
join concept_relationship r on r.concept_id_1 = c1.concept_id and r.invalid_reason is null
join concept c2 on c2.concept_id = r.concept_id_2
where c1.concept_code = '410.21'
  and r.relationship_id = 'Is a'

Only problem here: These records don’t only give you the first-generation relationships, but the entire family (kind of mimicking the CONCEPT_ANCESTOR, though that one utilizes more than just ‘Is a’). We could fix that, if folks don’t object.
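One possible way to get only the first generation out of such records, sketched here with Python’s sqlite3 and toy rows: keep an ‘Is a’ parent only if it is not also reachable through another ‘Is a’ parent of the same concept. The concept IDs 1 to 3 are hypothetical stand-ins, the rows merely mimic the “entire family” situation described above, and this is not necessarily the fix that would be made in the vocabulary itself.

```python
import sqlite3

DIRECT_PARENTS_SQL = """
SELECT c.concept_code
FROM concept_relationship p
JOIN concept c ON c.concept_id = p.concept_id_2
WHERE p.concept_id_1 = :id
  AND p.relationship_id = 'Is a'
  AND p.invalid_reason IS NULL
  -- Drop any parent that is also reachable through another 'Is a' parent.
  AND NOT EXISTS (
      SELECT 1
      FROM concept_relationship a
      JOIN concept_relationship b ON b.concept_id_1 = a.concept_id_2
      WHERE a.concept_id_1 = :id
        AND a.relationship_id = 'Is a' AND a.invalid_reason IS NULL
        AND b.relationship_id = 'Is a' AND b.invalid_reason IS NULL
        AND b.concept_id_2 = p.concept_id_2
        AND a.concept_id_2 <> p.concept_id_2
  )
ORDER BY c.concept_code
"""


def direct_parents(conn, concept_id):
    """Return the concept codes of the immediate 'Is a' parents only."""
    rows = conn.execute(DIRECT_PARENTS_SQL, {"id": concept_id})
    return [row[0] for row in rows]


# Toy data: hypothetical concept IDs, ICD9 codes taken from the example above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_code TEXT);
    CREATE TABLE concept_relationship (
        concept_id_1 INTEGER, concept_id_2 INTEGER,
        relationship_id TEXT, invalid_reason TEXT);
    INSERT INTO concept VALUES (1, '410'), (2, '410.2'), (3, '410.21');
    INSERT INTO concept_relationship VALUES
        (3, 2, 'Is a', NULL),
        (3, 1, 'Is a', NULL),
        (2, 1, 'Is a', NULL);
    """
)
print(direct_parents(conn, 3))
```

In the toy data, 410.21 has ‘Is a’ rows to both 410.2 and 410, but 410 is dropped because it is also a parent of 410.2.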

@Christian_Reich: I have not personally detected issues with using CONCEPT_ANCESTOR and min_levels_of_separation = 1. I will test this out on my side to see if I find anything problematic. Furthermore, I think using the ‘Is a’ and ‘Subsumes’ relationships would work well for navigating within an insular vocabulary. Here is the query that is used in WebAPI to return related concepts:

DECLARE @id int
SET @id = 44819698;

select distinct * from (
    select c.CONCEPT_ID, CONCEPT_NAME, ISNULL(STANDARD_CONCEPT,'N') STANDARD_CONCEPT, ISNULL(c.INVALID_REASON,'V') INVALID_REASON, CONCEPT_CODE, CONCEPT_CLASS_ID, DOMAIN_ID, c.VOCABULARY_ID, RELATIONSHIP_NAME, 1 RELATIONSHIP_DISTANCE
    from CONCEPT_RELATIONSHIP cr 
    join CONCEPT c on cr.CONCEPT_ID_2 = c.CONCEPT_ID 
    join RELATIONSHIP r on cr.RELATIONSHIP_ID = r.RELATIONSHIP_ID 
    where cr.CONCEPT_ID_1 = @id and cr.INVALID_REASON IS NULL
    union 
    select ANCESTOR_CONCEPT_ID, CONCEPT_NAME, ISNULL(STANDARD_CONCEPT,'N') STANDARD_CONCEPT, ISNULL(c.INVALID_REASON,'V') INVALID_REASON, CONCEPT_CODE, CONCEPT_CLASS_ID, DOMAIN_ID, c.VOCABULARY_ID, 'Has ancestor of' , MIN_LEVELS_OF_SEPARATION RELATIONSHIP_DISTANCE  
    from CONCEPT_ANCESTOR ca 
    join CONCEPT c on c.CONCEPT_ID = ca.ANCESTOR_CONCEPT_ID 
    where DESCENDANT_CONCEPT_ID = @id
    and ANCESTOR_CONCEPT_ID <> @id
    union 
    select DESCENDANT_CONCEPT_ID, CONCEPT_NAME, ISNULL(STANDARD_CONCEPT,'N') STANDARD_CONCEPT, ISNULL(c.INVALID_REASON,'V') INVALID_REASON, CONCEPT_CODE, CONCEPT_CLASS_ID, DOMAIN_ID, c.VOCABULARY_ID, 'Has descendant of' , MIN_LEVELS_OF_SEPARATION RELATIONSHIP_DISTANCE  
    from CONCEPT_ANCESTOR ca 
    join CONCEPT c on c.CONCEPT_ID = ca.DESCENDANT_CONCEPT_ID 
    where ANCESTOR_CONCEPT_ID = @id
    and DESCENDANT_CONCEPT_ID <> @id
    union
    select distinct c3.CONCEPT_ID, c3.CONCEPT_NAME,isnull(c3.standard_concept,'N') STANDARD_CONCEPT, ISNULL(c3.INVALID_REASON,'V') INVALID_REASON, c3.CONCEPT_CODE, c3.CONCEPT_CLASS_ID, c3.DOMAIN_ID, c3.VOCABULARY_ID, 'Has relation to descendant of : ' + RELATIONSHIP_NAME RELATIONSHIP_NAME, min_levels_of_separation RELATIONSHIP_DISTANCE
    from (
            select * from concept where concept_id = @id
    ) c1
    join concept_ancestor ca1 on c1.concept_id = ca1.ancestor_concept_id
    join concept_relationship cr1 on ca1.descendant_concept_id = cr1.concept_id_2 and cr1.relationship_id = 'Maps to' and cr1.invalid_reason IS NULL
    join relationship r on r.relationship_id = cr1.relationship_id
    join concept c3 on cr1.concept_id_1 = c3.concept_id
) ALL_RELATED 
order by RELATIONSHIP_DISTANCE ASC

For a selected concept_id (@id in the query above), we basically give you back the kitchen sink in terms of a concept and its relationships (both in terms of concept_ancestor as well as concept_relationship). So, applying a consistent set of criteria for standard & non-standard concepts would be easy enough to do.

I’d propose we use “Has ancestor of” and “Has descendant of” with a distance == 1 as the hierarchy display for standard concepts. I’d also propose we use “Is a” and “Subsumes” with a distance == 1 for non-standard concepts and see how that works for folks. If it gets unruly, we can see how we might fix it as you suggested.
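To make the proposed rule concrete, here is a small Python sketch that filters rows shaped like the output of the WebAPI query above. The function name hierarchy_rows and the sample rows are hypothetical. It assumes that RELATIONSHIP_NAME carries exactly the strings quoted in the proposal, and that only a STANDARD_CONCEPT value of 'S' counts as standard, which would need checking against real data.

```python
STANDARD_RELATIONSHIPS = {"Has ancestor of", "Has descendant of"}
NON_STANDARD_RELATIONSHIPS = {"Is a", "Subsumes"}


def hierarchy_rows(related, standard_concept):
    """Keep only the distance-1 rows that the proposal would show as hierarchy."""
    if standard_concept == "S":
        wanted = STANDARD_RELATIONSHIPS
    else:
        wanted = NON_STANDARD_RELATIONSHIPS
    return [
        row
        for row in related
        if row["RELATIONSHIP_NAME"] in wanted and row["RELATIONSHIP_DISTANCE"] == 1
    ]


# Made-up rows with the two columns the rule needs, plus a name for display.
related = [
    {"CONCEPT_NAME": "Toy parent", "RELATIONSHIP_NAME": "Has ancestor of", "RELATIONSHIP_DISTANCE": 1},
    {"CONCEPT_NAME": "Toy grandparent", "RELATIONSHIP_NAME": "Has ancestor of", "RELATIONSHIP_DISTANCE": 2},
    {"CONCEPT_NAME": "Toy child", "RELATIONSHIP_NAME": "Has descendant of", "RELATIONSHIP_DISTANCE": 1},
    {"CONCEPT_NAME": "Toy source parent", "RELATIONSHIP_NAME": "Is a", "RELATIONSHIP_DISTANCE": 1},
    {"CONCEPT_NAME": "Toy mapped concept", "RELATIONSHIP_NAME": "Maps to", "RELATIONSHIP_DISTANCE": 1},
]
standard_view = [r["CONCEPT_NAME"] for r in hierarchy_rows(related, "S")]
non_standard_view = [r["CONCEPT_NAME"] for r in hierarchy_rows(related, "N")]
print(standard_view)
print(non_standard_view)
```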

I’ve tested the proposed approach a bit and I do notice that the list of parent/children can get quite long. For example, when reviewing the parents for the RxNorm Ingredient metformin (steps: from the concept page, select the “Related Concepts” tab, then select the “Has ancestor of” filter under the Relationship heading and then “1” under the Distance heading), I get the following concept counts by vocabulary:

SPL: 457
NDFRT: 8
ETC: 5
Indication: 4
VA Class: 1
ATC: 1

The result is that on the hierarchy tab, I wind up with a list of 400+ entries, which makes me wish I could filter or sort them in some way. With that in mind, I would also like to see whether people would find it valuable to include these types of filters directly on the related concepts table, or whether they would still like to have the hierarchy display. What I am proposing here is that on the Related Concepts display, we could add an additional set of filters for selecting “parent” or “children” in the case of standard concepts, or “family members” in the case of non-standard concepts. That would automatically select the proper relationship and distance filters in a single click while still allowing you to use the additional filters to focus attention on specific vocabularies, classes, etc. If this makes sense, we could even consider retiring the hierarchy display altogether in favor of this approach.
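A per-vocabulary tally like the one above could presumably also be computed directly against the vocabulary tables, which might help decide where extra filters are needed. The sketch below uses Python’s sqlite3 with made-up rows; the concept IDs and the much smaller counts are toy values, not the real metformin numbers.

```python
import sqlite3

COUNT_PARENTS_SQL = """
SELECT c.vocabulary_id, COUNT(*) AS n
FROM concept_ancestor ca
JOIN concept c ON c.concept_id = ca.ancestor_concept_id
WHERE ca.descendant_concept_id = ?
  AND ca.min_levels_of_separation = 1
GROUP BY c.vocabulary_id
ORDER BY n DESC, c.vocabulary_id
"""

# Toy data: concept 100 stands in for an ingredient with six direct parents.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, vocabulary_id TEXT);
    CREATE TABLE concept_ancestor (
        ancestor_concept_id INTEGER, descendant_concept_id INTEGER,
        min_levels_of_separation INTEGER);
    INSERT INTO concept VALUES
        (100, 'RxNorm'),
        (1, 'SPL'), (2, 'SPL'), (3, 'SPL'),
        (4, 'NDFRT'), (5, 'NDFRT'),
        (6, 'ATC'), (7, 'ATC');
    INSERT INTO concept_ancestor VALUES
        (1, 100, 1), (2, 100, 1), (3, 100, 1),
        (4, 100, 1), (5, 100, 1),
        (6, 100, 1),
        (7, 100, 2);
    """
)
parent_counts = conn.execute(COUNT_PARENTS_SQL, (100,)).fetchall()
print(parent_counts)
```

The second ATC concept sits at distance 2, so it is not counted as a parent.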

I have to admit that I’m not entirely following all the details here, but I’ve been thinking for a while that we need to improve the vocabulary browser. I’ve drawn a little mockup below. I should probably have used a drug as an example (since that’s what I’m working on right now), but this shows an ICD9 code.

It gives a lot less information than the current browser on a given page, but I think arranges it more understandably and gives easy access and a clear path to anywhere you might want to navigate from a single concept.

On the left side is the “insular” hierarchy for the concept’s own vocabulary. There’s a path up to the root (for ICD9 I took the liberty of going up beyond the CDM’s root to the ICD9 categories) above and entries for children below. In this case the vocabulary has only one more level going down. One is probably all that should be shown anyway, but maybe with some indication of how many descendants each child has.
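As a rough sketch of the data such an insular panel would need, the Python/sqlite3 example below computes the path up to the root and the children with their descendant counts. It assumes a hypothetical table parent_of that holds only direct parent-child links within one vocabulary; that table, the concept IDs, and the helper names are made up for illustration, while the ICD9 codes come from the example earlier in this thread.

```python
import sqlite3

PATH_TO_ROOT_SQL = """
WITH RECURSIVE up(concept_id, depth) AS (
    SELECT parent_id, 1 FROM parent_of WHERE child_id = :id
    UNION ALL
    SELECT p.parent_id, up.depth + 1
    FROM parent_of p JOIN up ON p.child_id = up.concept_id
)
SELECT c.concept_code
FROM up JOIN concept c ON c.concept_id = up.concept_id
ORDER BY up.depth DESC
"""

CHILDREN_SQL = """
WITH RECURSIVE down(top_child_id, concept_id) AS (
    SELECT child_id, child_id FROM parent_of WHERE parent_id = :id
    UNION ALL
    SELECT down.top_child_id, p.child_id
    FROM parent_of p JOIN down ON p.parent_id = down.concept_id
)
SELECT c.concept_code, COUNT(*) - 1 AS n_descendants
FROM down JOIN concept c ON c.concept_id = down.top_child_id
GROUP BY down.top_child_id, c.concept_code
ORDER BY c.concept_code
"""


def path_to_root(conn, concept_id):
    """Codes from the root down to the immediate parent of concept_id."""
    return [row[0] for row in conn.execute(PATH_TO_ROOT_SQL, {"id": concept_id})]


def children_with_counts(conn, concept_id):
    """(code, number of descendants) for each direct child of concept_id."""
    return conn.execute(CHILDREN_SQL, {"id": concept_id}).fetchall()


# Toy data: hypothetical IDs; parent_of stores direct links only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_code TEXT);
    CREATE TABLE parent_of (parent_id INTEGER, child_id INTEGER);
    INSERT INTO concept VALUES (1, '410'), (2, '410.2'), (3, '410.21');
    INSERT INTO parent_of VALUES (1, 2), (2, 3);
    """
)
print(path_to_root(conn, 3))
print(children_with_counts(conn, 1))
```

The descendant count relies on the hierarchy being a tree; where a concept can be reached by more than one path, the count would need a DISTINCT over descendants.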

To the right of the insular hierarchy is a set of boxes showing the other vocabularies this concept is linked to. Each of these boxes focuses on the closest related concept and gives some indication of what one might find by navigating up or down in that vocabulary’s hierarchy from that concept. (If the focus concept is directly related to more than a single concept in some other vocabulary, that vocabulary could show up in multiple boxes.)

Whichever boxes contain standard concepts (SNOMED here) would be shaded green and contain record counts; the others would be shaded red.

In this example the concepts only have a single path going up for any given vocabulary, and of course that won’t always be the case, so the paths going up could be somewhat more complex. But since the only hierarchies shown are insular to a specific vocabulary, they shouldn’t be terribly unmanageable. Or, in cases where they are, we could just show a single level upwards with indications from each parent of what you’ll find by continuing to proceed upwards.

What do you think? What would be lost by keeping all hierarchical representation “insular” to specific vocabularies? If it’s not sufficient to show the insular hierarchy plus the bridges to hierarchies in other vocabularies, I think this UI idea could be extended, but I don’t really know what people need in this regard.

Friends:

This is a key problem we need to solve: The ability to effectively navigate this space. I agree with @anthonysena that listing >400 concepts through >40 pages is not useful. This design element is just not working in such cases. And I like @Sigfried_Gold’s direction of a topological graphical display.

The problem is that our design has to deal with several problems:

  • Medically meaningful relationships (like ‘Anatomical site of’) vs navigational/hierarchical relationships (‘Is a’, ‘Equivalent of’)
  • One related concept/parent/child vs a few relateds/parents/children vs many many relateds/parents/children
  • Non-standard vs. standard concepts, where we want to discourage the use of non-standard ones (they are not compatible with the CDMs that are ETLed from other coding schemes, we really need to wean people off those ICD9s).

I can think of a number of ways to tackle this:

  1. If we want a single standard navigator of things, it probably needs to have two views simultaneously:
  • An overview topological view where we are (like a dot on the US map), and
  • A local view (the streets around my house in Cambridge).
  2. We need to have “flexible” design elements depending on the size of the topological neighborhood:
  • If the result set is small (say, up to 5 elements), like 1 ATC, 1 VA Class, 4 Indications, 5 ETC, just list them.
  • If the result set has more than 5 elements, put in a placeholder, like “457 SPL”, “8 NDFRT”. Upon being clicked, this placeholder should provide the ability to drill down by offering a search box or filters.
  3. We may provide a “smart” context-sensitive navigation dependent on what we are showing. So, in the case of Metformin, we know it is a drug ingredient, and for those the screen might offer several specific searches:
  • “Show me drug classes for this ingredient”,
  • “Show me indications for this ingredient”,
  • “Show me source codes for this ingredient”,
  • “Show me drug forms used with this ingredient”,
  • “Show me dosages used with this ingredient”.

3) is probably where we need to go eventually, since nobody will really manage to navigate 1) and 2), with the exception of a few hard-core orienteering experts. But it is a lot more work.
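The “flexible” design element from the second option can be sketched in a few lines of Python. The function name neighborhood_summary and the default threshold of 5 are illustrative assumptions; the counts are the metformin parent counts reported earlier in this thread.

```python
def neighborhood_summary(counts, threshold=5):
    """Split per-vocabulary counts into groups to list and placeholders to drill into."""
    listed, placeholders = [], []
    for vocabulary, n in counts.items():
        if n <= threshold:
            # Small result set: show the concepts themselves.
            listed.append(vocabulary)
        else:
            # Large result set: show a clickable count instead.
            placeholders.append(f"{n} {vocabulary}")
    return listed, placeholders


counts = {"SPL": 457, "NDFRT": 8, "ETC": 5, "Indication": 4, "VA Class": 1, "ATC": 1}
listed, placeholders = neighborhood_summary(counts)
print(listed)
print(placeholders)
```

With these counts, SPL and NDFRT collapse into placeholders while the other four vocabularies are listed outright.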

Looks like we need a design session with a large white board for this.

Definitely a design session would be good. But let me see if I can argue that my proposed design does what you want, or try to understand why it doesn’t.

Most of the complexity we’ve been dealing with, I believe, is due to cross-vocabulary relationships. There are very complex individual vocabularies, like SNOMED, but on their own and from the local perspective of a single concept, I believe they are not that difficult to navigate.

I’m sure your point #3 is right, @Christian_Reich, that it’s possible to design better navigation tools if we allow customization to particular contexts. But the context wouldn’t need to be, e.g., a vocabulary-independent ingredient. We can start from Metformin in a specific vocabulary like RxNorm. Then if we want to navigate to classes we can wander over to ATC, if we want indications we can go to that vocabulary, etc. With the layout I’ve proposed, even if a user didn’t have much familiarity with the different vocabularies, they would get a sense of what was available, at least in other vocabularies with concepts directly linked to the concept under inspection. For some vocabularies, like SNOMED, it would probably be ideal to customize navigation depending on the concept under inspection, but for most, a consistent navigation UI would probably suffice, and, I suspect, a single navigation UI would probably do ok across all the vocabularies if we confine ourselves to insular (intra-vocabulary) hierarchical relationships and direct inter-vocabulary relationships.

To address your list of problems:

  • Medically meaningful relationships (like ‘Anatomical site of’) vs navigational/hierarchical relationships (‘Is a’, ‘Equivalent of’)

With both of these types of relationships, again, I believe that most of the complexity is removed and most of the meaning is retained with intra-vocabulary navigation and hopping across vocabularies to go from source to standard, standard to source, or to follow relationship types not available in the vocabulary under inspection.

  • One related concept/parent/child vs a few relateds/parents/children vs many many relateds/parents/children

Again, I suspect most of the link explosion is cross-vocabulary.

  • Non-standard vs. standard concepts, where we want to discourage the use of non-standard ones (they are not compatible with the CDMs that are ETLed from other coding schemes, we really need to wean people off those ICD9s).

With shading and record counts I think we get the best of both worlds: people navigate where they want, but they clearly see when they are in neighborhoods not attached to patient records and how to move towards neighborhoods that are.

  1. If we want a single standard navigator of things, it probably needs to have two views simultaneously:
    • An overview topological view where we are (like a dot on the US map), and

This would be really great. If anyone has funding to work on it, I hope you’ll think of me :smile:

    • A local view (the streets around my house in Cambridge).

That’s what my proposal addresses, right?

  2. We need to have “flexible” design elements depending on the size of the topological neighborhood:

Totally agree.

So, please let me know if I’m misunderstanding the challenges, or if I need to be clearer about what I’m proposing. Thanks!

@Sigfried_Gold:

Here is the thing: Anthony et al. need a quick fix to the way they display hierarchies in the current version of the vocabulary browser. Let’s keep this debate focussed on that problem, since we don’t want to throw away what we have now like the child with the bath water. I kind of feel bad usurping Anthony’s Forum post, particularly after encouraging him to bring it here from the bug report discussion in Github.

And then we start a new one about potential new designs for a future semantic browser. Can you open a new Forum post and paste what you just wrote, so I and others can respond?
