
Merging Heterogeneous Vocabulary Versions


(David Blatt) #1

Good morning,

We recently ran across an issue where we merged two heterogeneous OHDSI instances (inpatient/outpatient) into one database. Due to a previous failure, one source had a stale/lagged Vocabulary. This caused many concept discrepancies in our final product. A reload using a consistent vocabulary can fix the problem. We also thought the relationship table might help us, since the vocabulary is backward compatible. I would have thought that all deprecated concepts are maintained and “mapped to” a new standard within a single hop, but that is not what we found after a brief review. I also thought about incremental loads, which over time may need to upgrade their vocabulary without a full reload. @DTorok, I was told that you may have some valuable insight. Of course, we welcome feedback from anyone!

Cheers,
David Blatt


(Don Torok) #2

@dblatt, we have looked at doing incremental loads and concluded that updating visits, observation periods, payer plan periods, and concepts was more work than re-running the complete ETL.


(David Blatt) #3

Thanks @DTorok. Any ideas about the vocabulary issue?


(Don Torok) #4

Again, the solution for the stale vocabulary is the same: re-run with your current vocabulary.

When a concept gets deprecated, ‘D’, there is no mapping to a replacement. When a concept is updated, ‘U’, you will find a ‘Replaced by’ relationship.

For deprecated concepts, it makes a difference whether you are talking about the source concept or the ‘Maps to’ standard concept. For example, if a drug is no longer available, the NDC code may be deprecated. There is no replacement for the NDC code, but there is still a mapping from the deprecated NDC to RxNorm, because if you see the NDC code in your source you assume it was recorded before the drug was discontinued. But if the ‘Maps to’ RxNorm code is deprecated, there is no concept_relationship that gives a replacement. The source codes that had a ‘Maps to’ the now-deprecated RxNorm code will be mapped to a different RxNorm code. So indirectly there is some type of replacement relationship, but nothing is recorded in concept_relationship.

Note: I used RxNorm to represent what would more accurately be defined as a standard concept with a domain_id equal to Drug.
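
The rules above can be sketched as a small lookup. This is only an illustration under stated assumptions, not an official OHDSI routine: the concept IDs and rows are made up, the relationship name `Concept replaced by` is assumed for what the post calls ‘Replaced by’, and `resolve_standard` is a hypothetical helper. It follows ‘Maps to’ first, then the replacement relationship, for as many hops as needed, and gives up when a deprecated concept has nothing recorded.

```python
from typing import Optional

# Hypothetical rows: concept_id -> (standard_concept, invalid_reason)
CONCEPT = {
    1: (None, "D"),  # deprecated source code, e.g. a discontinued NDC
    2: ("S", None),  # valid standard concept
    3: (None, "U"),  # updated concept
    4: (None, "U"),  # updated a second time
    5: ("S", None),  # valid standard concept
    6: (None, "D"),  # deprecated former standard concept, no replacement
}

# Hypothetical rows: (concept_id_1, relationship_id) -> concept_id_2
RELATIONSHIP = {
    (1, "Maps to"): 2,
    (3, "Concept replaced by"): 4,  # assumed relationship name
    (4, "Concept replaced by"): 5,
}


def resolve_standard(concept_id: int, max_hops: int = 10) -> Optional[int]:
    """Return a valid standard concept reachable from concept_id, or None."""
    seen = set()
    current = concept_id
    for _ in range(max_hops):
        # Stop on unknown concepts and on cycles.
        if current in seen or current not in CONCEPT:
            return None
        seen.add(current)
        standard, invalid = CONCEPT[current]
        if standard == "S" and invalid is None:
            return current
        # Prefer 'Maps to'; fall back to the replacement relationship.
        nxt = RELATIONSHIP.get((current, "Maps to"))
        if nxt is None:
            nxt = RELATIONSHIP.get((current, "Concept replaced by"))
        if nxt is None:
            return None  # nothing recorded in concept_relationship
        current = nxt
    return None
```

With these rows, concept 1 resolves in one hop, concept 3 needs two hops, and concept 6 cannot be resolved at all, which is the case where only a re-run against the current vocabulary helps.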


(David Blatt) #5

@DTorok Thank you very much.

  • A deprecated code will never be standard at a given moment in time but can still be mapped to a standard, correct? It might also be mapped to another deprecated code… (unsure)

  • Codes that shift to a new ID are “U” updated

  • No question that a reload is less complex. We only do reloads and work to make that process more effective.

  • We build a standard_concept table that is vocab version specific. I think there may be corner cases that need to be considered. The query that builds it only considers the “maps to” relationship and is only used for a full load.
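
A version-specific lookup of this kind might be built roughly as follows. This is a minimal sketch using an in-memory SQLite database, not the actual query described above: the column layout of `standard_concept`, the helper names, the demo rows, and the `vocab_version` label are all assumptions. The `unmapped` helper surfaces one corner case, concepts that have no valid ‘Maps to’ target.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create tiny, made-up concept and concept_relationship tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE concept ("
        "concept_id INTEGER, standard_concept TEXT, invalid_reason TEXT)"
    )
    conn.execute(
        "CREATE TABLE concept_relationship ("
        "concept_id_1 INTEGER, concept_id_2 INTEGER, "
        "relationship_id TEXT, invalid_reason TEXT)"
    )
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?, ?)",
        [(1, None, "D"), (2, "S", None), (6, None, "D")],
    )
    conn.executemany(
        "INSERT INTO concept_relationship VALUES (?, ?, ?, ?)",
        [(1, 2, "Maps to", None), (2, 2, "Maps to", None)],
    )
    return conn


def build_standard_concept(conn: sqlite3.Connection, vocab_version: str) -> None:
    """Rebuild the lookup from valid 'Maps to' rows, tagged with a version."""
    conn.execute("DROP TABLE IF EXISTS standard_concept")
    conn.execute(
        "CREATE TABLE standard_concept ("
        "source_concept_id INTEGER, standard_concept_id INTEGER, "
        "vocab_version TEXT)"
    )
    conn.execute(
        """
        INSERT INTO standard_concept
        SELECT c.concept_id, t.concept_id, ?
        FROM concept c
        LEFT JOIN concept_relationship cr
               ON cr.concept_id_1 = c.concept_id
              AND cr.relationship_id = 'Maps to'
              AND cr.invalid_reason IS NULL
        LEFT JOIN concept t
               ON t.concept_id = cr.concept_id_2
              AND t.standard_concept = 'S'
              AND t.invalid_reason IS NULL
        """,
        (vocab_version,),
    )


def unmapped(conn: sqlite3.Connection) -> list:
    """Concepts with no valid standard target in this vocabulary version."""
    rows = conn.execute(
        "SELECT source_concept_id FROM standard_concept "
        "GROUP BY source_concept_id "
        "HAVING COUNT(standard_concept_id) = 0 "
        "ORDER BY source_concept_id"
    )
    return [r[0] for r in rows]
```

Calling `build_standard_concept(conn, "demo-version")` on the demo database and then `unmapped(conn)` lists the concepts a full load would leave without a standard concept, so they can be reviewed before merging sources built on different vocabulary versions.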

Again, thank you for your response.

