
Proposal to keep outdated standard concepts active and standard

(Dmytry Dymshyts) #1

We reviewed the logic of standard concept deprecation.
So far, when the source removes a concept from its list of active concepts, we deprecate it, regardless of the reason for the deprecation:
incorrect concept: a duplicate, a wrong description (e.g. wrong drug dosage), some weird classification branch, etc.
correct, but outdated for other reasons
For example, the latest CPT4 release makes the “76645 | Ultrasound of breasts” concept outdated (maybe it’s not a billing code anymore because you need to specify some other details), and likewise “97001 | Physical therapy evaluation”; there are also more granular concepts.
But the thing is that these procedures really happened, and these codes have existed in patient data for years and might be included in protocols, concept sets, etc. So it’s totally OK to have them in the data as standard concepts.

The proposal is to keep these concepts standard and active for CPT4, HCPCS and other vocabularies with the same outdating principles, and to “resurrect” them in our vocabularies by setting invalid_reason = null and standard_concept = ‘S’.

What do you guys think?
@Christian_Reich, @ericaVoss, @TBanokina, @Mark_Danese, @IYabbarova

(Don Torok) #2

In doing our ETL we trust that the Maps to relationship is correct and do not look at the invalid reason for the concept, since it may have been valid at one time. So it will not make a difference for the ETL. Obviously, part of the problem is that D is overloaded with meanings. It may be time to add another value, such as ‘O’ for obsolete.

(Christian Reich) #3


Yeah. This has come up from time to time.

The question is do we have use cases where we need the official end date of a concept like that. I can’t think of one.

(Christian Reich) #4

This is about those codes that are Standard Concepts, Don. So, if the source organization deprecates them (e.g. when a CPT-4 code is no longer available for billing), the code becomes invalid in the vocabulary, and then it stops being a Standard Concept. And then you can’t have it anymore in data tables, even though it was totally fine till they pulled it.

Folks have used different workarounds: they created another copy of the concept and made it active and Standard, or they just used the deprecated concept as is. Neither one is good.

(Mark Danese) #5

We just learned that HCPCS codes can be reused 4 years after deprecation. So, technically, it might be important to know the valid years for a code. This reuse applies to NDC as well, although I don’t think there is any official set of dates for an NDC code.
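If code strings can be reused, a lookup by code alone becomes ambiguous and the event date has to take part in the match. A minimal sketch of that idea, assuming each concept row carries valid_start_date and valid_end_date; the code "X0001", the ids and the dates are made up for illustration and are not real HCPCS content:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ConceptRow:
    concept_id: int          # made-up id
    concept_code: str
    concept_name: str
    valid_start_date: date
    valid_end_date: date


def resolve_code(rows, code, event_date) -> Optional[ConceptRow]:
    """Return the row for `code` whose validity window contains `event_date`."""
    for row in rows:
        if (row.concept_code == code
                and row.valid_start_date <= event_date <= row.valid_end_date):
            return row
    return None


# Hypothetical example: the same code string assigned twice, with a gap.
ROWS = [
    ConceptRow(1, "X0001", "Old meaning", date(2005, 1, 1), date(2010, 12, 31)),
    ConceptRow(2, "X0001", "New meaning", date(2015, 1, 1), date(2099, 12, 31)),
]
```

An event dated inside the gap resolves to nothing, which is itself worth surfacing.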

Other than the above issues, I am not sure of the utility in describing codes as “deprecated” since we are always looking at the code used at the time of the clinical event. If we were designing an EMR, it would be important. But not for “retrospective” data.

(Vojtech Huser) #6

A data quality script may find it useful that a concept is outdated and the end date (when it became outdated). It may flag the data as “strange”.
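Such a check could be as simple as comparing each record's date with the concept's end date. A rough sketch, assuming records and concepts are available as plain dictionaries; all ids, codes and dates below are made up:

```python
from datetime import date

# Made-up concept metadata for illustration; not real vocabulary content.
CONCEPTS = {
    1001: {"concept_code": "X0001", "valid_end_date": date(2016, 12, 31)},
    1002: {"concept_code": "X0002", "valid_end_date": date(2099, 12, 31)},
}

# Made-up clinical records.
RECORDS = [
    {"record_id": 1, "concept_id": 1001, "event_date": date(2015, 3, 2)},
    {"record_id": 2, "concept_id": 1001, "event_date": date(2018, 5, 9)},
    {"record_id": 3, "concept_id": 1002, "event_date": date(2018, 5, 9)},
]


def flag_strange(records, concepts):
    """Return records whose event date falls after the concept's end date."""
    flagged = []
    for rec in records:
        concept = concepts.get(rec["concept_id"])
        if concept is not None and rec["event_date"] > concept["valid_end_date"]:
            flagged.append(rec)
    return flagged
```

Here only record 2 is flagged: it uses a concept after that concept went out of use.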

(Dmytry Dymshyts) #7

It looks like, if a concept is outdated, we need to:
keep the concept Standard,
but also reflect the out-of-use date,
and put “O” or another flag as the invalid reason.

Actually, this requires a lot of changes in our vocabulary building and maintenance algorithms.
@Christian_Reich, let’s think about whether it’s really needed, or whether we just make a “dirty” change for them:
end_date in the year 2099, and invalid_reason = null?

(Michael Kahn) #8

I am not sure I am completely following the discussion so the following comment may be incorrect: From the postings, I believe the current approach is that an outdated concept is no longer considered STANDARD and is removed from the MAPS TO relationship. Is this correct? If not, don’t bother reading further.

If true, my question is: Will the outdated concept also be removed from the CONCEPT_ANCESTOR table? If this is true, then I can foresee a real disaster in existing/legacy queries in data warehouses that retain the legacy concept (STANDARD @ ETL but now outdated). Queries that reference higher level categories that use the CONCEPT_ANCESTOR table to dynamically expand to leaf-level STANDARD concepts would fail to work correctly. Is this correct?
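To make the concern concrete, here is a small sketch using an in-memory SQLite database. The tables are cut down to the columns needed, and all ids are made up: 100 stands for a higher-level concept, 101 for a concept that became outdated after the ETL, 102 for one that is still current.

```python
import sqlite3


def build_demo(keep_outdated_links):
    """Create a toy database; optionally keep the ancestor row of the outdated concept."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept_ancestor (
            ancestor_concept_id INTEGER,
            descendant_concept_id INTEGER
        );
        CREATE TABLE procedure_occurrence (
            procedure_occurrence_id INTEGER,
            procedure_concept_id INTEGER
        );
        -- Legacy data: one record per concept, loaded when both were standard.
        INSERT INTO procedure_occurrence VALUES (1, 101), (2, 102);
        INSERT INTO concept_ancestor VALUES (100, 102);
    """)
    if keep_outdated_links:
        conn.execute("INSERT INTO concept_ancestor VALUES (100, 101)")
    return conn


def count_records_under(conn, ancestor_concept_id):
    """Expand a higher-level concept via concept_ancestor and count matching records."""
    sql = """
        SELECT COUNT(*)
        FROM procedure_occurrence po
        JOIN concept_ancestor ca
          ON ca.descendant_concept_id = po.procedure_concept_id
        WHERE ca.ancestor_concept_id = ?
    """
    return conn.execute(sql, (ancestor_concept_id,)).fetchone()[0]
```

With the ancestor row kept, the query finds both records; once it is removed, the same query silently returns only one.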

(Tatiana) #9

Invalid_reason = ‘O’ may cause problems with already converted datasets. I know that some people apply both filters standard_concept = ‘S’ and invalid_reason is null for validating target concepts.

I like this idea because currently I see that mapping rate on the same data is getting worse with the new vocabulary release. I think it is important to be able to work with the old records.
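The double filter mentioned above can be sketched as follows; the dictionaries are hypothetical stand-ins for rows of the concept table:

```python
def is_valid_target(concept):
    """Mimic the filter: standard_concept = 'S' AND invalid_reason IS NULL."""
    return concept["standard_concept"] == "S" and concept["invalid_reason"] is None


# An outdated concept flagged with the proposed 'O' value (hypothetical).
flagged_obsolete = {"standard_concept": "S", "invalid_reason": "O"}
# The same concept if invalid_reason is simply left null.
kept_null = {"standard_concept": "S", "invalid_reason": None}
```

A concept flagged ‘O’ fails this check even though it is still marked ‘S’, while leaving invalid_reason null keeps such validations working unchanged.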

(Dmytry Dymshyts) #10


Yes. It does. (it’s what actually happens now)

So now you see why I came up with this proposal:)

(Dmytry Dymshyts) #11


so keep it null, but the end date = “the out-of-use date”

that’s exactly what I’m talking about here ;)

(Christian Reich) #12


You are correct, and I read on. But it is not that simple.

Yes, it will be removed. But there are two cases really here in play:

  1. Source codes get mapped to a different standard concept. This is the case for Drug and Condition, since (except Charlie) nobody codes in RxNorm or SNOMED. So, if a Standard Concept dies, all the sources will be remapped to new Standards, and we are all set.

  2. Source codes are Standard Concepts. Unless we have the case of an upgrade (there is a new Standard Concept), we have a problem, because the now dead Concept has nowhere to go and gets mapped to 0. We lose data, even though the code and Concept were fine till very recently. That’s the use case we need to tackle. The hierarchy follows that just fine.
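The two cases can be illustrated with a toy lookup, assuming the Maps to links are held in a plain dictionary and that an unmapped code falls back to concept 0; all ids are made up:

```python
def map_to_standard(source_concept_id, maps_to):
    """Return the standard concept a source concept maps to, or 0 if there is none."""
    return maps_to.get(source_concept_id, 0)


# Case 1: source concept 11 mapped to standard 20, which died and was
#         replaced, so 11 is remapped to the new standard 21.
# Case 2: concept 12 was its own standard; after deprecation the link is gone.
MAPS_TO_BEFORE = {11: 20, 12: 12}
MAPS_TO_AFTER = {11: 21}
```

After the release, concept 11 still lands on a standard concept, while concept 12 ends up at 0.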

(Christian Reich) #13

Not sure we need to. Our valid_end_date has nothing to do with the fact that drugs are off the market, NDCs are no longer used or CPT-4 codes are not accepted for billing anymore. Our valid_end_date means that the semantic meaning of a concept is no longer upheld. So, from that perspective we should keep as is.

Right now the discussion in THEMIS is to create lists outside the standard Vocab tables for those ultra-rare cases. So, I wouldn’t abuse the valid_end_date any longer for this situation.

(Michael Kahn) #14


  1. “source codes get mapped to a different standard concept.” True for those who do a complete ETL/code mapping with every new vocabulary release. Not true for those of us who are using OMOP as their core data warehouse model that do daily incremental loads. In this setting, the old standard concepts remain.

1a. ETL processing could be altered to remap every source code to the most-recent standard concept each time the vocabulary is refreshed. This would be a pretty massive remapping job at every vocabulary update.

1b. Still breaks those historical (often repeated) queries that reference terminal standard concept_id’s directly or indirectly. By indirectly, I’m thinking about queries that reference a Classification concept as an entry point into the concept_ancestor table where the Classification concept is retired. All of these would break.

  2. Consider adding a standard_concept value of “H” (beyond just “S” and “C”) for historical and keep these concept_ids, mappings and ancestors always in place but marked as H. Maybe this is just another way of stating what others are proposing. The bottom line would be to NEVER get rid of any code, MAPS TO, or ancestor map that existed previously. I can only begin to imagine what this would do to the current terminology creation process…

(Dmytry Dymshyts) #15

@Christian_Reich, @mgkahn,
OK, at least one rule is defined exactly:
our valid_end_date value together with invalid_reason = ‘D’ means that the concept is wrong: wrong parsing of a drug dosage and hence a wrong RxNorm Extension concept, or a wrong SNOMED entity (duplicate or nonsense).
In this case it’s totally OK to remove these concepts from the Ancestor table and not to map to these concepts anymore (as it is now).

But second: what do we do with outdated concepts (for example, ICD10CM changed its classification and now this disorder belongs to a different sub-chapter, but, say, 2 years ago doctors used this code to encode the disease)?
To be technically correct we need to create another “end_date” column, a kind of “out_of_use_date”, and keep standard_concept = ‘S’ (nobody will rewrite existing queries to change ‘S’ to (‘S’, ‘H’); you process Historical and Active concepts in the same way anyway).
I think people are also used to searching for concepts using “where… invalid_reason is null”.
So it is less painful not to touch the invalid_reason logic, but to add another field, “active_status”.
So what I propose: add 2 new columns, “active_status” and “out_of_use_date”.
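A rough sketch of how the two proposed columns would sit next to the existing fields. The types (a boolean and an optional date) and the helper name are assumptions for illustration; nothing here is part of the current CDM:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Concept:
    concept_id: int
    standard_concept: Optional[str]          # existing field: 'S', 'C' or None
    invalid_reason: Optional[str]            # existing field: None, 'D' or 'U'
    active_status: bool = True               # proposed column (type assumed)
    out_of_use_date: Optional[date] = None   # proposed column (type assumed)


def retire_at_source(concept, when):
    """Mark a correct-but-outdated concept without touching the existing fields."""
    concept.active_status = False
    concept.out_of_use_date = when
    return concept
```

Existing queries that filter on standard_concept = 'S' and invalid_reason is null would keep returning the concept; only new queries that care about source status need to look at the extra columns.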

@TBanokina, what can you say from ETL perspective about this approach?

(Dmytry Dymshyts) #16

So what? :)

(Christian Reich) #17

Oh. I see. We need to figure out something. Otherwise you will deviate increasingly from the standard, which will put your ability to do network studies at risk. This is a big deal.

That’s what folks do who don’t update continuously. A full job on all the data. Are you going to run into infrastructure problems, or why would you not want to do that?

Well, “break” also means “being fixed”. Goes both ways.

But yes, we need to think about a mechanism to update a query. I think that can be automated.

The problem is we don’t control that. Most of our content comes from the sources. If SNOMED updates concepts or links we cannot just ignore that.

Exactly. So, the solution is that we need to string the CDM data along and update the queries. No way around it.

What’s the use case? Why would we need to know it’s no longer in use? The only case I can think of is for purposes of debugging cohort definitions trying to identify the reasons for change over time. Is that it?

I also see a lot of “where sysdate between start_date and end_date” notations. We will have to do a THEMIS convention.

(Vojtech Huser) #18

I can see me voting Yes for the proposal

So what I propose: Add 2 new columns: “active_status” and “out_of_use_date”.

So the CPT4 concept “76645 | Ultrasound of breasts”, which is correct but outdated, has active=0 and out_of_use_date=‘2017-07-01’ and keeps the ‘S’ standard status.

(Dmytry Dymshyts) #19

As for me, the final idea is:

valid_end_date - the date the source outdated the concept, or the date we removed it from the vocabulary because it was added by mistake
Add a new value of invalid_reason:
M - concept deprecated because it was added by mistake
D - outdated by the source
U - updated concept
standard_concept = ‘S’ for concepts with invalid_reason = null or invalid_reason = ‘D’,
standard_concept = ‘C’ respectively for classification concepts
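The rule above, written out as a small function. That ‘M’ and ‘U’ concepts end up with a null standard_concept is my reading of the list, not something stated explicitly, and the function name is made up:

```python
def standard_concept_for(invalid_reason, is_classification=False):
    """Derive standard_concept from invalid_reason under the proposed scheme.

    None or 'D' (outdated by the source) keeps the concept standard;
    'M' (added by mistake) and 'U' (updated) are assumed to become non-standard.
    """
    if invalid_reason in (None, "D"):
        return "C" if is_classification else "S"
    return None
```

For example, standard_concept_for("D") keeps ‘S’, whereas under today's behaviour a ‘D’ concept would lose its standard flag.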

(Christian Reich) #20


So, the real difference is that D no longer invalidates a Concept and turns it from a Standard to a non-standard one. Correct? While M does it the same way D does it today.

We could just not turn a concept to D if the source wants to deprecate it. Just keep it going. Invalid_reason=null, but valid_end_date something before 2099. Wouldn’t that do the trick?