Concepts with a VALID_END_DATE <= NOW() and INVALID_REASON IS NULL

I had previously been under the impression that if a concept is “invalid”, it has “invalid_reason” populated and “valid_end_date” set to the date of deprecation; and, conversely, that if “valid_end_date” is NOT the default of 2099-12-31, then “invalid_reason” should be populated.
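For illustration, that expectation can be written as a two-way consistency check. This is only a minimal sketch using Python's built-in sqlite3 module: the rows are made up, the table is a cut-down stand-in for the OMOP concept table, and the function name find_mismatches is hypothetical.

```python
import sqlite3

DEFAULT_END_DATE = "2099-12-31"  # default valid_end_date mentioned above

# Hypothetical rows for illustration only; not taken from any vocabulary release.
ROWS = [
    # (concept_id, vocabulary_id, valid_end_date, invalid_reason)
    (1, "DemoVocab", "2099-12-31", None),  # valid, default end date
    (2, "DemoVocab", "2015-06-30", "D"),   # deprecated and end-dated
    (3, "DemoVocab", "2015-06-30", None),  # end-dated but invalid_reason is NULL
    (4, "DemoVocab", "2099-12-31", "U"),   # invalid_reason set, default end date
]


def find_mismatches(conn):
    """Return (concept_id, issue) for rows where the two fields disagree."""
    sql = """
        select concept_id,
               case
                   when invalid_reason is null
                       then 'end-dated without invalid_reason'
                   else 'invalid_reason with default end date'
               end as issue
        from concept
        where (invalid_reason is null and valid_end_date <> :default_end)
           or (invalid_reason is not null and valid_end_date = :default_end)
        order by concept_id
    """
    return conn.execute(sql, {"default_end": DEFAULT_END_DATE}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table concept ("
    "concept_id integer, vocabulary_id text, "
    "valid_end_date text, invalid_reason text)"
)
conn.executemany("insert into concept values (?, ?, ?, ?)", ROWS)

mismatches = find_mismatches(conn)
for concept_id, issue in mismatches:
    print(concept_id, issue)
```

Under the expectation described above, such a check would return no rows; in this toy data it flags concepts 3 and 4.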

Today I was surprised to discover that this query (on vocabulary v5.0 27-FEB-25):

select 
    vocabulary_id, 
    standard_concept,
    count(*)
from 
    vocab.concept
where 
    invalid_reason is null 
    and valid_end_date <= now()
group by 
    vocabulary_id,
    standard_concept
;

returns these results:

vocabulary_id          standard_concept   count
CMS Place of Service   S                      1
CMS Place of Service   (blank)                3
CPT4                   C                    324
CPT4                   S                   1719
CPT4                   (blank)              431
HCPCS                  S                   3028
HCPCS                  (blank)              969
ICD10PCS               S                   4745
ICD10PCS               (blank)               29
ICD9Proc               S                      5
ICD9Proc               (blank)                1

At first I was worried that something had gone awry in our version of the vocabularies, but I ultimately discovered a forum post describing why this happened for ICD10PCS: “ICD10PCS: bringing back deprecated codes”. In the post, there’s a comment that also alludes to this being the case for CPT4 and HCPCS.

My colleague wisely thought to consult the Book of OHDSI and found that Chapter 5 has this documented:

  • Reused code for another new concept
    • Description: The vocabulary reused the concept code of this deprecated concept for a new concept.
    • VALID_START_DATE: Day of instantiation of concept, if that is not known day of incorporation of concept in Vocabularies, if that is not known 1970-1-1.
    • VALID_END_DATE: Day in the past indicating deprecation, or if that is not known day of vocabulary refresh where concept in vocabulary went missing or set to inactive.
    • INVALID_REASON: “R”

However, I can’t find any other documentation referencing the use of “R” – and I’m only seeing values of D, U and NULL in the field (at least, for the vocabularies we’ve downloaded).

So I have a couple of questions:

  • Are these concepts I’ve flagged instances where the source vocabularies “replaced” / “reused” their own codes?
  • Should these have invalid_reason = 'R'?
  • Could we update documentation to make this clearer for future folks? :slight_smile:

Thanks!

Hi Will!
Reuse (by the source) and resurrection (in OMOP) are different problems.
Those that you found were not reused, and therefore should not be marked with “R”.
The ones that should be marked with “R” are not marked at the moment, because this change hasn’t been implemented yet. So the Book of OHDSI is not that outdated. In this case it describes the future :grinning:

This article should help.

Hi Alexander!

Thanks for the quick response - this is helpful. Good to know about the Book of OHDSI describing the desired future state!

For the “zombies” that is helpful - I think I’m understanding the distinction. My question is this: the GitHub article specifically calls out that these concepts should have standard_concept = 'S', but I’m seeing instances where they are non-standard or classification. I can see how the same logic applies to the classification concepts, but I’m not sure about the non-standard ones. Is there a reason those fall into the category of having invalid_reason as NULL?

@wtroddy Good catch!
These are one-legged zombies.
It happens because the decisions to make concepts zombies are applied on the vocabulary level.
What happens next on an individual vocabulary run is:

  • we make them zombies (Standard and valid even though the end_date is in the past)
  • some of these zombies get mappings to the proper Standard targets, which makes them non-Standard
  • but we still leave them valid even though we no longer need the rule exception
  • the generic_update function in the integration step passes them through because it believes they’re true zombies.
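The steps above suggest one way to spot one-legged zombies in a downloaded vocabulary. This is a rough sketch, not the vocabulary team's actual logic: it assumes the mapping shows up as a live “Maps to” row in concept_relationship pointing at a different concept, it uses Python's built-in sqlite3 with made-up rows, and the function name find_one_legged_zombies and the fixed as_of date are hypothetical.

```python
import sqlite3

# Hypothetical rows for illustration only.
CONCEPTS = [
    # (concept_id, standard_concept, valid_end_date, invalid_reason)
    (10, "S", "2020-01-01", None),   # true zombie: Standard, valid, end date in the past
    (11, None, "2020-01-01", None),  # one-legged zombie: mapped elsewhere, still valid
    (12, None, "2020-01-01", "D"),   # properly deprecated
    (13, None, "2020-01-01", None),  # non-standard, no mapping: not flagged by this sketch
    (20, "S", "2099-12-31", None),   # mapping target
]
RELATIONSHIPS = [
    # (concept_id_1, concept_id_2, relationship_id, invalid_reason)
    (10, 10, "Maps to", None),
    (11, 20, "Maps to", None),
    (12, 20, "Maps to", None),
    (20, 20, "Maps to", None),
]


def find_one_legged_zombies(conn, as_of):
    """Return ids of valid, end-dated, non-standard concepts that map to another concept."""
    sql = """
        select c.concept_id
        from concept c
        where c.invalid_reason is null
          and c.valid_end_date <= :as_of
          and c.standard_concept is null
          and exists (
              select 1
              from concept_relationship r
              where r.concept_id_1 = c.concept_id
                and r.relationship_id = 'Maps to'
                and r.invalid_reason is null
                and r.concept_id_2 <> c.concept_id
          )
        order by c.concept_id
    """
    return [row[0] for row in conn.execute(sql, {"as_of": as_of})]


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table concept ("
    "concept_id integer, standard_concept text, "
    "valid_end_date text, invalid_reason text)"
)
conn.execute(
    "create table concept_relationship ("
    "concept_id_1 integer, concept_id_2 integer, "
    "relationship_id text, invalid_reason text)"
)
conn.executemany("insert into concept values (?, ?, ?, ?)", CONCEPTS)
conn.executemany("insert into concept_relationship values (?, ?, ?, ?)", RELATIONSHIPS)

one_legged = find_one_legged_zombies(conn, as_of="2025-03-01")
print(one_legged)
```

Dates are stored as ISO strings here so that plain string comparison orders them correctly; against a real database the comparison would use the date type and now(), as in the query at the top of the thread.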

@m-khitrun we need to fix that at some point. Would you mind creating the GitHub issue?

Hi!

@wtroddy thank you for reporting this. @Alexdavv sure, we need to fix this.

Thanks, Alexander! That makes perfect sense how they came to be. Looking forward to the future fix for this.

Just a thought - I know there’s some discussion about a Book of OHDSI 2.0. It might be worthwhile to think about how to incorporate some of the topics from the vocabulary GitHub wiki, or at least point to it as an additional resource. :slight_smile:

Hi @Christian_Reich @sseager Do you have it on your list?