
How current are the ICD-10-CM vocabulary files available for download?

I am new to the forum, so please forgive me if this is old news. I am working on current projects implementing OMOP and FHIR. The problem I am having is that the ICD-10-CM vocabulary is not up to date with the latest release. As a result, about 22% of the conditions loaded for the past 15 months have no matching concept_id.

I could load the missing content in my own instance, but this would not benefit others who expect the concept_id to be standard across all OMOP CDM instances.

Thanks,
Jeff


Source data codes:

Creating local concept id

  • Create a table with the following fields, as described in http://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm:concept
    Some options:
    ---- concept_name: the code descriptor (for ICD10, the long descriptor)
    ---- domain_id: best practice is to use the ‘Observation’ domain, though that is only an opinion.
    ---- vocabulary_id: use whichever of the existing vocabulary IDs corresponds best; see http://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm:vocabulary
    ---- standard_concept: NULL
    ---- concept_code: the source code
    ---- valid_start_date: 1-Jan-1970
    ---- valid_end_date: 31-Dec-2099
    ---- invalid_reason: NULL for now. If the OMOP community later supports the code you are interested in, mark your local concept as invalid and replace its concept id with the community’s concept id in your post-ETL data.

How to assign concept id:
Count up starting from 2,000,000,001. As you add more rows, make sure each id is unique.

Required reading:

http://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm:data_model_conventions
http://www.ohdsi.org/web/wiki/doku.php?id=documentation:cdm:concept

Values for concept_ids generated as part of Standardized Vocabularies will be reserved from 0 to 2,000,000,000. Above this range, concept_ids are available for local use and are guaranteed not to clash with future releases of the Standardized Vocabularies.
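
The recipe above can be sketched in Python with the standard-library sqlite3 module. This is only a minimal illustration, not an official OHDSI tool: the table holds just the CONCEPT fields listed in the post, ‘ICD10CM’ is assumed as the best-fitting vocabulary_id, and the helper names `next_local_concept_id` and `add_local_concept` are made up for this example.

```python
import sqlite3

# Concept ids up to 2,000,000,000 are reserved for the Standardized Vocabularies.
LOCAL_ID_FLOOR = 2_000_000_000


def next_local_concept_id(conn):
    """Return the next unused concept_id in the local range (hypothetical helper)."""
    (max_id,) = conn.execute(
        "SELECT MAX(concept_id) FROM concept WHERE concept_id > ?",
        (LOCAL_ID_FLOOR,),
    ).fetchone()
    # MAX() is NULL when no local concept exists yet, so start just above the floor.
    return (max_id or LOCAL_ID_FLOOR) + 1


def add_local_concept(conn, code, name, domain_id="Observation"):
    """Insert one local concept using the field values suggested in the post."""
    concept_id = next_local_concept_id(conn)
    conn.execute(
        # standard_concept and invalid_reason are left NULL.
        "INSERT INTO concept VALUES (?, ?, ?, ?, NULL, ?, ?, ?, NULL)",
        (concept_id, name, domain_id, "ICD10CM", code, "1970-01-01", "2099-12-31"),
    )
    return concept_id


# Simplified CONCEPT table: only the fields listed in the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE concept (
        concept_id       INTEGER PRIMARY KEY,
        concept_name     TEXT NOT NULL,
        domain_id        TEXT NOT NULL,
        vocabulary_id    TEXT NOT NULL,
        standard_concept TEXT,
        concept_code     TEXT NOT NULL,
        valid_start_date TEXT NOT NULL,
        valid_end_date   TEXT NOT NULL,
        invalid_reason   TEXT
    )"""
)

first_id = add_local_concept(conn, "A92.5", "Zika virus disease", "Condition")
second_id = add_local_concept(conn, "C49.A", "Gastrointestinal stromal tumor", "Condition")
print(first_id, second_id)
```

Deriving the next id from the current maximum keeps ids unique as long as a single process does the loading; concurrent loaders would need a sequence or a lock instead.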

@jsjacobs, thanks for the inquiry,
@Gowtham_Rao, thanks for the help with a simple approach description.

New ICD10CM concepts with corresponding mappings to SNOMED will be added to the OMOP vocabulary in the future.
I’m wondering how you ended up with 22% of conditions missing (maybe it is due to the specificity of your dataset): ICD10CM added 2305 concepts in the last release, which is only about 2.5% of all ICD10CM concepts.
The approach of @Gowtham_Rao is quite good if you need these concepts before we add them officially to the vocabulary.
I want to note:
- domain_id: better to put “Condition”, as these are mostly diseases (A92.5 Zika virus disease, C49.A Gastrointestinal stromal tumor, etc.), but if you don’t know, “Observation” is the best option.
- valid_start_date: “1-Oct-2016”, the date when these concepts were added (if you don’t know whether all of your codes are new concepts, use “1-Jan-1970” as the valid_start_date instead).
Also, ICD 10 and ICD-10-CM are different vocabularies.
ICD 10 is the WHO version (15297 concepts), and ICD-10-CM is the US extension (91737 concepts).

@jsjacobs:

I wouldn’t add ICD10CM concepts, for the reason you are mentioning: your concept_ids will not be persistent. In fact, they may well be overwritten, unless you use ids above 2 billion as @Gowtham_Rao suggests.

Can you wait a day or two?

Going back and reviewing the instance based on your and Gowtham’s comments about the domain_id of “Observation”, I found that many of the ICD-10-CM concept codes actually did already exist, but were identified as Observations and not Conditions. I had naively assumed that diagnoses reported with ICD-10-CM were always Conditions. I had added them starting at 2,000,000,001 as Gowtham recommended, so it is no problem for me to go back and clean them up where needed. The question is whether these all remain Condition Occurrences that just use the domain id of Observation, or whether they should now be mapped to Observations in OMOP.

Anything worth doing is worth doing well. :smile:

Thanks,
Jeff

Yep, some ICD10CM concepts are Observations, for example:

W61.4 Contact with turkey
W22.01XA Walked into wall, initial encounter
V97.1 Person injured while boarding or alighting from aircraft
and so on.
These concepts are therefore mapped to the Observation domain, and it’s totally OK for them to be in the Observation table.
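
As a rough illustration of that routing, the sketch below picks a CDM table from a concept’s domain. The `CODE_DOMAIN` dictionary is toy data standing in for a lookup against the CONCEPT table (with the domains as described in this thread), and `target_table` is a hypothetical helper name, not part of any OHDSI package.

```python
# Domain-to-table routing in the OMOP CDM (only a subset of domains is shown).
DOMAIN_TO_TABLE = {
    "Condition": "condition_occurrence",
    "Observation": "observation",
    "Procedure": "procedure_occurrence",
    "Measurement": "measurement",
}

# Toy data: in a real ETL the domain would come from the CONCEPT table.
CODE_DOMAIN = {
    "A92.5": "Condition",     # Zika virus disease
    "W61.4": "Observation",   # Contact with turkey
    "V97.1": "Observation",   # Person injured while boarding or alighting from aircraft
}


def target_table(icd10cm_code):
    """Return the CDM table for a code, or None if the code or domain is unknown."""
    domain = CODE_DOMAIN.get(icd10cm_code)
    return DOMAIN_TO_TABLE.get(domain)


for code in ("A92.5", "W61.4", "Z99.99"):
    print(code, target_table(code))
```

Returning None for unknown codes, rather than guessing a table, leaves those records visible for manual review.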

@jsjacobs:

As @Dymshyts says: We are reassigning the “essence” of these concepts. A Condition should be a diagnosis, or a sign or symptom of a disease. Vocabularies used in billing and reporting notoriously stray from that tight definition, because their primary use case requires it, even though they are called “Classification of Disease”. CPT4 and HCPCS are much worse: procedures are barely a majority in them, if that.

@Christian_Reich, is there a target date for ICD10CM to be published and available on the Athena page?

The full ICD10CM dataset will be refreshed at the beginning of June. It will include the 2305 new concepts and some fixes to existing mappings, but there will still be no mappings for these 2305 new concepts.
You can use them in your CDM anyway: fill condition_source_concept_id and condition_source_value with the ICD10CM concept_id and concept_code, and set condition_concept_id = 0.
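
A minimal sketch of that convention, assuming a plain dictionary per record: only a few CONDITION_OCCURRENCE fields are shown, `build_condition_row` is a hypothetical helper, and the source concept id used in the example call is invented for illustration.

```python
def build_condition_row(person_id, start_date, source_code,
                        source_concept_id, standard_concept_id=None):
    """Build a partial CONDITION_OCCURRENCE record (hypothetical helper).

    When no standard mapping exists yet, condition_concept_id is set to 0
    while the source fields still carry the ICD10CM concept and code.
    """
    return {
        "person_id": person_id,
        "condition_concept_id": standard_concept_id or 0,
        "condition_start_date": start_date,
        "condition_source_value": source_code,
        "condition_source_concept_id": source_concept_id,
    }


# 99999999 is a made-up ICD10CM concept_id used only for this example.
row = build_condition_row(1, "2017-03-15", "A92.5", 99999999)
print(row)
```

Because the source fields are preserved, the records can be remapped later, once the vocabulary ships mappings for the new codes, without going back to the source data.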

Just adding to this thread. In Athena, it looks like the latest refresh was April 28. Did the June refresh happen? We’d be happy to get an update.

We have found ~1,900 ICD10CM codes that are used in the VA and are really ICD10CM codes (I hand-checked a few dozen of these on icd10data.com, especially the very common and even the weird-looking ones), but aren’t in the CONCEPT table. For almost all of these, either a less granular ICD10CM code exists in the CONCEPT table (R97.20 is our code that isn’t in CONCEPT, but R97.2 does exist in CONCEPT) or sibling codes exist (D78.31 is our code that isn’t in CONCEPT, but D78.01, D78.11, D78.21, and D78.81 are in CONCEPT).

Here’s a link to the file with the missing ICD10 codes: https://www.dropbox.com/s/ryts5ossymu3w7k/Missing_ICD10_codes.xlsx?dl=0
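
The “less granular code exists” check described above can be approximated by trimming characters from the end of a code until a known code is found. This is only a sketch under that assumption: `nearest_known_ancestor` is a hypothetical helper, and `known_codes` holds just the codes named in this post rather than the real CONCEPT table.

```python
def nearest_known_ancestor(code, known_codes):
    """Return the longest known code that is a proper prefix of `code`, else None."""
    candidate = code
    # ICD-10-CM category codes are three characters long, so stop there.
    while len(candidate) > 3:
        # Drop the last character, and the decimal point if it is left dangling.
        candidate = candidate[:-1].rstrip(".")
        if candidate in known_codes:
            return candidate
    return None


# Codes reported in this post as present in CONCEPT.
known_codes = {"R97.2", "D78.01", "D78.11", "D78.21", "D78.81"}

print(nearest_known_ancestor("R97.20", known_codes))
print(nearest_known_ancestor("D78.31", known_codes))
```

With this toy set, R97.20 resolves to R97.2, while D78.31 has only siblings and no known ancestor. Falling back to a parent code loses granularity, so a check like this is better suited to triaging a list of missing codes than to remapping data.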

Looking at your examples:
select concept_code, concept_name
from concept
where concept_code in ('R97.20', 'D78.31')
  and vocabulary_id = 'ICD10CM'
  and invalid_reason is null;
result:
D78.31 Postprocedural hematoma of the spleen following a procedure on the spleen
R97.20 Elevated prostate specific antigen [PSA]
So they are in the latest vocabulary release.

But if you look at the concept_relationship table, you won’t find the mappings to SNOMED, that’s true. That is a separate issue, and we’re working on it.
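
To see which codes are present but still unmapped, one option is to look for ICD10CM concepts without a valid ‘Maps to’ row in concept_relationship. The sketch below runs that query against a tiny in-memory SQLite database; the tables are cut down to the columns the query needs, and all concept ids and the single mapping row are invented toy data, not real vocabulary content.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        concept_code TEXT,
        vocabulary_id TEXT,
        invalid_reason TEXT
    );
    CREATE TABLE concept_relationship (
        concept_id_1 INTEGER,
        concept_id_2 INTEGER,
        relationship_id TEXT,
        invalid_reason TEXT
    );
    -- Toy rows: the ids are made up for this example.
    INSERT INTO concept VALUES
        (1, 'R97.20', 'ICD10CM', NULL),
        (2, 'D78.31', 'ICD10CM', NULL),
        (3, 'R97.2',  'ICD10CM', NULL);
    -- Pretend only R97.2 has a mapping, to a hypothetical standard concept 100.
    INSERT INTO concept_relationship VALUES (3, 100, 'Maps to', NULL);
    """
)

UNMAPPED_SQL = """
SELECT c.concept_code
FROM concept c
LEFT JOIN concept_relationship r
       ON r.concept_id_1 = c.concept_id
      AND r.relationship_id = 'Maps to'
      AND r.invalid_reason IS NULL
WHERE c.vocabulary_id = 'ICD10CM'
  AND c.invalid_reason IS NULL
  AND r.concept_id_1 IS NULL      -- no valid mapping row was found
ORDER BY c.concept_code
"""

unmapped = [row[0] for row in conn.execute(UNMAPPED_SQL)]
print(unmapped)
```

The same SELECT should carry over to a real vocabulary database with little change, since it uses only a standard LEFT JOIN anti-join pattern.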
