Hi all!
I am currently implementing an ETL in Python to adapt some EHR records to OMOP, and I would like advice on one of our source vocabularies. It consists of custom codes, each of which groups a specific subset of ICD10CM codes. Long story short, these source codes list the chronic conditions of each patient. They do not always correspond directly to an OMOP standard concept, and going “uphill” through the relationships sometimes ends in concepts that cover the original ICD10CM codes plus additional unrelated conditions.
My main question is what the preferred approach is, if any, for dealing with this kind of case. I’ve searched the forums but could not find anything related; please point me to anything I may have missed.
Our current approach is to create a custom vocabulary entry and a custom concept_id for each source code, using the range above 2 billion reserved for local concepts. So far we have added one row to the VOCABULARY table and quite a few rows to the CONCEPT table for our new concepts. Cool. Then we add pairs of “Subsumes”/“Is a” relationships between each source code and its respective ICD10CM codes (each source code “Subsumes” several ICD10CM codes, and each of those ICD10CM codes “Is a” source code; please correct me if I’m wrong). The final mapping to the standard concepts is done through the ICD10CM codes, which are already mapped (thank you!). Does this seem correct?
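To make the approach above concrete, here is a minimal Python sketch of how the CONCEPT and CONCEPT_RELATIONSHIP rows could be generated. It is an illustration under assumptions, not our actual ETL: the function name `build_custom_rows`, the vocabulary id `LOCAL_CHRONIC`, the concept class `Local Class`, the source code `CHR01`, the validity dates, and the ICD10CM concept_ids (900001, 900002) are all hypothetical placeholders. In a real run the ICD10CM ids would be looked up in the CONCEPT table.

```python
from datetime import date

# Local concepts get concept_ids above 2 billion.
CUSTOM_ID_START = 2_000_000_000


def build_custom_rows(source_codes, icd10cm_ids, vocabulary_id="LOCAL_CHRONIC"):
    """Build CONCEPT and CONCEPT_RELATIONSHIP rows for custom source codes.

    source_codes: dict of source code -> (name, list of ICD10CM codes)
    icd10cm_ids:  dict of ICD10CM code -> OMOP concept_id (looked up beforehand)
    """
    concepts, relationships = [], []
    # Placeholder validity window (assumption).
    valid_from, valid_to = date(1970, 1, 1), date(2099, 12, 31)

    for offset, code in enumerate(sorted(source_codes), start=1):
        name, icd_codes = source_codes[code]
        custom_id = CUSTOM_ID_START + offset
        concepts.append({
            "concept_id": custom_id,
            "concept_name": name,
            "domain_id": "Condition",
            "vocabulary_id": vocabulary_id,
            "concept_class_id": "Local Class",  # hypothetical class id
            "standard_concept": None,           # non-standard source concept
            "concept_code": code,
            "valid_start_date": valid_from,
            "valid_end_date": valid_to,
            "invalid_reason": None,
        })
        for icd in icd_codes:
            # One pair of mirrored rows per (source code, ICD10CM code) link.
            for id_1, id_2, rel in (
                (custom_id, icd10cm_ids[icd], "Subsumes"),
                (icd10cm_ids[icd], custom_id, "Is a"),
            ):
                relationships.append({
                    "concept_id_1": id_1,
                    "concept_id_2": id_2,
                    "relationship_id": rel,
                    "valid_start_date": valid_from,
                    "valid_end_date": valid_to,
                    "invalid_reason": None,
                })
    return concepts, relationships


# Toy input: one source code grouping two ICD10CM codes (ids are placeholders).
concepts, relationships = build_custom_rows(
    {"CHR01": ("Aneurysms and arterial dissections", ["I71", "I72"])},
    {"I71": 900001, "I72": 900002},
)
```

The resulting lists of dicts can then be bulk-inserted with whatever database layer the ETL already uses.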
Also, since these source codes do not always have direct mappings to any standard concepts, can we make them standard in our local CDM instances? Does it matter outside network studies?
As an example, we have a code for aneurysms and arterial dissections that includes the ICD10CM codes I71, I72, I77.7, I77.8 and I79.0. After some SQL/Python fiddling with the CONCEPT and CONCEPT_RELATIONSHIP tables, I found that these codes could all be rolled up to the more general Disorder of Artery, OMOP-321887, but that concept also includes unrelated disorders such as abscesses, ulcers, and embolisms. This is not a good solution, since we would be including disorders that were never meant to be part of the source code.
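For anyone who wants to reproduce this kind of check, here is a rough Python sketch of the idea: find every concept that is an ancestor of all the target concepts, then list the descendants it would drag in that are not covered by the targets. It assumes the (ancestor_concept_id, descendant_concept_id) pairs have been pulled from the CONCEPT_ANCESTOR table, which includes a self-row for each concept. The function name `common_ancestors_with_extras` is hypothetical, and apart from 321887 every concept_id in the toy data is made up.

```python
from collections import defaultdict


def common_ancestors_with_extras(ancestor_rows, target_ids):
    """Return {ancestor_id: extra descendant ids} for every common ancestor.

    ancestor_rows: iterable of (ancestor_concept_id, descendant_concept_id)
    target_ids:    concept_ids the source code is meant to cover
    """
    descendants = defaultdict(set)
    for ancestor_id, descendant_id in ancestor_rows:
        descendants[ancestor_id].add(descendant_id)

    targets = set(target_ids)
    # Everything legitimately covered: the targets plus their own descendants.
    covered = set(targets)
    for target in targets:
        covered |= descendants.get(target, set())

    result = {}
    for ancestor_id, descs in descendants.items():
        if targets <= descs:  # ancestor of every target
            result[ancestor_id] = descs - covered - {ancestor_id}
    return result


# Toy hierarchy: 1 = aneurysm, 11 = a child of aneurysm, 2 = dissection,
# 3 = abscess of artery, 4 = arterial embolism (all placeholder ids).
rows = [
    (321887, 321887), (321887, 1), (321887, 11), (321887, 2),
    (321887, 3), (321887, 4),
    (1, 1), (1, 11), (11, 11), (2, 2), (3, 3), (4, 4),
]
extras = common_ancestors_with_extras(rows, target_ids=[1, 2])
```

In this toy data the only common ancestor is 321887, and the extras are concepts 3 and 4, which is the same over-inclusion problem described above.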
I am familiar with the idea of putting several records in the CONDITION_OCCURRENCE table, as when a single ICD10 code references several conditions at once and you record one row per corresponding standard SNOMED concept. But I do not think that applies here, since the point of these codes is to have one concept for a family of chronic conditions without being too specific.
Thank you all in advance!