See below email text for context as I was originally responding to that…
I have been going down the path of Option 2, using the LOINC codes in the Measurement domain for stage. This is pragmatic for us because basic staging for solid tumors is readily available in the feed we get from oncology provider offices seeking verification to use a non-NCCN compendia protocol for a particular patient. I can automate ingestion of stage from this feed, but TNM specificity requires either manual clinician entry based on the path reports we receive or NLP extraction, which is currently asynchronous for us, so those measures have to be entered later, once the extraction is complete. Thus, we will always have stage group (LOINC 21908-9), and we hope to subsequently extract TNM for as many cases as possible, using LOINC 21905-5, 21906-3 and 21907-1 to record T, N and M, respectively. However, this is yet another option beyond what you described. Is this wrong? Also, in the Measurement domain we are relating these records to the visit occurrence ID associated with the condition occurrence ID from when the patient’s cancer diagnosis was first documented. Shouldn’t that work?
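To make the question concrete, here is a minimal sketch of how I picture the two-step load: stage group at initial ingestion, TNM later when extraction finishes. The `MeasurementRow` class and the helper functions are hypothetical names for illustration only, not part of OMOP or any library, and I leave `measurement_concept_id` at 0 because the standard concept IDs would still need to be looked up in the vocabulary.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional

# LOINC codes named in this post.
LOINC_STAGE_GROUP = "21908-9"
LOINC_TNM = {"T": "21905-5", "N": "21906-3", "M": "21907-1"}


@dataclass
class MeasurementRow:
    """Hypothetical, trimmed-down stand-in for a MEASUREMENT record."""
    person_id: int
    visit_occurrence_id: int
    measurement_date: date
    measurement_source_value: str  # the LOINC code
    value_source_value: str        # e.g. "IIIA" or "T3"
    measurement_concept_id: int = 0  # 0 = not yet mapped to a standard concept


def stage_group_row(person_id: int, visit_occurrence_id: int,
                    when: date, stage: str) -> MeasurementRow:
    # Available at initial load from the provider feed.
    return MeasurementRow(person_id, visit_occurrence_id, when,
                          LOINC_STAGE_GROUP, stage)


def tnm_rows(person_id: int, visit_occurrence_id: int, when: date,
             tnm: Dict[str, Optional[str]]) -> List[MeasurementRow]:
    # Added later, once clinician review or NLP extraction completes.
    # Components that could not be extracted (None) are skipped.
    return [
        MeasurementRow(person_id, visit_occurrence_id, when, LOINC_TNM[k], v)
        for k, v in ((k, tnm.get(k)) for k in ("T", "N", "M"))
        if v
    ]


initial = stage_group_row(1, 10, date(2019, 1, 15), "IIIA")
later = tnm_rows(1, 10, date(2019, 1, 15), {"T": "T3", "N": None, "M": "M0"})
```

Both batches carry the same `visit_occurrence_id`, which is the linkage I am asking about.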
As for ICDO, what you have written makes complete sense. We currently get ICD10 from claims and via our protocol review application, and our clinicians document the specific histology (usually, and unfortunately, as simply a more descriptive diagnosis rather than a particular code). We are in the process of mapping these to ICDO codes, which will enable us to use those for condition occurrence rather than just the ICD10. However, if I already have ICD10-based condition occurrences loaded, would it be appropriate to associate another condition occurrence ID based on ICDO with the same visit occurrence ID to which we are associating a condition occurrence using ICD10? Or would I update the existing condition occurrence IDs from ICD10 to ICDO?
These seem like two different questions but are related: if the best information available on initial load was less specific than desired (e.g. ICD10s or LOINC stage group) and, after analytics and clinical review, we get more specific information, should the condition occurrences and measurements be updated to reflect the more specific information, or do I just add additional measurement IDs and condition occurrence IDs with the greater specificity (e.g. ICDOs or LOINC TNM codes) associated with the same visit occurrence IDs? Thanks, JB
Email text:
The OHDSI Oncology Group is trying to converge on a standard way to represent oncology diagnoses and oncology diagnosis modifiers (like staging, grading, biomarkers etc.) within OMOP. We have created the OMOP Oncology Extension (the oncology extension). In the oncology extension, we have adopted ICDO as the standard way to represent oncology diagnoses. We have pre-coordinated the most common ICDO histology/site combinations and mapped or subsumed them to SNOMED concepts. The extension recommends that these pre-coordinated ICDO histology/site combination concepts land in the CONDITION_OCCURRENCE.condition_concept_id field and the EPISODE.episode_object_concept_id field.
However, we want to be able to further refine these oncology diagnoses with oncology diagnosis modifiers. Unfortunately, unlike with ICDO-based oncology diagnoses, there is no standardized vocabulary of oncology diagnosis modifiers. Further, OMOP currently contains vocabularies with duplicative, overlapping options for representing many oncology diagnosis modifiers. Thus, currently, an OMOP ETL developer is forced to choose whatever “seems” right.
Let’s say an ETL developer wants to record an oncology diagnosis of ICDO histology ‘8140/3 Adenocarcinoma, NOS’ and ICDO site ‘C18.2 Ascending colon’. Currently the oncology extension recommends mapping this ICDO histology/site combination of 8140/3-C18.2 to ‘Adenocarcinoma of ascending colon’ OMOP Concept ID 44502439. But if the ETL developer wants to further modify this oncology diagnosis with, for example, pathological TNM Staging for AJCC Version 6 T=pT3, the ETL developer is faced with a couple of options:
Option 1: Map T=pT3 to SNOMED code 395707006 ‘pT3: Tumor invades through the muscularis propria into the subserosa or into non-peritonealized pericolic or perirectal tissues’ OMOP Concept ID 4193681 in the Condition domain. But then how does the ETL developer relate this entry in CONDITION_OCCURRENCE to the entry in CONDITION_OCCURRENCE for the base oncology diagnosis? FACT_RELATIONSHIP?
Option 2: Map T=pT3 to LOINC code 21899-0 ‘Primary tumor.pathology Cancer’ OMOP Concept Id 3016308 in the Measurement domain and LOINC Answer ID LA3624-9 ‘T3’ OMOP Concept Id 45876313 in the Meas Value domain. Again, how does the ETL developer relate this entry in the MEASUREMENT table to the entry in CONDITION_OCCURRENCE for the base oncology diagnosis? FACT_RELATIONSHIP?
Either of these options could work. But the OMOP community needs to converge on a ‘standard’ representation.
The oncology extension recommends placing oncology diagnosis modifiers within the MEASUREMENT table. These MEASUREMENT entries should point to a parent CONDITION_OCCURRENCE or EPISODE entry by populating the new polymorphic foreign key: MEASUREMENT.modifier_of_field_concept_id and MEASUREMENT.modifier_of_event_id. This would seem to point to using ‘Option 2’. Unfortunately, most EHR/LIMS source systems do not contain discrete oncology diagnosis modifiers. The College of American Pathologists (CAP) publishes the CAP Cancer Protocols, which do have a machine-parseable distribution named CAP eCC. The CAP Cancer Protocols standard is a comprehensive, frequently updated vocabulary of oncology diagnosis modifiers that is tightly bound to actual clinical practice. Adherence to the CAP Cancer Protocols is mandatory for CAP and COC accreditation. Unfortunately, CAP eCC is a proprietary standard that is not mapped to a standardized vocabulary (like SNOMED or LOINC), and most EHR/LIMS do not discretely encode oncology diagnosis modifiers. The University of Nebraska Medical Center is working on normalizing the CAP eCC proprietary format to open-source and standardized vocabularies via the Nebraska Lexicon project. However, at this time the Nebraska Lexicon only covers a small number of anatomic sites.
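As a rough illustration of the polymorphic foreign key, the sketch below uses Python's built-in sqlite3 module with heavily trimmed versions of the two tables (the real OMOP tables have many more columns). The diagnosis and Option 2 concept IDs are the ones quoted earlier in this email. The value used for modifier_of_field_concept_id is a made-up placeholder, since the actual concept identifying the CONDITION_OCCURRENCE.condition_occurrence_id field has to be looked up in the vocabulary.

```python
import sqlite3

# Concept IDs quoted earlier in this email.
ADENOCARCINOMA_ASCENDING_COLON = 44502439  # ICDO 8140/3-C18.2
PATH_T_LOINC_21899_0 = 3016308             # 'Primary tumor.pathology Cancer'
T3_ANSWER_LA3624_9 = 45876313              # LOINC answer 'T3'

# PLACEHOLDER (assumption): stands in for the concept that identifies the
# condition_occurrence.condition_occurrence_id field. Look up the real value.
CONDITION_OCCURRENCE_ID_FIELD = 999001

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id               INTEGER NOT NULL,
    condition_concept_id    INTEGER NOT NULL
);
CREATE TABLE measurement (
    measurement_id               INTEGER PRIMARY KEY,
    person_id                    INTEGER NOT NULL,
    measurement_concept_id       INTEGER NOT NULL,
    value_as_concept_id          INTEGER,
    modifier_of_event_id         INTEGER,  -- id of the parent row
    modifier_of_field_concept_id INTEGER   -- which table/field that id refers to
);
""")

# Base oncology diagnosis.
conn.execute(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
    (1, 100, ADENOCARCINOMA_ASCENDING_COLON),
)

# pT3 modifier pointing back at condition_occurrence_id = 1.
conn.execute(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?, ?)",
    (1, 100, PATH_T_LOINC_21899_0, T3_ANSWER_LA3624_9,
     1, CONDITION_OCCURRENCE_ID_FIELD),
)

# Resolve each modifier to its parent diagnosis.
rows = conn.execute(
    """
    SELECT co.condition_concept_id, m.measurement_concept_id, m.value_as_concept_id
    FROM measurement AS m
    JOIN condition_occurrence AS co
      ON co.condition_occurrence_id = m.modifier_of_event_id
    WHERE m.modifier_of_field_concept_id = ?
    """,
    (CONDITION_OCCURRENCE_ID_FIELD,),
).fetchall()
print(rows)
```

The filter on modifier_of_field_concept_id matters because modifier_of_event_id could equally hold an EPISODE id; the join is only valid for rows that declare CONDITION_OCCURRENCE as their parent.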
In the US, the most widely available source system containing “discrete” oncology diagnosis modifiers is NAACCR-formatted tumor registry data. NAACCR is a data dictionary format for the tracking of oncology diagnoses, oncology diagnosis modifiers and oncology treatments. All US facilities diagnosing and treating cancer patients are mandated to report their data in the NAACCR format to federal and state agencies. Most NAACCR data is manually abstracted from patient charts by certified tumor registrars.
The oncology extension wants to be able to support the ingestion of oncology diagnosis modifiers from NAACCR tumor registry data. To enable this use case, the oncology extension recommends adopting the NAACCR tumor registry vocabulary as the standard OMOP oncology diagnosis modifier vocabulary. Thus, an Option 3 arises that dictates that cancer staging data should be encoded in OMOP in the NAACCR vocabulary. The ingestion of the NAACCR data format is currently under construction by the OMOP vocabulary team.
In the future, the hope is that the NAACCR vocabulary could be transitioned from the standard OMOP oncology diagnosis modifier vocabulary to a source vocabulary. The current vision is that the Nebraska Lexicon will be adopted as the standard oncology diagnosis modifier vocabulary, and a mapping will be needed from the NAACCR vocabulary to the standardized SNOMED/LOINC concepts specified by the Nebraska Lexicon.
We are going to discuss oncology diagnosis modifiers at our next OHDSI Oncology Workgroup meeting. We will review a concrete SQL example mapping cancer staging to NAACCR in OMOP. Either at the meeting or via email, please let us know your thoughts on the current direction we are taking. We definitely want to hear other perspectives.