I am currently building an ETL to translate an oncology database in the UK into the CDM. In our database, we have the cancer diagnosis and then two columns I am struggling to map. These are:
‘Current disease status’ which can be either ‘current evidence of disease present’ or ‘no current evidence of disease’ or missing
‘Sites of current disease’ which can be 1 or more different sites.
I am unsure how to map these columns in the CDM. I am guessing they will be measurements, but I’m not sure what concepts might be appropriate.
Welcome to the family. BTW: there is an Oncology Working Group where you can bring these things up and find a lot of like-minded folks.
What you have as “current disease status” we call “dynamic disease episode”, i.e. the activity of the disease at the time of the record. Cancers are dynamic diseases: they progress until you treat them, and then they go into remission.
I agree, remission nicely covers the ‘no active disease’ patients. For active disease, though, a patient could be in any of the options you presented, and our data will be a mixture. I was hoping there would be a more general option I could map to, as there is no way to tell which of these the cancer is without going into notes, which is not appropriate for this project.
Yeah. We may have to add a concept for that. Why don’t you start with a 2-billionaire (a local concept with concept_id > 2B) for the time being, and we will get that resolved in the next vocab release.
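To make the 2-billionaire suggestion concrete, here is a minimal sketch of how the status column could be translated in the meantime. The function name, the returned field names and the particular local id are all hypothetical, and the remission concept is passed in as a parameter because you would need to look up the standard concept in your own vocabulary tables. The only convention it relies on is that concept_ids above 2,000,000,000 are reserved for local use.

```python
from typing import Optional

# OMOP convention: concept_ids above 2,000,000,000 are reserved for local concepts.
LOCAL_CONCEPT_FLOOR = 2_000_000_000

# Hypothetical local id for "active disease, type unspecified".
# Any unused id above the 2B floor would do.
ACTIVE_DISEASE_LOCAL_ID = LOCAL_CONCEPT_FLOOR + 1


def map_disease_status(source_value: Optional[str],
                       remission_concept_id: int) -> Optional[dict]:
    """Translate a 'Current disease status' value into a concept assignment.

    Returns None for missing values so that no record is written for them.
    """
    if source_value is None or not source_value.strip():
        return None

    normalized = source_value.strip().lower()
    if normalized == "current evidence of disease present":
        concept_id = ACTIVE_DISEASE_LOCAL_ID
    elif normalized == "no current evidence of disease":
        concept_id = remission_concept_id
    else:
        # Fail loudly on values the source documentation does not list.
        raise ValueError(f"Unexpected disease status: {source_value!r}")

    # Keep the original text so the record can be remapped later.
    return {"concept_id": concept_id, "source_value": source_value}
```

Keeping the source value alongside the temporary id should make it straightforward to swap in a standard concept once one exists.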
Ah, yes. What you call “site” is called “topography” in OMOP speak. There are all the sites you need either as SNOMED or ICDO3 (usually more detailed). Map to these.
But that is not enough. You need to combine them with the histology concepts (from CONDITION_OCCURRENCE or MEASUREMENT tables) and create a cancer condition. Want to talk about that?
So I have a cancer condition already mapped using ICD10 with measurements associated such as TNM stage, differentiation, side (for lung data).
Are you saying that sites should be combined into the condition? I assumed they would be added as measurements describing the current status of the condition, i.e. I would map my condition (C34.3) to ‘Liver’, ‘Lung’, ‘Brain’ as measurements, meaning these are the sites.
So, the way we want cancer to work is that the combos of topography and histology constitute the condition, while TNM stage, differentiation, metastases, lymph nodes etc. are attributes of the condition and live as a Measurement. They can be linked to the condition, but we are happy when that link is not explicitly given.
The C34.3 is mapped to what is called a “shallow” cancer condition concept, because it has the topography (site), but the histology is trivial (“malignant neoplasm”). If you have histology information in your source data you can “deepen” the condition by creating the combination after the fact. So, you get the site “lower lobe of lung” from the C34.3, and maybe you have a concept for “adenocarcinoma” from a path lab or physician notes. These could then be combined into the “deep” condition “adenocarcinoma of the lower lobe of the lung” and written as an additional Condition record.
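As a rough illustration of that deepening step, the sketch below combines a site and a histology into a deep condition and adds it next to the shallow one. Everything here is an assumption for the sake of the example: the lookup table, the function names, and the use of plain labels instead of concept_ids. A real ETL would resolve the combination against the vocabulary tables rather than a hard-coded dictionary.

```python
from typing import Optional

# Hypothetical lookup of (topography, histology) -> deep condition label.
# In practice this would come from the vocabulary, not a literal dict.
DEEP_CONDITIONS = {
    ("lower lobe of lung", "adenocarcinoma"):
        "adenocarcinoma of the lower lobe of the lung",
}

# The trivial histology carried by a shallow condition.
TRIVIAL_HISTOLOGY = "malignant neoplasm"


def deepen_condition(topography: str, histology: Optional[str]) -> Optional[str]:
    """Return the deep condition for a site plus histology, or None if none applies."""
    if histology is None or histology.strip().lower() == TRIVIAL_HISTOLOGY:
        return None  # nothing more specific than the shallow condition
    key = (topography.strip().lower(), histology.strip().lower())
    return DEEP_CONDITIONS.get(key)


def condition_records(topography: str, histology: Optional[str]) -> list:
    """Keep the shallow record and append a deep one when histology allows."""
    records = [f"{TRIVIAL_HISTOLOGY} of {topography}"]
    deep = deepen_condition(topography, histology)
    if deep is not None:
        records.append(deep)
    return records
```

Called with the site from C34.3 and a histology of "adenocarcinoma", this yields both the shallow and the deep condition. With no histology available, only the shallow one is kept.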