How do you represent mismatch repair status / microsatellite instability in OMOP?

awrosen · November 8, 2023, 7:39pm

Dear OHDSI community,

As part of the HowOften study, I’ve tried creating some cohorts defined by having colorectal cancer and a mismatch repair (MMR) or microsatellite instability (MSI) status. However, this produced much lower counts than we were expecting. After a bit of investigation, we didn’t find any smoking gun in the form of concepts we forgot to include. We are currently using:

35919449 MSI unstable low (MSI-L)
35917471 Microsatellite Instability (MSI)
35917835 Microsatellite Instability (MSI)
35918368 Microsatellite Instability (MSI)
21493972 Microsatellite instability [Interpretation] in Cancer specimen Qualitative
35977041 Microsatellite Instable-Low (MSI-L) measurement
42537577 Microsatellite instability-high colorectal cancer
3173676 Intact mismatch protein repair function identified in malignant tumor
21493968 DNA mismatch repair protein Mlh1 [Presence] in Cancer specimen by Immune stain
21493971 Mismatch repair endonuclease PMS2 [Presence] in Cancer specimen by Immune stain
21493969 DNA mismatch repair protein Msh2 [Presence] in Cancer specimen by Immune stain
21493970 DNA mismatch repair protein Msh6 [Presence] in Cancer specimen by Immune stain
35977040 Microsatellite Instable-High (MSI-H) measurement

Does anyone have experience with how it might be coded differently? In Denmark we do the test routinely on patients diagnosed with colorectal cancer. Based on guidelines from NICE it’s also recommended in the UK, and by the College of American Pathologists in the US. That probably means the difference in counts is not due to differences in care but, hopefully, to how we’re trying to define the concept.

Kind regards,
Andreas

Christian_Reich · November 10, 2023, 10:59pm

@awrosen:

These concepts are actually derived from the College of American Pathologists Electronic Checklists. What is it you are not finding? What are you counting?

jmethot · November 10, 2023, 11:28pm

At our institution we do not have those biomarkers in structured form, we only have them in pathology report text. I just checked our brand new CDM and it contains zero of those concepts in MEASUREMENT. We expect we will use methods from the NLP WG as well as internal efforts to recover those data elements from text but we’re not there yet.
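A check like the one described above (whether any of the listed concepts occur in a CDM at all) could be sketched roughly as follows. This is only an illustration: it runs against an in-memory SQLite database with made-up toy rows, the helper names `count_concept_usage` and `build_toy_cdm` are hypothetical, and the tables are reduced to the two columns the query needs. The concept IDs are the ones from the first post; looking beyond MEASUREMENT into OBSERVATION and CONDITION_OCCURRENCE is an assumption about where such records might land.

```python
import sqlite3

# Concept IDs listed in the first post of this thread.
MSI_MMR_CONCEPT_IDS = [
    35919449, 35917471, 35917835, 35918368, 21493972, 35977041, 42537577,
    3173676, 21493968, 21493971, 21493969, 21493970, 35977040,
]

# OMOP CDM tables to inspect and their concept columns.
DOMAIN_TABLES = {
    "measurement": "measurement_concept_id",
    "observation": "observation_concept_id",
    "condition_occurrence": "condition_concept_id",
}


def count_concept_usage(conn, concept_ids):
    """Return {table: (n_records, n_persons)} for the given concept IDs."""
    placeholders = ",".join("?" for _ in concept_ids)
    counts = {}
    for table, column in DOMAIN_TABLES.items():
        sql = (
            f"SELECT COUNT(*), COUNT(DISTINCT person_id) "
            f"FROM {table} WHERE {column} IN ({placeholders})"
        )
        n_records, n_persons = conn.execute(sql, concept_ids).fetchone()
        counts[table] = (n_records, n_persons)
    return counts


def build_toy_cdm():
    """Hypothetical stand-in for a real CDM: toy rows, two columns per table."""
    conn = sqlite3.connect(":memory:")
    for table, column in DOMAIN_TABLES.items():
        conn.execute(f"CREATE TABLE {table} (person_id INTEGER, {column} INTEGER)")
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?)",
        # Person 3 has an unmapped record (concept_id 0), which is not counted.
        [(1, 21493968), (1, 21493971), (2, 35977040), (3, 0)],
    )
    conn.execute("INSERT INTO condition_occurrence VALUES (?, ?)", (4, 42537577))
    return conn


usage = count_concept_usage(build_toy_cdm(), MSI_MMR_CONCEPT_IDS)
for table, (n_records, n_persons) in usage.items():
    print(f"{table}: {n_records} records, {n_persons} persons")
```

Against a real CDM the same query would need the schema prefix and SQL dialect of that database. Zero rows in every table would suggest the data never reached the CDM in structured form, rather than a gap in the concept set.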

jmethot · November 11, 2023, 6:17pm

I realize we do have this data for internally sequenced tumors, but we haven’t loaded tumor genomics into our CDM yet. We are waiting for the KOIOS tool to be ready.

awrosen · November 12, 2023, 7:11pm

Dear @Christian_Reich,

Thank you. We are counting the number of patients included in some cohort definitions (e.g. based on the phenotype library), where we got almost no counts when including definitions of MMR status.

What we’re hoping to find out is whether anyone has experience identifying these patients in an alternative way that we could use in our cohort definition, as we would a priori expect most patients with colorectal cancer to have these tests performed.

Dear @jmethot,

Thank you - we’ve also considered whether the low counts are caused by unavailability in the source data, where some pathology reports might not end up in the OMOP instances.

KR

Andreas
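For readers who want to reproduce the kind of count discussed in this post, here is a minimal sketch of measuring how many members of a colorectal cancer cohort have any MMR/MSI record. It is an assumption-laden illustration, not the HowOften implementation: the data are toy rows in an in-memory SQLite database, the function name `mmr_coverage` and the cohort definition ID 1 are hypothetical, and only MEASUREMENT is checked.

```python
import sqlite3

# Concept IDs listed in the first post of this thread.
MSI_MMR_CONCEPT_IDS = [
    35919449, 35917471, 35917835, 35918368, 21493972, 35977041, 42537577,
    3173676, 21493968, 21493971, 21493969, 21493970, 35977040,
]


def mmr_coverage(conn, cohort_definition_id, concept_ids):
    """Return (cohort size, members with at least one MMR/MSI measurement)."""
    placeholders = ",".join("?" for _ in concept_ids)
    sql = f"""
        SELECT COUNT(DISTINCT c.subject_id),
               COUNT(DISTINCT m.person_id)
        FROM cohort c
        LEFT JOIN measurement m
          ON m.person_id = c.subject_id
         AND m.measurement_concept_id IN ({placeholders})
        WHERE c.cohort_definition_id = ?
    """
    # Parameters follow placeholder order: concept IDs first, then the cohort ID.
    return conn.execute(sql, [*concept_ids, cohort_definition_id]).fetchone()


# Toy data only; a real OMOP cohort table also has start and end dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cohort (cohort_definition_id INTEGER, subject_id INTEGER)")
conn.execute("CREATE TABLE measurement (person_id INTEGER, measurement_concept_id INTEGER)")
conn.executemany(
    "INSERT INTO cohort VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (1, 4), (2, 5)],  # cohort 1 = hypothetical CRC cohort
)
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?)",
    [(1, 21493968), (1, 21493969), (3, 35977040), (5, 35977040)],
)

n_cohort, n_tested = mmr_coverage(conn, 1, MSI_MMR_CONCEPT_IDS)
print(f"{n_tested} of {n_cohort} cohort members have an MMR/MSI measurement")
```

A very low ratio across many databases would point to missing source data more than to a flawed concept set.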

Christian_Reich · November 14, 2023, 2:54pm

Wait. Isn’t the problem that only very few databases have this type of genetic information?

awrosen · November 15, 2023, 12:43pm

This could indeed be the situation. At least we have not been able to find any concepts to identify the patients. We imagined this data would be routinely available, since the test is common from a clinical perspective, which made us hope that it was our definition that was the problem.

Christian_Reich · November 15, 2023, 1:56pm

Hope dies last, @awrosen. Come to the Onco WG. We are actively recruiting collaborators with that type of data. Tell them you love them.