OMOPped data sources for oncology research in EHDEN

(Seamus) #1

Dear All,

I am involved in the EHDEN project, representing the National Institute for Health and Care Excellence (NICE) in England.

We are developing use cases to examine the usefulness of the OMOP-CDM in health technology assessment (i.e. relative effectiveness and cost-effectiveness analysis).

One of our use cases is in advanced prostate cancer. Real-world data is particularly valuable in oncology research for extrapolating survival from the end of the trial to the patient’s death, and from surrogate endpoints (e.g. progression-free survival) to mortality.

We are trying to identify appropriate data sources that have been OMOPped for this work.

We require, at a minimum, individual-level data on cancer incidence, death, and year of birth. We would ideally also have information on disease progression, treatments received, and other patient characteristics. The data set must have long-term follow-up on patients.

The external data can come from a variety of sources including cohort studies, disease registries, or administrative databases.

If you are aware of an appropriate data source (or sources) please let me know.


(Talita Duarte Salles) #2

Dear Seamus,

I am an epidemiologist working on several cancer projects, mainly using a primary care database in Catalonia, Spain, called SIDIAP (https://www.sidiap.org/index.php/en). SIDIAP includes anonymized individual-level data since 2006 for approximately 7 million people. We are still in the process of mapping SIDIAP to the OMOP-CDM, but I think our data could be appropriate for your use case.

Please let me know if you need more information. I’d be happy to get involved!


(Seng Chan You) #3

Dear @SKent

I am a student at Ajou University, Korea. We’re developing an R package to investigate the burden of disease, especially cancer (https://github.com/abmi/argos).
Recently, we converted a 10-year national claims database covering all cancer patients in the country to the OMOP-CDM. It includes year of birth, death, and detailed treatment information.

I think this could be a useful database for this use case.


(Jeremy Warner) #4

We would love for you to try out the HemOnc vocabulary for representing treatments received. Happy to chat about this more; we have a fair number of prostate cancer-specific regimens that are further labeled by context, e.g., NM-CRPC, NM-CSPC, CSPC, and CRPC.

(Michael Gurley) #5

We have an 18-year-old prostate cancer database at Northwestern for a Prostate SPORE. It covers 4,500 patients and the data points you describe, with a mix of chart-abstracted and EHR data. The database has not been converted to OMOP, but we have OMOPed other cohorts. I will check if there is interest at NU.

(Seamus) #6

Thank you all for your responses! I will discuss with the team and get back to each of you individually when our use case is further developed.

(Christian Reich) #7

@SKent, @seamuskent:

(Looks like you are in the system twice).

BTW: If you are working with EHDEN you may want to look at PIONEER. It’s a sister IMI program to EHDEN, but it focusses specifically on prostate cancer. Let me know if you want to engage.

(Benjamin Skov Kaas Hansen) #8

Hi Seamus,

In my lab, we have longitudinal EMR data for about 2 million Danish patients with the variables you mention (plus many more). We’re still setting up our OMOP infrastructure, however, but we’d love to be involved.


(Seamus) #9

Thanks Christian - I will remove the one related to my non-work email. It would be great to engage with PIONEER.

(Andrew Williams) #10

We at Tufts MC have the relevant data to the extent that our Tumor Registry, state Death data, and EHR capture long-term outcomes. Can you be more specific about what you mean by long-term? If these kinds of sources fit your needs, we would be interested in exploring collaboration with you on this work.