Hi OHDSI community,
Daniella Meeker and Michael Matheny are doing a demo at AMIA of the pSCANNER PopMedNet and additional analytic software. We are looking for de-identified data, and SEER data would work. I know the CMS-ETL WG had done some data transformation of SEER to OMOP, but I don’t know where it ended, or whether there is more than what is available on the forum. I did see some ETL specs v1.0. We could use a transformed dataset, ETL code, or ETL specs, whatever one is willing to share. We’d then be transforming to PCORNet CDM for the demo. Other suggestions of de-identified data for the demo are welcome (it will be put on a laptop, so anonymity is key).
The Medicare SynPUF data is good for demo purposes. I don’t think the SEER data is in a place where it could be used. The data use agreement for the raw SEER data, along with its unique vocabulary and structure, probably makes it a challenge to set up for a demo.
Link for SynPUF data: http://www.ltscomputingllc.com/downloads/
Thanks @DTorok and @Mark_Danese! Given Mark’s warnings, I’m now wondering about the CPRD data, but I don’t know whether it is freely and readily available.
@Christophe_Lambert has updated the ETL and the data. This post has the most current version, as well as links to the code. Go to the very bottom: Test CDM v5 dataset
You may want to consider accessing HCUP from AHRQ. It’s a hospital dataset that is cheap, and the ETL is posted here:
If you are looking for hospital data, the MIMIC III dataset (https://mimic.physionet.org/about/releasenotes/) is a good choice. It doesn’t have an ETL into the OHDSI schema (at least as far as I know!) … but it’s a superb dataset (and we as a community will benefit from someone converting it into the OHDSI schema).
Thanks @patrick_ryan! Is there an ETL for v4? I couldn’t find it.
(and we as a community will benefit from someone converting it into the OHDSI schema).
See our project here: (also come see our poster about MIMIC at the symposium)
The MIMIC team made us remove the CSV files, so you have to download the data from the official MIMIC source and use our script.
We are working on the Achilles JSON files and hope to put those on the public OHDSI Achilles instance.