I don’t know whether either of the suggestions below is a viable alternative to updating OSM2, but I am pursuing both:
-
At last week’s AMIA, I met Jason Walonoski from MITRE, one of the core developers of the Synthea patient-level simulator (https://github.com/synthetichealth/synthea). Synthea is a Markov discrete event simulation engine that uses templates for demographics and disease state machines to simulate birth-to-death patient histories. At this time, about a dozen diseases have been modeled using demographics for the State of Massachusetts. The current engine outputs CCDA, CSV, FHIR and HTML. I hope to find a summer student to create OMOP CDM V5.1-compliant CSV files. I’m also trying to convince folks in Epidemiology to have students create new disease models as a class assignment, which would expand the range of diseases that can be simulated.
-
MIT published a machine learning-based approach to creating a simulated version of a real database. The simulated database retains the same distributions as the original. See http://news.mit.edu/2017/artificial-data-give-same-results-as-real-data-0303 for a summary and http://dai.lids.mit.edu/SDV.pdf for methodological details. I have a biostats postdoc looking at implementing this approach against our EHR database.
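
For anyone unfamiliar with the state-machine idea behind the first suggestion, here is a minimal sketch of a Markov-style patient history in Python. It is not Synthea's code or module format: the state names, the transition probabilities, and the `simulate_patient` helper are all made up for illustration, and a real disease module would also condition on demographics.

```python
import random

# Hypothetical toy disease model (NOT a real Synthea module):
# each state maps to a list of (next_state, probability) pairs.
TRANSITIONS = {
    "healthy": [("healthy", 0.90), ("sick", 0.08), ("dead", 0.02)],
    "sick": [("healthy", 0.50), ("sick", 0.40), ("dead", 0.10)],
}


def simulate_patient(rng, max_years=100):
    """Return a birth-to-death history as a list of (age, state) pairs.

    One transition is drawn per simulated year; the next state depends
    only on the current state (the Markov property).
    """
    state = "healthy"
    history = [(0, state)]
    for age in range(1, max_years + 1):
        next_states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(next_states, weights=weights)[0]
        history.append((age, state))
        if state == "dead":
            break  # "dead" is terminal, so the history ends here
    return history


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run is reproducible
    for age, state in simulate_patient(rng):
        print(age, state)
```

Each simulated history is just a list of yearly states, so turning it into rows for a CSV export would be a separate mapping step that this sketch does not attempt.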