
Adding a new cohort from Synthea generated csv files to an existing CDM

We have a CDM with target and outcome cohorts from a training dataset. How can we add a new cohort from another set of csv files to this CDM so that we can do prediction on the new cohort?

Your question is confusing. Think of the CDM data as stored in a database; cohorts are sets that you define on top of that data.

So to add new patients, you modify the CDM database itself, and then (re)build the cohorts on top of it.
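To make the distinction concrete, here is a minimal sketch in Python with an in-memory SQLite database. The schema is a heavily simplified stand-in for the CDM (only a few columns of `person`, `condition_occurrence` and `cohort`), and `build_cohort`, `cohort_members` and the concept ID 111 are hypothetical names and values chosen for illustration, not part of any OHDSI package. It only illustrates the idea that a cohort is derived from the data, so adding patients means inserting rows and then regenerating the cohort.

```python
import sqlite3

# Simplified stand-in for a CDM: real OMOP tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
CREATE TABLE condition_occurrence (
    person_id INTEGER, condition_concept_id INTEGER, condition_start_date TEXT);
CREATE TABLE cohort (
    cohort_definition_id INTEGER, subject_id INTEGER,
    cohort_start_date TEXT, cohort_end_date TEXT);
""")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, 1950), (2, 1980)])
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
    [(1, 111, "2020-01-05"), (2, 222, "2020-02-01")],
)


def build_cohort(conn, cohort_definition_id, concept_id):
    """(Re)generate a cohort: everyone with the given condition concept.

    Hypothetical helper; the cohort definition here is deliberately trivial.
    """
    conn.execute(
        "DELETE FROM cohort WHERE cohort_definition_id = ?",
        (cohort_definition_id,),
    )
    conn.execute(
        """
        INSERT INTO cohort
        SELECT ?, person_id, MIN(condition_start_date), MIN(condition_start_date)
        FROM condition_occurrence
        WHERE condition_concept_id = ?
        GROUP BY person_id
        """,
        (cohort_definition_id, concept_id),
    )


def cohort_members(conn, cohort_definition_id):
    """Return the sorted subject IDs currently in the cohort."""
    rows = conn.execute(
        "SELECT subject_id FROM cohort WHERE cohort_definition_id = ? "
        "ORDER BY subject_id",
        (cohort_definition_id,),
    )
    return [r[0] for r in rows]


build_cohort(conn, 1, 111)
members_before = cohort_members(conn, 1)

# Adding a new patient means modifying the CDM tables ...
conn.execute("INSERT INTO person VALUES (3, 1975)")
conn.execute("INSERT INTO condition_occurrence VALUES (3, 111, '2021-03-10')")

# ... and the cohort only picks the patient up once it is regenerated.
build_cohort(conn, 1, 111)
members_after = cohort_members(conn, 1)
```

In this sketch the cohort table is never edited by hand; it is always the result of running a definition against whatever is in the data tables at that moment.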

You must clarify your question.

I built a CDM from CSV files generated by Synthea (https://github.com/synthetichealth/synthea/wiki/Basic-Setup-and-Running), following the commands on the OHDSI ETL-Synthea GitHub page (https://github.com/OHDSI/ETL-Synthea). I defined a target and an outcome cohort on this CDM, and used the PatientLevelPrediction package to obtain a PlpModel.

I aim to use this PlpModel to make predictions on another dataset. For this I tried the following:

I built a second CDM from a different set of Synthea CSV files, on which I wanted to predict outcomes (with the predictPlp function) using the previously built PlpModel. I created one target cohort, defined covariates, and obtained plpData, but got stuck at the createStudyPopulation function: it does not run without an outcome cohort ID. Since I am only trying to make predictions, I have a target cohort but no outcome cohort.

I got stuck with the previous attempt and need help with this next one:

I would like to add another set of Synthea CSV files to the preexisting CDM, build a cohort from that data, and predict outcomes in this cohort using the previously built model (with the predictPlp function). How can I add this new data from CSV files to an already existing CDM while keeping a clear distinction between the old data and the new data?
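One possible way to keep old and new patients apart is to shift the IDs of the new batch past the existing ones before loading, and to remember which IDs belong to the new batch. The sketch below is in Python and rests on several assumptions: the new rows are already in CDM shape (here just `person_id` and `year_of_birth`), the existing IDs are known, and `shift_ids` is a hypothetical helper, not part of ETL-Synthea or any OHDSI package. It is only an illustration of the ID-offset idea, not a recommended or complete loading procedure.

```python
import csv
import io


def shift_ids(rows, id_columns, offset):
    """Return copies of the rows with each listed ID column increased by offset.

    Hypothetical helper: other columns are left untouched.
    """
    shifted = []
    for row in rows:
        new_row = dict(row)
        for col in id_columns:
            new_row[col] = int(row[col]) + offset
        shifted.append(new_row)
    return shifted


# Assumed state of the existing CDM: person IDs already in use.
existing_person_ids = [1, 2, 3]

# Stand-in for a person file from the second batch (made-up rows, not real
# Synthea output); its IDs collide with the existing ones.
new_person_csv = io.StringIO("person_id,year_of_birth\n1,1990\n2,1964\n")
new_persons = list(csv.DictReader(new_person_csv))

# Shift the new IDs past the largest existing ID so they cannot collide.
offset = max(existing_person_ids)
shifted_persons = shift_ids(new_persons, ["person_id"], offset)

# Remember which IDs came from the new batch; a target cohort for prediction
# could then be restricted to these subjects.
new_batch_ids = {row["person_id"] for row in shifted_persons}
```

The same offset would have to be applied to `person_id` in every table that references it, and other keys (visit IDs, for example) would need their own offsets, so this covers only the first step.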
