Export final list of patients in propensity-matched cohorts

We want to use OHDSI tools (e.g. PLE) to generate propensity-matched cohorts, and then export both the propensity model and the final list of patients in the matched target and comparator cohorts so that we can run additional analyses on them.

One specific example is a cost analysis of the target vs. comparator cohorts after applying the propensity matching model. We don’t have cost data loaded into our OMOP model, but we can attach it separately, since we have a lookup table back to our original patient information.

I see how to retrieve the propensity model itself.

How do I export the final list of patients that remain in the propensity-matched cohorts?

You’ll need to work with the R data objects that CohortMethod generated on your local drive. There should be a file called ‘outcomeModelReference.rds’ that details which files belong to which analysis.
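As a rough sketch, the reference file can be read with readRDS() and filtered to find the files for one comparison. The toy data frame below stands in for the real file so the snippet runs on its own. Its column names (analysisId, targetId, comparatorId, outcomeId, strataFile) and the file names are assumptions for illustration and may differ between CohortMethod versions, so check names() on your own reference object first.

```r
# Stand-in for reading the real file, e.g. (location assumed):
#   ref <- readRDS("./cmOutput/outcomeModelReference.rds")
# Column names and file names below are assumptions for illustration only.
ref <- data.frame(
  analysisId   = c(1, 1),
  targetId     = c(452, 452),
  comparatorId = c(453, 453),
  outcomeId    = c(454, 455),
  strataFile   = c("strata_a.rds", "strata_b.rds"),
  stringsAsFactors = FALSE
)

# Pick the row for one target-comparator-outcome combination
row <- ref[ref$targetId == 452 &
           ref$comparatorId == 453 &
           ref$outcomeId == 454, ]

# File holding the matched population for that combination
strataFile <- row$strataFile
print(strataFile)
```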

Specifically, you’ll need the ‘strataPop…rds’ files, which contain the matched populations. If you want to link those back to the original person_ids, you’ll need to join them to the cohorts table in the CohortMethodData object on the personSeqId field. The reason for this extra step is that person_ids in the database are 64-bit integers, which are not well supported in R. We therefore use auto-generated 32-bit personSeqIds throughout CohortMethod, and preserve the original person_id as a string in the CohortMethodData object.
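The integer limitation is easy to see in base R, which only has 32-bit integers and double-precision numbers. A minimal illustration, using two made-up ID values that differ by one:

```r
# Two made-up 64-bit IDs, stored as strings
idA <- "9007199254740993"
idB <- "9007199254740992"

# R's native integer type is 32-bit
max32 <- .Machine$integer.max

# Values beyond that range cannot be stored as integers (result is NA)
tooBig <- suppressWarnings(as.integer(idA))

# As doubles, the two distinct IDs collapse to the same number,
# because doubles are exact for integers only up to 2^53
sameAsDouble <- as.numeric(idA) == as.numeric(idB)

# As strings, they stay distinct
sameAsString <- idA == idB

print(c(max32 = max32, tooBig = tooBig))
print(c(sameAsDouble = sameAsDouble, sameAsString = sameAsString))
```

This is why it is safest to keep personId as a character column all the way through to the export.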

@schuemie, thanks. It looks like I can get what I need essentially like this:

library(dplyr)

# Load the CohortMethodData object (contains the mapping of personSeqId to person_id)
cmData <- CohortMethod::loadCohortMethodData('./cmOutput/CmData_l1_t452_c453.zip')
# Get propensity matching strata
strataPop <- readRDS("./cmOutput/StratPop_l1_s1_p1_t452_c453_s1_o454.rds")

# Filter to just needed columns
strata <- strataPop %>% dplyr::select(personSeqId, treatment, propensityScore, stratumId)
mapSeqIdToPersonId <- cmData$cohorts %>% dplyr::select(personSeqId, personId)

strataExport <- strata %>%
  inner_join(data.frame(mapSeqIdToPersonId), by = "personSeqId")
  
# Write that output to CSV or database
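For the last step, here is a minimal base-R sketch of the CSV route. It uses made-up toy data in place of the real strataExport so that it runs on its own, and the output file name is hypothetical. personId is kept as a character column so that large IDs are not rounded on the way out or back in.

```r
# Toy stand-in for the strataExport data frame built above (values are made up)
strataExport <- data.frame(
  personSeqId     = c(1L, 2L),
  treatment       = c(1L, 0L),
  propensityScore = c(0.61, 0.58),
  stratumId       = c(1L, 1L),
  personId        = c("9007199254740993", "9007199254740995"),
  stringsAsFactors = FALSE
)

# Write to CSV; the output location and file name are hypothetical
outFile <- file.path(tempdir(), "strataExport.csv")
write.csv(strataExport, outFile, row.names = FALSE)

# Read back, forcing personId to stay a string so 64-bit IDs are not rounded
check <- read.csv(outFile, colClasses = c(personId = "character"))
print(identical(check$personId, strataExport$personId))
```

Whatever tool reads the file afterwards also needs to treat personId as text (or as a true 64-bit integer) for the lookup back to the original patient records to work.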

Do others need this capability? If so, it should be easy to add to CohortMethod, with an option to specify which fields to output and an option to write directly back to the database if desired.
