Debugging no results using runCmAnalyses

jenwilson521 · July 7, 2020, 7:05pm

Hello OHDSI community,

I recently adapted an observational study pipeline using the latest version of CohortMethod, and specifically the runCmAnalyses function. The code took 18.7 days to execute and returned the expected tibble, but the tibble references file paths that don’t exist. The whole command ran without errors, apart from a message noting that the “schema” parameter is being deprecated. However, I used the syntax from the tutorial for Running Multiple Analyses.

I’d like to figure out why the code is executing so slowly - is that a reasonable time to wait?

Second, I’m not sure why the function didn’t write any results to disk. I’m running two analyses, one without matching and one with matching. The method seems to have generated “Analysis_1” and “Analysis_2” subfolders in my outputFolder, along with the outcomeModelReference.rds file in the outputFolder itself.

The only slight change in my analysis is that instead of using the drug_era table to define cohorts, I used a table that I defined because I needed to further specify drug exposure ordering when defining cohorts. Could this affect how the method tries to do propensity matching?

I have included the code I used to call the method below, and I’d be happy to share more if that would help figure out whether I am misusing the method. Thanks for any insights you might have!

getDbCmDataArgs <- createGetDbCohortMethodDataArgs(
  washoutPeriod = 0,
  restrictToCommonPeriod = FALSE,
  firstExposureOnly = TRUE,
  removeDuplicateSubjects = "remove all",
  studyStartDate = "20090101",
  studyEndDate = "20191231",
  excludeDrugsFromCovariates = FALSE,
  covariateSettings = covarSettings)

createStudyPopArgs <- createCreateStudyPopulationArgs(
  removeSubjectsWithPriorOutcome = TRUE,
  minDaysAtRisk = 1,
  riskWindowStart = 0,
  startAnchor = "cohort start",
  endAnchor = "cohort end")

fitOutcomeModelArgs1 <- createFitOutcomeModelArgs(modelType = "cox")

cmAnalysis1 <- createCmAnalysis(
  analysisId = 1,
  description = "No matching, simple outcome model",
  getDbCohortMethodDataArgs = getDbCmDataArgs,
  createStudyPopArgs = createStudyPopArgs,
  fitOutcomeModel = TRUE,
  fitOutcomeModelArgs = fitOutcomeModelArgs1)

createPsArgs <- createCreatePsArgs() # Use default settings only
matchOnPsArgs <- createMatchOnPsArgs(maxRatio = 100)

fitOutcomeModelArgs2 <- createFitOutcomeModelArgs(modelType = "cox", stratified = TRUE)

cmAnalysis2 <- createCmAnalysis(
  analysisId = 2,
  description = "Matching",
  getDbCohortMethodDataArgs = getDbCmDataArgs,
  createStudyPopArgs = createStudyPopArgs,
  createPs = TRUE,
  createPsArgs = createPsArgs,
  matchOnPs = TRUE,
  matchOnPsArgs = matchOnPsArgs,
  fitOutcomeModel = TRUE,
  fitOutcomeModelArgs = fitOutcomeModelArgs1)

cmAnalysisList <- list(cmAnalysis1, cmAnalysis2)

result <- runCmAnalyses(
  connectionDetails = connectionDetails,
  cdmDatabaseSchema = cdmDatabaseSchema,
  exposureDatabaseSchema = resultsDatabaseSchema,
  exposureTable = target_cohort_table,
  outcomeDatabaseSchema = cdmDatabaseSchema,
  outcomeTable = "condition_era",
  cdmVersion = cdmVersion,
  outputFolder = outputFolder,
  cmAnalysisList = cmAnalysisList,
  targetComparatorOutcomesList = targetComparatorOutcomesList,
  getDbCohortMethodDataThreads = 1,
  createPsThreads = 1,
  psCvThreads = 10,
  createStudyPopThreads = 4,
  trimMatchStratifyThreads = 10,
  fitOutcomeModelThreads = 4,
  outcomeCvThreads = 10)

krfeeney · July 9, 2020, 4:01pm

Hi @jenwilson521! Well, you certainly are in the weeds of our methods library.

First things first: what versions of the packages did you pull down? Can you share a snapshot of which packages are loaded in your R environment?
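A minimal sketch of one way to collect that snapshot, using only base R. The package list below is an assumption about which packages are relevant here; adjust it to your setup.

```r
# Packages of interest (an assumed list; edit as needed)
pkgs <- c("CohortMethod", "FeatureExtraction", "Andromeda",
          "Cyclops", "DatabaseConnector", "SqlRender")

# packageVersion() raises an error for packages that are not installed,
# so report NA for those instead of stopping
versions <- vapply(
  pkgs,
  function(pkg) {
    tryCatch(as.character(packageVersion(pkg)),
             error = function(e) NA_character_)
  },
  character(1)
)
print(versions)

# Full snapshot of the R version and attached/loaded packages
print(sessionInfo())
```

Pasting the printed output into a reply is usually enough to spot a version mismatch.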

One thing that’s not well documented on GitHub: the Methods group is migrating away from ffbase to Andromeda for how it stores and retrieves results. @msuchard and @schuemie know far more about this than I do, but my general understanding is that there are a few places where we are still tweaking the CohortMethod library to adapt to this change. I’m not yet versed in the details, but perhaps they can help impart the tribal knowledge. In my experience, this transition can definitely result in weird, incomplete result sets being written to your CM data objects. I’m not sure if this is what’s happening for you, but it’s good to double-check that you’ve got all the right versions of the libraries so that’s not causing you additional issues.

Any chance you have a GitHub where I could look over your study package?

schuemie · July 10, 2020, 3:53am

The program wouldn’t complete unless the files were created, so they should be in the outputFolder you specified. Note that all paths in the tibble are relative to the outputFolder.
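As a rough sketch of how to verify this, the relative paths can be joined to the outputFolder and tested with file.exists(). Everything below is a toy stand-in so the snippet runs on its own: the temporary folder, the data frame, the column name outcomeModelFile, and the file names are all assumptions for illustration. With real output, use the tibble returned by runCmAnalyses (check names(result) for the actual column) and your own outputFolder.

```r
# Toy output folder so the sketch runs anywhere (hypothetical location)
outputFolder <- file.path(tempdir(), "cmOutputDemo")
dir.create(file.path(outputFolder, "Analysis_1"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(outputFolder, "Analysis_2"),
           recursive = TRUE, showWarnings = FALSE)

# Toy stand-in for the reference tibble; the column name
# outcomeModelFile is an assumption, not confirmed by this thread
reference <- data.frame(
  analysisId = c(1, 2),
  outcomeModelFile = c("Analysis_1/om_demo.rds",
                       "Analysis_2/om_demo.rds"),
  stringsAsFactors = FALSE
)

# Write only the first file, so the second mimics a missing result
saveRDS(list(), file.path(outputFolder, reference$outcomeModelFile[1]))

# Resolve each relative path against outputFolder and test for existence
reference$fullPath <- file.path(outputFolder, reference$outcomeModelFile)
reference$exists <- file.exists(reference$fullPath)
print(reference[, c("analysisId", "outcomeModelFile", "exists")])

# List everything that was actually written under outputFolder
print(list.files(outputFolder, recursive = TRUE))
```

Rows where exists is FALSE point to files that the reference table expects but that are not on disk.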

jenwilson521 · July 13, 2020, 9:16pm

Hi @krfeeney and @schuemie,

Thank you both for your comments. I had been following the transition to Andromeda, and was excited to try it out!

As for the file paths, I understand the outputs are relative to the outputFolder. For instance, the tibble lists the path Analysis_2/om_t5_c6_o29735.rds but the “Analysis_2” folder is empty.

With regard to package versions, here’s what’s in my R session:
other attached packages:
[1] SqlRender_1.6.6 CohortMethod_4.0.0 FeatureExtraction_3.0.0
[4] Andromeda_0.3.1 dplyr_1.0.0 Cyclops_3.0.0
[7] DatabaseConnector_3.0.0

loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 magrittr_1.5 splines_3.6.1 bit_1.1-15.2
[5] tidyselect_1.1.0 lattice_0.20-41 R6_2.4.1 rlang_0.4.6
[9] blob_1.2.1 grid_3.6.1 DBI_1.1.0 ellipsis_0.3.1
[13] digest_0.6.25 survival_3.2-3 bit64_0.9-7 tibble_3.0.1
[17] lifecycle_0.2.0 crayon_1.3.4 Matrix_1.2-18 rJava_0.9-12
[21] purrr_0.3.4 vctrs_0.3.1 memoise_1.1.0 glue_1.4.1
[25] RSQLite_2.2.0 compiler_3.6.1 pillar_1.4.4 generics_0.0.2
[29] pkgconfig_2.0.3

I’m currently setting the code up to run again so I can save the exact output from the analysis. I’m sorry I didn’t save that sooner when I realized it had completed without fully saving the results.

Thanks again for your help,

krfeeney · July 17, 2020, 9:49pm

Boom! That’s awesome. You’ll have to give us feedback once this runs to completion.

Are both analyses empty? Or just Analysis 2?

Any luck on the re-run?