Import .zip file into R method

Vojtech_Huser · February 13, 2020, 4:37pm

I just want to report some warnings I saw when running the IUD study (I think the skeleton code is using an older version of SqlRender).
They are only warnings, so I am not sure whether they block execution or not.

Zip file extracted on Feb 13, 2020 from http://atlas-demo.ohdsi.org/#/estimation/cca/262




> execute(connectionDetails = connectionDetails,
+         cdmDatabaseSchema = cdmDatabaseSchema,
+         cohortDatabaseSchema = cohortDatabaseSchema,
+         cohortTable = cohortTable,
+         oracleTempSchema = oracleTempSchema,
+         outputFolder = outputFolder,
+         databaseId = databaseId,
+         databaseName = databaseName,
+         databaseDescription = databaseDescription,
+         createCohorts = TRUE,
+         synthesizePositiveControls = TRUE,
+         runAnalyses = TRUE,
+         runDiagnostics = TRUE,
+         packageResults = TRUE,
+         maxCores = maxCores)
Creating exposure and outcome cohorts
Connecting using PostgreSQL driver
Creating cohort: CuIUD
  |=============================================================================| 100%
Executing SQL took 0.35 secs
Creating cohort: LNGIUS
  |=============================================================================| 100%
Executing SQL took 0.146 secs
Creating cohort: Alt_High_Grade_Cervical_Neoplasm
  |=============================================================================| 100%
Executing SQL took 0.223 secs
Warning: This function is deprecated. Use 'render' instead.
Warning: This function is deprecated. Use 'translate' instead.
Creating negative control outcome cohorts
  |=============================================================================| 100%
Executing SQL took 0.011 secs
Counting cohorts
Running CohortMethod analyses
*** Creating cohortMethodData objects ***
  |=============================================================================| 100%
*** Creating study populations ***
  |=============================================================================| 100%
*** Fitting shared propensity score models ***
Loading required package: CohortMethod
Loading required package: Cyclops
Loading required package: FeatureExtraction
Fitting propensity model across all outcomes (ignore messages about 'no outcome specified')
Warning: The addExposureDaysToStart argument is deprecated. Please use the startAnchor argument instead.
Warning: The addExposureDaysToEnd argument is deprecated. Please use the endAnchor argument instead.
No outcome specified so skipping removing people with prior outcomes
Removing subjects with less than 1 day(s) at risk (if any)
No outcome specified so not creating outcome and time variables
Creating propensity scores took 0 secs
Fitting propensity model across all outcomes (ignore messages about 'no outcome specified')
Warning: The addExposureDaysToStart argument is deprecated. Please use the startAnchor argument instead.
Warning: The addExposureDaysToEnd argument is deprecated. Please use the endAnchor argument instead.
No outcome specified so skipping removing people with prior outcomes
Removing subjects with less than 1 day(s) at risk (if any)
No outcome specified so not creating outcome and time variables
Creating propensity scores took 0 secs
Fitting propensity model across all outcomes (ignore messages about 'no outcome specified')
Warning: The addExposureDaysToStart argument is deprecated. Please use the startAnchor argument instead.
Warning: The addExposureDaysToEnd argument is deprecated. Please use the endAnchor argument instead.
No outcome specified so skipping removing people with prior outcomes
Removing subjects with less than 1 day(s) at risk (if any)
No outcome specified so not creating outcome and time variables
Creating propensity scores took 0 secs
*** Adding propensity scores to study population objects ***
  |=============================================================================| 100%
*** Trimming/Matching/Stratifying ***
  |=============================================================================| 100%
*** Prefiltering covariates for outcome models ***
*** Fitting outcome models for outcomes of interest ***
  |                                                                             |   0%
Thread 2 returns error: "Requested stratified analysis, but no stratumId column found in population. Please use matchOnPs or stratifyByPs to create strata." when using argument(s): list(cohortMethodDataFolder = "c:/temp/iud/cmOutput/CmData_l1_t1771647_c1771648", prefilteredCovariatesFolder = "", args = list(excludeCovariateIds = list(), useCovariates = FALSE, prior = list(variance = 1, useCrossValidation = TRUE, priorType = "laplace", exclude = 0, neighborhood = NULL, forceIntercept = FALSE, graph = NULL), inversePtWeighting = FALSE, control = list(maxIterations = 1000, autoSearch = TRUE, seed = NULL, initialBound = 2, gridSteps = 10, threads = 4, startingVariance = 0.01, useKKTSwindle = FALSE, lowerLimit = 0.01, cvRepetitions = 10, noiseLevel = "quiet", fold = 10, minCVData = 100, resetCoefficients = FALSE, upperLimit = 20, cvType = "auto", selectorType = "auto", convergenceType = "gradient", tuneSwindle = 10, maxBoundCount = 5, tolerance = 2e-07, algorithm = "ccd"), modelType = "cox", stratified = TRUE, interactionCovariateIds = list(), includeCovariateIds = list()), studyPopFile = "c:/temp/iud/cmOutput/StratPop_l1_s2_p1_t1771647_c1771648_s2_o1771054.rds", outcomeModelFile = "c:/temp/iud/cmOutput/Analysis_2/om_t1771647_c1771648_o1771054.rds")
  |=============================================================================| 100%
Error in ParallelLogger::clusterApply(cluster, modelsToFit, doFitOutcomeModel) : 
  Error(s) when calling function 'fun', see earlier messages for details
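
The thread error above complains about a missing stratumId column. As a minimal sketch, one could check a population object for that column before fitting a stratified model. The data frames here are toy stand-ins, not the real study population, and the helper name `hasStrata` is hypothetical:

```r
# Hypothetical helper: TRUE when the population carries a stratumId column.
hasStrata <- function(population) {
  "stratumId" %in% colnames(population)
}

# Toy stand-ins for a study population before and after matching/stratifying.
unmatchedPop <- data.frame(subjectId = 1:4, treatment = c(1, 0, 1, 0))
matchedPop <- cbind(unmatchedPop, stratumId = c(1, 1, 2, 2))

hasStrata(unmatchedPop)
hasStrata(matchedPop)
```

For the real run, the object to inspect would presumably be the .rds file named in `studyPopFile` in the error message, loaded with `readRDS()`.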

Vojtech_Huser · February 13, 2020, 4:42pm

I also suggest making the code to run use more environment variables

e.g.,

# 'schema' is assumed to be defined earlier in the script
connectionDetails <- createConnectionDetails(dbms = Sys.getenv("dbms"),
                                             user = Sys.getenv("user"),
                                             password = Sys.getenv("password"),
                                             server = Sys.getenv("server"),
                                             schema = schema)
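
One caveat with this approach is that `Sys.getenv()` silently returns an empty string for an unset variable. A minimal sketch of a guard, assuming the variable names used above; the helper `getRequiredEnv` is hypothetical and the values set here are demo placeholders only:

```r
# Hypothetical helper: read an environment variable and fail early if it is unset or empty.
getRequiredEnv <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable '", name, "' is not set")
  }
  value
}

# Demo values only; in practice these would come from .Renviron or the shell.
Sys.setenv(dbms = "postgresql", server = "localhost/ohdsi")

dbms <- getRequiredEnv("dbms")
server <- getRequiredEnv("server")
```

The returned values could then be passed to `createConnectionDetails()` in place of the bare `Sys.getenv()` calls.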

The full current file also uses the S: drive (maybe C: would be better)

library(iud)

# Optional: specify where the temporary files (used by the ff package) will be created:
options(fftempdir = "s:/FFtemp")

# Maximum number of cores to be used:
maxCores <- parallel::detectCores()

# The folder where the study intermediate and result files will be written:
outputFolder <- "s:/iud"

# Details for connecting to the server:
connectionDetails <- DatabaseConnector::createConnectionDetails(dbms = "pdw",
                                                                server = Sys.getenv("PDW_SERVER"),
                                                                user = NULL,
                                                                password = NULL,
                                                                port = Sys.getenv("PDW_PORT"))

# The name of the database schema where the CDM data can be found:
cdmDatabaseSchema <- "cdm_truven_mdcd_v699.dbo"

# The name of the database schema and table where the study-specific cohorts will be instantiated:
cohortDatabaseSchema <- "scratch.dbo"
cohortTable <- "mschuemi_skeleton"

# Some meta-information that will be used by the export function:
databaseId <- "Synpuf"
databaseName <- "Medicare Claims Synthetic Public Use Files (SynPUFs)"
databaseDescription <- "Medicare Claims Synthetic Public Use Files (SynPUFs) were created to allow interested parties to gain familiarity using Medicare claims data while protecting beneficiary privacy. These files are intended to promote development of software and applications that utilize files in this format, train researchers on the use and complexities of Centers for Medicare and Medicaid Services (CMS) claims, and support safe data mining innovations. The SynPUFs were created by combining randomized information from multiple unique beneficiaries and changing variable values. This randomization and combining of beneficiary information ensures privacy of health information."

# For Oracle: define a schema that can be used to emulate temp tables:
oracleTempSchema <- NULL

execute(connectionDetails = connectionDetails,
        cdmDatabaseSchema = cdmDatabaseSchema,
        cohortDatabaseSchema = cohortDatabaseSchema,
        cohortTable = cohortTable,
        oracleTempSchema = oracleTempSchema,
        outputFolder = outputFolder,
        databaseId = databaseId,
        databaseName = databaseName,
        databaseDescription = databaseDescription,
        createCohorts = TRUE,
        synthesizePositiveControls = TRUE,
        runAnalyses = TRUE,
        runDiagnostics = TRUE,
        packageResults = TRUE,
        maxCores = maxCores)

resultsZipFile <- file.path(outputFolder, "export", paste0("Results", databaseId, ".zip"))
dataFolder <- file.path(outputFolder, "shinyData")

prepareForEvidenceExplorer(resultsZipFile = resultsZipFile, dataFolder = dataFolder)

launchEvidenceExplorer(dataFolder = dataFolder, blind = TRUE, launch.browser = FALSE)

I also had a fresh install of R, so I documented my steps for running the .zip package

Unzip the .zip file and open the R project file
Then do everything in the text below

# Install RStudio, R, and RTools first; see https://ohdsi.github.io/TheBookOfOhdsi/OhdsiAnalyticsTools.html#installR
# I only installed the 64-bit version of Java

install.packages("SqlRender")
library(SqlRender)
translate("SELECT TOP 10 * FROM person;", "postgresql")

install.packages("drat")
drat::addRepo("OHDSI")
install.packages("CohortMethod")

# If you see this error:
# ERROR: loading failed for 'i386'
# * removing 'C:/q/d/R/R-3.6.2/library/FeatureExtraction'

# repeat the CohortMethod install step with an additional argument that skips the 32-bit architecture:
install.packages("CohortMethod", INSTALL_opts = c("--no-multiarch"))

# Load the package (this step is not in the book):
library(CohortMethod)

# Check the session:
sessionInfo()
# Or, better formatted (the devtools package must be installed first):
devtools::session_info()

# To compile the study package, I also needed the following packages:
install.packages("OhdsiSharing", INSTALL_opts = c("--no-multiarch"))
install.packages("MethodEvaluation", INSTALL_opts = c("--no-multiarch"))
install.packages("EmpiricalCalibration", INSTALL_opts = c("--no-multiarch"))

# Clicking 'Build' still generated an error.
# However, clicking 'Install and Restart' in the Build tab worked, and the command library(chosen-name) ran
# with no errors.

# If I run 'Check', it fails for i386.

# Go to the extras folder and open the file CodeToRun.R.
# The study ran (but had errors).
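
Before building, it may help to see which of the packages named in these steps are still missing. A minimal sketch using only base R; the helper name `missingPackages` is hypothetical, and the package list simply repeats the names from the steps above:

```r
# Hypothetical helper: return the subset of 'packages' that is not installed.
missingPackages <- function(packages) {
  installed <- rownames(utils::installed.packages())
  setdiff(packages, installed)
}

# Package names taken from the install steps above.
requiredPackages <- c("SqlRender", "CohortMethod", "OhdsiSharing",
                      "MethodEvaluation", "EmpiricalCalibration")

toInstall <- missingPackages(requiredPackages)
if (length(toInstall) > 0) {
  message("Still missing: ", paste(toInstall, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```

Anything reported as missing could then be installed with the same `install.packages(..., INSTALL_opts = c("--no-multiarch"))` call shown above.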

krfeeney · February 13, 2020, 10:44pm

Vojtech, I’ve run this package too. Yes, some of this has to do with newer versions of PLE being available than what Atlas can generate. I haven’t checked where this stands in the Atlas issue log, but I’m sure it is known. The SqlRender deprecation warnings are largely noise. It can run without issue.
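
To illustrate why deprecation warnings like the ones in the log do not by themselves stop a run, here is a minimal base R sketch. `oldStyleCall` is a hypothetical stand-in that only mimics the warning text; it is not a SqlRender function:

```r
# Hypothetical stand-in for a function that emits a deprecation warning
# but still does its work and returns a value.
oldStyleCall <- function() {
  warning("This function is deprecated. Use 'render' instead.")
  "result still returned"
}

# Collect warnings as they are raised, then let execution continue.
collected <- character(0)
value <- withCallingHandlers(
  oldStyleCall(),
  warning = function(w) {
    collected <<- c(collected, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(value)
print(collected)
```

A call to `stop()` behaves differently: it raises an error and halts execution, which is what the 'Thread 2 returns error' message in the log corresponds to.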

Vojtech_Huser:

The full current file also uses the S: drive (maybe C: would be better)

library(iud)

# Optional: specify where the temporary files (used by the ff package) will be created:
options(fftempdir = "s:/FFtemp")

# Maximum number of cores to be used:
maxCores <- parallel::detectCores()

# The folder where the study intermediate and result files will be written:
outputFolder <- "s:/iud"

In any network package, where you store your local result files is a setting entirely at your discretion. Nobody can mandate how your system is organized or where you want to put your results.
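
As a minimal sketch of keeping that choice out of the script itself, the output folder could be read from an environment variable with a fallback. `IUD_OUTPUT_FOLDER` is a hypothetical variable name, not something the study package defines, and the temporary-directory fallback is only for illustration:

```r
# IUD_OUTPUT_FOLDER is a hypothetical variable name chosen for this example.
# If it is unset, fall back to a folder inside the session's temporary directory.
outputFolder <- Sys.getenv("IUD_OUTPUT_FOLDER",
                           unset = file.path(tempdir(), "iud"))

# Create the folder if it does not exist yet.
if (!dir.exists(outputFolder)) {
  dir.create(outputFolder, recursive = TRUE)
}
```

Each site can then set the variable once (for example in .Renviron) without editing the shared script.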

Vojtech_Huser · February 14, 2020, 1:58pm

Where on GitHub does the source code for the .zip file live? Let’s say I want to submit a GitHub issue to the right repo. Do I submit it under Atlas? Tagging @schuemie since the name of the cohort table hints at who wrote it…

schuemie · February 14, 2020, 3:17pm

This would be the place to file issues