Unexpected error for high correlation between covariates and treatment

(Matt Spotnitz) #1


I am running code for my study based on code output from population level estimation (PLE). I got the following error, which was unexpected and stops the execution of the code:

Error in createPs(stopOnError = TRUE, excludeCovariateIds = list(), prior = list( :
High correlation between covariate(s) and treatment detected. Perhaps you forgot to exclude part of the exposure definition from the covariates?

Can you please tell me how to work around this error? What might be the cause?



(Seng Chan You) #2

The third part of the PLE tutorial (led by @jweave17), from minute 33 onward, might be helpful.

(Martijn Schuemie) #3

As the error message says, there are covariates that are highly correlated with the treatment (target or comparator). The error message should also list which covariates those are.

The propensity model uses a large set of covariates as predictors of the treatment assignment (target or comparator). By default, the set of covariates includes all drugs, conditions, procedures, etc., on and before the index date (the date of treatment initiation). That means that if you’re not careful, you can accidentally include the treatments themselves in the covariates (since the index date is also included), and that leads to a perfectly predictable (and perfectly useless) model. You must explicitly specify that the treatment concepts are to be excluded from the covariates.

This is actually explained in the tutorial (3rd part) at minute 21.
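
For anyone writing the analysis by hand rather than using a generated study package, the exclusion could look roughly like the sketch below. It assumes the covariates are built with FeatureExtraction's default settings; the concept IDs are hypothetical placeholders, not the concepts from this study, and should be replaced with the concepts that define your own target and comparator exposures.

```r
# Hypothetical placeholder concept IDs: replace with the concepts that
# define your own target and comparator exposures.
exposureConceptIds <- c(1111111, 2222222)

# Default covariates, with the exposure concepts (and their descendants)
# left out so they cannot be used to predict treatment assignment.
covariateSettings <- FeatureExtraction::createDefaultCovariateSettings(
  excludedCovariateConceptIds = exposureConceptIds,
  addDescendantsToExclude = TRUE
)
```

In a study package generated from a PLE specification, the equivalent step is adding these concepts to the covariates to exclude, rather than editing code like this directly.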

(Matt Spotnitz) #4

Thanks for the helpful suggestions! I have made a concept set of the covariates listed with the error message and indicated that they are baseline covariates to be excluded from the propensity score model. After downloading a new package, replacing the files, rebuilding the package and running the code, I get the identical error message. The same covariates are listed as having a high correlation, as if I never excluded them. Is there something else I should be doing? Thanks!

(Matt Spotnitz) #5

I have made it through the correlation between covariates and treatment error! To resolve this issue, I made concept sets out of the covariates listed as having high correlation with the treatments, and excluded them both in the comparison and treatment elements of the PLE. Thank you everyone for helping me get this far! Now, I am having trouble producing the shiny data. I get the following errors:

Thread 2 returns error: “replacement has 1 row, data has 0” when using argument(s): list(analysisId = integer(0), outcomeId = numeric(0), comparatorId = integer(0), targetId = integer(0), analysisDescription = integer(0), outcomeName = integer(0), comparatorName = integer(0), targetName = integer(0), includedCovariateConceptIds = integer(0), excludedCovariateConceptIds = integer(0), outcomeOfInterest = logical(0), cohortMethodDataFolder = character(0), studyPopFile = character(0), sharedPsFile = character(0), psFile = character(0), strataFile = character(0), prefilteredCovariatesFolder = character(0), n outcomeModelFile = character(0), logRr = numeric(0), seLogRr = numeric(0)),list(targetId = c(1770506…

Error in ParallelLogger::clusterApply(cluster = cluster, x = subsets, :
Error(s) when calling function ‘fun’, see earlier messages for details

Does anyone know how to resolve these issues? Thanks!

(Seng Chan You) #6

@mattspotnitz Could you open the file named analysisSummary.csv in your outputFolder and check whether it contains valid results?
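
A quick way to do that check from R is sketched below. It assumes that analysisSummary.csv has logRr and seLogRr columns (both names appear in the argument list of the error above); the outputFolder path is a placeholder and summarizeEstimates is a helper name made up for this example.

```r
# Placeholder path: point this at the outputFolder used for the study run.
outputFolder <- "path/to/outputFolder"

# Hypothetical helper: counts the rows and how many of them have both an
# estimate (logRr) and a standard error (seLogRr) that are not missing.
summarizeEstimates <- function(analysisSummary) {
  list(
    rows = nrow(analysisSummary),
    validEstimates = sum(!is.na(analysisSummary$logRr) &
                           !is.na(analysisSummary$seLogRr))
  )
}

summaryFile <- file.path(outputFolder, "analysisSummary.csv")
if (file.exists(summaryFile)) {
  print(summarizeEstimates(read.csv(summaryFile)))
} else {
  message("File not found: ", summaryFile)
}
```

If rows is 0, or validEstimates is 0, the results themselves are the first thing to look at.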

(Matt Spotnitz) #7

Yes, I was able to do so.

(Matt Spotnitz) #8

There were valid results in the output folder.

(Seng Chan You) #9

Though I’m not sure which code you ran before the error message, please make sure to run the ‘prepareForEvidenceExplorer’ function before launching. Then rebuild the package, and then run ‘launchEvidenceExplorer’. @mattspotnitz

(Matt Spotnitz) #10

Thanks for the suggestions. When I ran the function and rebuilt the package, I got an error that the file database.csv could not be found in the zip file. I noticed that it is not in the output directory. Do you have any suggestions on how to locate the file or resolve this issue? Here is the error message. Thanks!

Cannot find file database.csv in zip file
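
To see which files the zip actually contains, something like the following base R sketch could be used. The zip path is a placeholder and zipContains is a helper name made up for this example; it only lists the archive contents and does not change anything.

```r
# Placeholder path: point this at the results zip file being prepared.
resultsZipFile <- "path/to/Results.zip"

# Hypothetical helper: TRUE if a file with this name is in the zip,
# ignoring any folder prefix inside the archive.
zipContains <- function(zipFile, fileName) {
  fileName %in% basename(utils::unzip(zipFile, list = TRUE)$Name)
}

if (file.exists(resultsZipFile)) {
  print(zipContains(resultsZipFile, "database.csv"))
} else {
  message("File not found: ", resultsZipFile)
}
```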

(Matt Spotnitz) #11

For clarity, the original error message on this post was resolved by specifying which concepts to exclude as baseline covariates in the propensity score model. The conversation then shifted to a discussion about debugging errors associated with producing shiny data.