At first glance, it appears that PatientLevelPrediction has not been updated to the new DatabaseConnector API. I’ll look into this when I get a chance … @schuemie will probably beat me to it.
Boo, you forced me to upgrade R. I was on R version 3.1.1 (2014-07-10). I used installr. Then I updated practically every package, including the following . . . but I’m still getting the error. It could just be an R update issue.
It took me a bit to get it working. I got further along, but now I’m getting this:
Connecting using SQL Server driver using Windows integrated security
Connecting using SQL Server driver using Windows integrated security
Executing multiple queries. This could take a while
|=======================================================================================| 100%
Analysis took 3 secs
Fetching data from server
Error executing SQL: Error in setwd(dfile): cannot change working directory
Error in value[[3L]](cond) : no loop for break/next, jumping to top level
The error report looks like this:
DBMS:
sql server
Error:
cannot change working directory
SQL:
SELECT subject_id AS person_id, cohort_start_date, cohort_concept_id, DATEDIFF(DAY, cohort_start_date, cohort_end_date) AS time FROM #cohort_person ORDER BY person_id, cohort_start_date
R version:
R version 3.2.1 (2015-06-18)
This looks like a known problem with the ff package. As you know, ff stores data on disk instead of keeping it in memory. At the start of an R session it creates a temp folder where the ff data objects live. But when you restart your R session while keeping your data environment (say, after a package rebuild), the temp folder is gone but ff is still pointing to it.
I always run this command before using anything related to ff:
options(fftempdir = "s:/temp")
This forces the ff temp folder to be the one I specified. I was hoping ‘end-users’ such as yourself would never need this hack, but I guess I was wrong. Do you have any idea how this could have happened? Have you hit the ‘Build’ button in RStudio?
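For what it’s worth, here is a minimal sketch of a slightly more defensive version of that hack, using only base R. It creates the folder if it does not exist yet and then points the fftempdir option at it. The helper name setFfTempDir and the example folder are hypothetical, not part of ff or any OHDSI package. As with the one-liner above, run it before using anything related to ff.

```r
# Hypothetical helper (not part of ff or any OHDSI package):
# make sure the folder exists, then tell ff to use it for its temp files.
setFfTempDir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  if (!dir.exists(path)) {
    stop("Could not create ff temp folder: ", path)
  }
  options(fftempdir = path)
  invisible(path)
}

# Example with a throwaway folder; in practice use a stable location
# that survives R restarts, such as "s:/temp".
exampleDir <- file.path(tempdir(), "ffTemp")
setFfTempDir(exampleDir)
getOption("fftempdir")
```

Note that a folder under tempdir() is only used here so the example runs anywhere. It disappears with the R session, which is exactly the situation the hack is meant to avoid.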
@schuemie - I was able to get further along. I’m currently running the getDbCovariateData() function. I will let you know if I run into additional issues.
And I get the following after running on a 100K sample (my full cohorts are 3M and 9M):
Connecting using SQL Server driver using Windows integrated security
Executing multiple queries. This could take a while
|=================================================================================================================================| 100%
Analysis took 3.64 hours
Fetching data from server
Error: 'dbGetQuery.ffdf' is not an exported object from 'namespace:DatabaseConnector'
>
I also tried adding connDetails$dbms, but this gives me:
Connecting using SQL Server driver using Windows integrated security
Error in paste("jdbc:sqlserver://", server, ";integratedSecurity=true", :
argument "server" is missing, with no default
So then I tried connDetails$server and got:
Executing multiple queries. This could take a while
| | 0%
Error executing SQL: Error in if (attr(connection, "dbms") == "redshift" & grepl("DROP TABLE IF EXISTS", : argument is of length zero
An error report has been created at \\glaz/Epi_GLAz/Projects/Programs/errorReport.txt
Error in value[[3L]](cond) : no loop for break/next, jumping to top level
My guess is that I don’t need $dbms or $server, and that the real issue is with the ffdf object coming out of getDbCovariateData. But I don’t understand why 'namespace:DatabaseConnector' cares about 'dbGetQuery.ffdf'.
I appreciate any thoughts you have and if you think it is user error let me know!
Connecting using SQL Server driver using Windows integrated security
Executing multiple queries. This could take a while
|=================================================================================================================================| 100%
Analysis took 0.146 secs
Fetching data from server
Loading took 0.0468 secs
Warning messages:
1: In lowLevelQuerySql.ffdf(connection, sql) :
Data has zero rows, returning an empty data frame
2: In getDbOutcomeData(connDetails, cdmDatabaseSchema = cdmDatabaseSchema, :
No outcome data found
I assume this program is looking for #cohort_outcome, so I tried rerunning my getDbCovariateData() just in case. I still get the same warnings.
I thought maybe it was related to the issue above, but if I understand that issue properly, you are fine as long as you are using querySql.ffdf, which, as far as I can tell, it is.
Nope, it doesn’t use temp tables as input. The input cohorts and outcome cohort are all expected to exist in the table defined by the cohortTable variable. The outcome cohorts are expected to have ID 10. No outcomes are found for the people in the input cohorts, hence the warnings.
Could you manually check whether this is correct? Something like:
SELECT COUNT(*)
FROM cohortTable cohort
INNER JOIN cohortTable outcome
ON cohort.subject_id = outcome.subject_id
WHERE cohort.cohort_concept_id = 1
AND outcome.cohort_concept_id = 10
AND outcome.cohort_start_date >= cohort.cohort_start_date
AND outcome.cohort_start_date <= cohort.cohort_end_date;
where cohortTable is the name of your cohort table.
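To make clear what that query counts, here is a small base R sketch of the same logic on a toy in-memory table. The rows are made up purely for illustration; only the column names and the IDs 1 and 10 come from the query above.

```r
# Toy cohort table (made-up rows, for illustration only).
cohortTable <- data.frame(
  subject_id = c(1, 2, 3, 1, 3),
  cohort_concept_id = c(1, 1, 1, 10, 10),
  cohort_start_date = as.Date(c("2010-01-01", "2010-02-01", "2010-03-01",
                                "2010-06-01", "2009-12-01")),
  cohort_end_date = as.Date(c("2010-12-31", "2010-12-31", "2010-12-31",
                              "2010-06-01", "2009-12-01"))
)

# Split into input cohort (ID 1) and outcome cohort (ID 10).
cohort <- cohortTable[cohortTable$cohort_concept_id == 1, ]
outcome <- cohortTable[cohortTable$cohort_concept_id == 10, ]

# Inner join on subject_id, like the self-join in the SQL.
joined <- merge(cohort, outcome, by = "subject_id",
                suffixes = c(".cohort", ".outcome"))

# Keep outcomes that start within the cohort's start and end dates.
inWindow <- joined$cohort_start_date.outcome >= joined$cohort_start_date.cohort &
  joined$cohort_start_date.outcome <= joined$cohort_end_date.cohort

outcomeCount <- sum(inWindow)
outcomeCount
```

In this toy table only subject 1 has an outcome inside the cohort window, so the count is 1. If the real query returns 0, the "No outcome data found" warning is expected.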
Looks like there’s a problem opening the ff objects. If you type
parts[[2]]$cohortData
parts[[2]]$covariateData
Do these two commands generate errors? Did you use options(fftempdir = "s:/temp") to prevent the ff temp folder from becoming invalid on an R-within-RStudio restart?