OHDSI Home | Forums | Wiki | Github

How do you pinpoint the SQL error in R error reports?

I am a R newbie. Very often we are faced with error like this

The Error log says

org.postgresql.util.PSQLException: ERROR: UNION types integer and character varying cannot be matched
  Position: 325

Honestly this is not quite helpful as there are hundreds of 1000-line SQL files in a sizeable projects.

And sometimes it is hard to navigate through the error log tree

Stack trace:
14: stop(fallback)
13: signal_abort(cnd)
12: abort(paste("Error executing SQL:", message, paste("An error report has been created at ", f
11: .createErrorReport(connection@dbms, err$message, sqlStatement, errorReportFile)
10: value[[3]](cond)
9: tryCatchOne(expr, names, parentenv, handlers[[1]])
8: tryCatchList(expr, classes, parentenv, handlers)
7: tryCatch({
    startQuery <- Sys.time()
    lowLevelExecuteSql(connection, sqlStatement)
6: DatabaseConnector::executeSql(connection, sql)
5: createConceptCountsTable(connection = connection, cdmDatabaseSchema = cdmDatabaseSchema, tem
4: runConceptSetDiagnostics(connection = connection, tempEmulationSchema = tempEmulationSchema,
3: CohortDiagnostics.R#30: CohortDiagnostics::runCohortDiagnostics(packageName = "LegendT2dm", 
2: Assessment.R#103: runExposureCohortDiagnostics(connectionDetails, cdmDatabaseSchema, vocabul
1: assessPhenotypes(connectionDetails = connectionDetails, cdmDatabaseSchema = cdmDatabaseSchem

Generally, how do you pinpoint down the error SQL in OHDSI setting?

And maybe can we have add to the wishlist that the error log would also split out the SQL file it has last parsed?


Usually there’s also a SQL error report generated that contains the exact SQL statement that caused the error?

Also, make sure you use the CDM DDL for your SQL dialect, and don’t make any changes to that. If the data does not conform to the CDM specs, all bets are off.

Ah. We used to not know a lot about OHDSI. We used to a external vendor to convert our data. They put some string in the drug_source_concept_id field like the icd9 codes. We thought it was okay to change the source_id column from integer to VARCHAR since it is not encouraged to run analysis on that column.

Until we run the LegendT2DM scripts, it was quite obvious that that was not okay.

And lately we thought that not filling in cdm_source might also cause problems.

OHDSI is hard!

Yes, it is! But it can be very rewarding too!

Here are the DDL files for your platform.

yes thank you very much. another lesson we had back then was that one can select the cdm version in the github repository.

back then someone posted a postgresql 6.0 cdm ddl link for us on the forums, and we thought it was already ok. but it turns out most studies should be running on 5.3x and we have to rollback.it was not backward compatible.

yes, the SQL error report is something like this.


org.postgresql.util.PSQLException: ERROR: syntax error at or near "#"
  Position: 15

SELECT * FROM #exposure_cohorts;

R version:
R version 4.0.5 (2021-03-31)


Attached base packages:
- stats
- graphics
- grDevices
- datasets
- utils
- methods
- base

Other attached packages:
- LegendT2dm (0.0.1)
- DatabaseConnector (4.0.2)

This SQL is a one-liner and somehow we do not know where to start.

And we are not quite sure if SELECT * FROM #exposure_cohorts; is a correct syntax for postgresql (should be)

Most importantly, this SQLerror report contradicts the error in R showing that org.postgresql.util.PSQLException: ERROR: UNION types integer and character varying cannot be matched Position: 325

Then were is the union in the above one-liner?

Maybe it was that I was highlighting all the lines in the R script and so it was some function in the later stage, and the previous SQL error report has been overwritten.