I’m setting up my OHDSI environment using Databricks.
When I run the script to create the Achilles database objects (generated from the URL shown below), I get the error shown below.
http://localhost:8080/WebAPI/ddl/results?dialect=spark&schema=<DB_NAME>_ach_res&vocabSchema=<DB_NAME>&tempSchema=<DB_NAME>_ach_tmp&initConceptHierarchy=true
An older version of the script works, and I’m not sure what the critical difference between the two files is. I suspect it might be the CDM version in effect when the script was generated, but I don’t see anything in the URL that indicates which CDM version should be used to generate the script. Is the CDM version used to generate this script configured somewhere in WebAPI/Atlas?
The working script, failing script, and full output are attached.
create-achilles-tables_WORKING.sql.txt (117.2 KB)
create-achilles-tables_DOES_NOT_WORK-ddl.sql.txt (115.6 KB)
output.txt (239.4 KB)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 92321.0 failed 4 times, most recent failure: Lost task 0.3 in stage 92321.0 (TID 140794) (10.22.225.217 executor 262): org.apache.spark.SparkNumberFormatException: [CAST_INVALID_INPUT] The value 'Monthly' of the type "STRING" cannot be cast to "INT" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 5, position 1) ==
INSERT OVERWRITE TABLE demo_cdm_ach_res.heracles_periods SELECT * FROM insertion_temp
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
at org.apache.spark.sql.errors.QueryExecutionErrors$.invalidInputInCastToNumberError(QueryExecutionErrors.scala:256)
at org.apache.spark.sql.catalyst.util.UTF8StringUtils$.withException(UTF8StringUtils.scala:51)
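For reference, here is the kind of diagnostic I ran to narrow this down (a sketch only — the column name is my guess; the `'Monthly'` value suggests a text column such as a period name is landing in an INT slot, which would point to a column-order mismatch between the staging table and `heracles_periods`):

```sql
-- Hypothetical check: find rows in the staging table whose value cannot be
-- cast to INT. 'period_order' is an assumed INT column in heracles_periods;
-- substitute the actual column names from the generated DDL.
SELECT *
FROM insertion_temp
WHERE try_cast(period_order AS INT) IS NULL
  AND period_order IS NOT NULL;
```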
— EDIT ----------------------------------
Running the following (as suggested in the error message) before executing the script did not fix the problem:
set spark.sql.ansi.enabled=false
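One thing I have not ruled out (an assumption on my part): `SET` is session-scoped in Spark, so if the DDL script executes over a different connection/session than the one where I issued the `SET`, the setting would not apply. Placing it in the same session, immediately before the failing statement, might behave differently:

```sql
-- Assumption: SET only affects the current Spark session, so it must run in
-- the same session (and before) the statement that fails the cast.
SET spark.sql.ansi.enabled = false;
INSERT OVERWRITE TABLE demo_cdm_ach_res.heracles_periods SELECT * FROM insertion_temp;
```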