Issues running HowOften study

jmp579 · December 18, 2023, 9:17pm

Hi All,

My site is attempting to participate in the HowOften (GitHub - ohdsi-studies/HowOften: [under development] Large-scale incidence characterization) study. We have ~1.4 million patients in our OMOP database and are using an Oracle (v19.21.0) backend. When I attempt to run the first portion of the study in StrategusCodeToRun.R (# Step 1 : Execute Azza Analysis), it runs for ~11 hours (after the package installation phase) and eventually fails with the following error: java.sql.SQLException: ORA-12801: error signaled in parallel query server P007; ORA-01555: snapshot too old: rollback segment number 12 with name “_SYSSMU12_1155298549$” too small. I’ve also attached the errorReportSql_1.txt that is generated.

From here, we thought that maybe our database server needs more resources (CPU cores – 8, Memory 64GB; Oracle SGA – 32GB, PGA – 20GB; UNDO tablespace size – max 200GB), so we randomly sampled 5% of our patients and created a reduced OMOP CDM. When pointing to this reduced database of ~70,000 patients, step 1 of StrategusCodeToRun.R (Execute Azza Analysis) runs for ~7 hours before failing with the following error: java.sql.SQLSyntaxErrorException: ORA-00936: missing expression (see attached errorReportSql_2.txt). It appears the sql query is not valid for Oracle sql. From this test we have two immediate questions:

Has anyone successfully run HowOften with an OMOP CDM instance that uses an Oracle backend? - It seems that perhaps the automatically generated sql query may not be compatible with Oracle.
Is there something we should be doing to optimize our OMOP CDM database? - Indexing perhaps?

At the OHDSI Symposium earlier this year it was mentioned that step-1 should only take a few of hours to run (depending on several factors). With that in mind, our instance seems to be severely under-performing as compared to the rest of the community.

We welcome any advice or shared experiences that could help us resolve these issues.

errorReportSql_2.txt (553 Bytes)
errorReportSql_1.txt (37.9 KB)

Best,
Justin

hripcsa · December 19, 2023, 2:00pm

Hi. Just to acknowledge your message, but I don’t have an answer to the questions.

Chris_Knoll · December 19, 2023, 2:41pm

I haven’t heard of anyone running on Oracle, but looking at error_2:

SELECT CAST('-1946488281' as VARCHAR(255)) as database_id, * FROM jpetucci.howoften_cohort_inclusion_result

That looks like valid SQL to me, can you run that statement (maybe switch it to FROM DUAL) in your own env, and let me know if it is an error?

In addition, there’s only 3 cohort definitions (I believe for Azza Study) but there is an additional ‘general population’ cohort which queries from Observation Period which builds a ‘calendar cohort’ to perform, but it’s not clear from the message if it was working on that cohort.

What I can do is I can pull out the oracle sql of each of those cohorts so you can run them individually directly, and you can do some profiling on your environment to see where the slow down is. Could be something related to indexing, or sort order of tables that is causing the queries to be slow.

jmp579 · December 21, 2023, 1:27pm

Hi Chris,

Thanks for your response.

My understanding is that the SQL query is valid in other flavors of SQL, just not Oracle. Here is an old stackoverflow: sql - How to SELECT Column, * FROM TABLE in Oracle? - Stack Overflow .

From the above link, the fix to make the query “Oracle compatible” is simple. Can you direct me to where I can find the SQL queries you referenced, or even better where I could perhaps make a change in Strategus so this automatically generated query works with Oracle (please forgive my ignorance as I need to get up to speed with Strategus based studies)?

Best,
Justin

Chris_Knoll · December 21, 2023, 5:04pm

I believe it’s done in CohortGenerator, and @anthonysena may be able to point you to it.

anthonysena · December 21, 2023, 5:37pm

Hi - yes, this is a bug in CohortGenerator for Oracle that is documented here: CohortStats export fails on Oracle · Issue #96 · OHDSI/CohortGenerator · GitHub. I can attempt to patch this in a preview release to see if that helps overcome this problem @jmp579

jmp579 · January 10, 2024, 8:14pm

Thanks @anthonysena , I made the suggested modification (fix - CohortStats export fails on Oracle #96 · jpetucci/CohortGenerator@3b1ad5d · GitHub) in my fork which took care of it.

With that out of the way, I am still working to understand why my runtime is so long. Are there any suggestions regarding indexing, sorting etc. along the lines of what @Chris_Knoll mentioned?

anthonysena · January 10, 2024, 8:40pm

Glad you were able to fork and test that fix @jmp579!

It might be useful to inspect the Strategus output as it will contain timing information for each cohort generated. Specifically, in your results folder, you should have a “CohortGeneratorModule_1” folder (if the trailing integer on the folder name is different, its fine) and within that folder there is a file called “cg_cohort_generation.csv”. In this file, you should see the start/end times for each cohort. If you can see which cohort(s) take the longest, this may help us to help you see where indices may help with the performance.

jmp579 · January 10, 2024, 9:22pm

Here are the contents of the cg_cohort_generation.csv file:

cohort_id,cohort_name,generation_status,start_time,end_time,database_id
862,"Earliest event of Alzheimer's disease derived from Imfeld, 2",COMPLETE,2024-01-09T18:38:33Z,2024-01-09T21:58:30Z,892102928
863,"Cognitive impairment, incident",COMPLETE,2024-01-09T21:58:30Z,2024-01-09T21:58:37Z,892102928
1071,persons at risk at start of year 2012-2022 with 365d prior observation,COMPLETE,2024-01-09T21:58:38Z,2024-01-10T01:01:53Z,892102928
863001,"Cognitive impairment, incident",COMPLETE,2024-01-10T01:01:53Z,2024-01-10T01:01:53Z,892102928

It looks like 862 and 1071 are taking the most time.

jmp579 · January 30, 2024, 3:42pm

We tweaked some oracle configuration parameters:

alter session set optimizer_index_cost_adj=1;
alter session set db_file_multiblock_read_count =2;

which reduced the runtime of the azza study to 2.634 hours for our full patient population. My understanding is that by inspecting the Oracle explain plan, it became apparent that indexing was not being used. The above parameters make the use of indexing more favorable.

Now, we are bumping up against another Oracle specific problem. When running later components of the HowOften study (I believe evan), we receive the following error (full error log attached):

 java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000

From what I gather, the cohort definition just contains too many codes (>1000) for Oracle. There doesn’t seem to be a configuration setting to increase the expression limit, so I need to fix this in the SQL query. Can someone (maybe @anthonysena) direct me to the portion of the code that generates this SQL?

Best,
Justin
errorReportSql.txt (14.2 KB)

Chris_Knoll · January 31, 2024, 3:36pm

Hi, Justin,

I looked at the error and I see where it is coming from: the cohort definition is using a large set of concept IDs which is running into that limit,

concept_id in (2002187,2729943,2730416,2729923,2730209,2730169,2730456,2729963,2730436,2730189,2730476,2729949,2730422,2729929,2730215,2730175,2730462,2729969,2730442,2730195,2730668,2729952,2730425,2729932,2730218,2730178,2730465,2729972,2730445,2730198,2730671,2729946,2730419,2729926,2730212,...

We;d need to modify the cohort generation SQL to break up those IN clauses into sets of 1000. Oracle has the notion of ‘tuples’ which can be used in an in clause without the 1000 limit, Alternatively, there may be a way around it with SqlRender and using an oracle function sys.odcinumberlist which would allow us to pass arbitrary length of params to an in clause.

jmp579 · February 5, 2024, 4:47pm

Thanks @Chris_Knoll , Can you direct me to the location in the cohort generation package that needs modified ? I can iterate on my local fork and work on a generalized solution that is compatible with any database backend.

Chris_Knoll · February 5, 2024, 6:26pm

Not quite that simple: the Strategus package accepts a CohortSet as an input, and even tho the CohortSet contains the SQL for cohort definitions, Strategus forces the JSON to be converted into SQL…so the fix will involve making Circe (a dependency) generate the correct SQL from JSON…so, an update to Circe is required (which I can work on), but then the CohortGenerator package will need to update it’s reference to the new version of CirceR, and then the Strategus module will need to be updated to reference the new version of CohortGeneratorModule (which will refrence the updated CohortGenerator package).

So, it’s ‘doable’ to do all this on your own, but complicated.

jmp579 · February 7, 2024, 3:40pm

Thanks for the clarification as I didn’t realize there were so many “layers” all the way back to Circe to fix this properly. I’m happy to help and test where appropriate.

jmp579 · January 30, 2025, 7:37pm

We’re ready to submit results for the HowOften study. Where can I find the values for server, user, and password required to establish a connection in the table creation and results upload scripts?

  server = Sys.getenv("OHDSI_RESULTS_DB"),
  user = Sys.getenv("OHDSI_HO_USER"),
  password = Sys.getenv("OHDSI_HO_PASSWORD")