No result for plausibleUnitConceptIds query after 40 hours

Jeff_Stroup · October 10, 2024, 2:03pm

DQDers:

I was searching the OHDSI forums for any information on improving our worse-than-slow plausibleUnitConceptIds queries and this post mentions Running the DQD in SqlOnly mode link where Katy Sadowski mentions a configuration in which sqlOnlyIncrementalInsert = TRUE and sqlOnlyUnionCount > 1 in which multiple checks are unioned within a cte in the output SQL query to speed performance.

Sadly combining queries can’t help with query performance when simply running the very first of these queries doesn’t return a result after FORTY HOURS!

I’ve worked with our DBA to attempt all of the following optimization approaches…

Plausible Unit Concept IDs Query Optimization

Although the measurement table I’m querying has 1,303,817,579 rows, I can’t fathom why any one of these queries takes so ridiculously long. And I can’t imagine that we are the only ones affected by this? Surely other folks have billion+ row measurement tables, right? I’m tossing around the idea of using bulk-transfer methods (which are up to 10 times faster at copying/translating data) or batch processing in chunks (which is how we ingest our HIE data), but since this is just a “simple” SELECT statement, it doesn’t lend itself to bulk transfer or processing in batches.

Until this is solved, the plausibleUnitConceptIds data check will remain blank on all report submissions.

Hopefully someone else is dealing with the same or similar issue.

Thanks,
Jeff

katy-sadowski · October 10, 2024, 11:31pm

Hi @Jeff_Stroup , what database system are you using? (I can’t access the links you’ve shared.)

On a properly configured modern data warehouse this runtime is unexpected. If you’re using Postgres for that large a dataset, however, some wizardry will probably be required (and DQD will be only the beginning of your woes) I’m not an expert in this topic unfortunately but I’m sure others in the community have experience/knowledge to share.

Also, what version of DQD are you running? Recent versions greatly slimmed down the plausibleUnit check to check far fewer units with more vetted sets of plausible concept IDs. So if you’re on an older version, upgrading to the latest will greatly reduce the scope of this check.