OHDSI Home | Forums | Wiki | Github

Txpath slow execution

Hello,

During our first runs of TxPath on a full dataset, we ran against the problem that the execution times are very long.
We are aware the TxPath tool is still under development and improvements are being made.

On a cohort of about 50k patients, the execution times are between 2 to 10 hours. We have used a set of ~20 often occurring ingredients (more than a third of the patients is expected to have used at least one of these drugs).
Database: Postgres.

Analysis of the queries showed that there is one query that takes a lot of time. This query processes data from the pnc_tmp_ptsq_ct table and inserts it into pnc_tmp_ptstg_ct. The description of this query is (runPanaceaStudy.sql, line 116):

-- insert from #_pnc_ptsq_ct ptsq into #_pnc_ptstg_ct (remove same patient/same drug small time window inside large time window. EX: 1/2/2015 ~ 1/31/2015 inside 1/1/2015 ~ 3/1/2015)
-- use single concept combo and avoid duplicate combo for the same concept if there's multiple single combo for same concept by min() value 

We suspect it has something to do with those temporary tables that have no indexes.

What were typical study execution times for the “Characterizing Treatment Pathways at Scale” study? And what database was used? (@jon_duke)
Does anyone have experience with running TxPath on Postgres?
Or does anyone have suggestions how to speed up the study execution?

Thanks!

TxPath had successful tests with Postgres and Oracle. SQL server should work but not fully tested. Run times were <15 mins typically for 100K persons with 15-20 medications. I have cc’d the developer at Regenstrief Chen Wen to comment further.

Separately, Txpath-lite has been submitted as a pull request to WebAPI (https://github.com/OHDSI/WebAPI/pull/238) so hopefully will be accepted soon. The UI (which still needs lots of love from the community) can be found here. https://github.com/OHDSI/txPath-lite. Once the pull request is accepted, you should be able to begin evolving TxPath-lite to meet your needs. It is much faster, if simpler.

ps. here is a screenshot

2 Likes
t