I have been trying to run R scripts to faciliate OHDSI researches. However, I have discovered that while some test runs (let’s say Achilles scripts), RStudio could freeze when running longer scripts.
For example, test 200 in Achilles
--TEST CODE 200 --HINT DISTRIBUTE_ON_KEY(stratum_1) CREATE TEMP TABLE s_tmpach_200 AS SELECT 200 AS analysis_id, CAST(vo.visit_concept_id AS VARCHAR(255)) AS stratum_1, CAST(NULL AS VARCHAR(255)) AS stratum_2, CAST(NULL AS VARCHAR(255)) AS stratum_3, CAST(NULL AS VARCHAR(255)) AS stratum_4, CAST(NULL AS VARCHAR(255)) AS stratum_5, COUNT(DISTINCT vo.person_id) AS count_value FROM ohdsiv5_june2021_thin.visit_occurrence vo JOIN ohdsiv5_june2021_thin.observation_period op ON vo.person_id = op.person_id AND vo.visit_start_date >= op.observation_period_start_date AND vo.visit_start_date <= op.observation_period_end_date GROUP BY vo.visit_concept_id
would return no result in R studio.
However, if I put it in Python postreSQL runner with some settings
conn = psycopg2.connect(host=“someserver.hk”, port=12345, dbname=“ohdsi”, user=“admin”, password=“admin1”, options="-c search_path="+schema, # it seems the below lines are needed to keep the connection alive. connect_timeout=10, keepalives=1, keepalives_idle=5, keepalives_interval=2, keepalives_count=5)
The concerned script #200 would seem to finish running successfully. The time needed to run the script is about 50 minutes.
C:\.....Programs\Python\Python39\python.exe "D:/OHDSI-THIN/scripts/import_from_THIN/test query 200.py" 10:37:43.539193 running script... 11:27:45.922109 finished Process finished with exit code 0
I am testing another script Achilles #117 that got stuck, as of now it is still running in Python. (2+ hours run time)
--TEST CODE 117 --HINT DISTRIBUTE_ON_KEY(stratum_1) -- generating date key sequences in a cross-dialect compatible fashion CREATE TEMP TABLE s_tmpach_117 AS WITH century AS (SELECT CAST('19' as TEXT) num union select '20' num), tens as (select '0' num union select '1' num union select '2' num union select '3' num union select '4' num union select '5' num union select '6' num union select '7' num union select '8' num union select '9' num), ones as (select '0' num union select '1' num union select '2' num union select '3' num union select '4' num union select '5' num union select '6' num union select '7' num union select '8' num union select '9' num), months as (select '01' as num union select '02' num union select '03' num union select '04' num union select '05' num union select '06' num union select '07' num union select '08' num union select '09' num union select '10' num union select '11' num union select '12' num), date_keys as (select cast(concat(century.num, tens.num, ones.num,months.num) as int) obs_month from century cross join tens cross join ones cross join months) SELECT 117 as analysis_id, CAST(t1.obs_month AS VARCHAR(255)) as stratum_1, cast(null as varchar(255)) as stratum_2, cast(null as varchar(255)) as stratum_3, cast(null as varchar(255)) as stratum_4, cast(null as varchar(255)) as stratum_5, COALESCE(COUNT(distinct op1.PERSON_ID),0) as count_value FROM date_keys t1 left join (select t2.obs_month, op2.* from ohdsiv5_june2021_thin.observation_period op2, date_keys t2 where EXTRACT(YEAR FROM op2.observation_period_start_date)*100 + EXTRACT(MONTH FROM op2.observation_period_start_date) <= t2.obs_month and EXTRACT(YEAR FROM op2.observation_period_end_date)*100 + EXTRACT(MONTH FROM op2.observation_period_end_date) >= t2.obs_month ) op1 on op1.obs_month = t1.obs_month group by t1.obs_month
It had been run successfully yesterday night after I left work. Too bad I forgot to put a timer to that.
Just interested, if anyone has any experience that Rstudio would become unresponsive running long calculations. How did you solve this?
I try running the scripts in another IDE, but with very little success. I have tried getting R scripts running in vscode but setting up that is a pain in the butt. ty