We also have Don’s Perl version doing the iterations, if that’s any help.
C
From: Patrick Ryan [mailto:notifications@mail132-12.atl131.mandrillapp.com] On Behalf Of Patrick Ryan
Sent: Monday, November 24, 2014 6:22 PM
To: reich@ohdsi.org
Subject: [OHDSI Forums] [Implementers] Populating *_era tables?
Patrick_Ryan
November 24
Hi Ryan: There's been several implementations of era builders. Originally, OMOP provided a tool called GERALD that provided SAS and Oracle SQL solutions. The code is available at: http://omop.org/cdm
I've done it myself in TSQL script with cursors, but found it wasn't super fast with big data, so wouldn't recommend that approach. Chris came up with a very clever approach to building eras in pure SQL, here's a snippet below for drug era, assuming you're working with OMOP CDM v4 in SQL Server and have pharmacy claims with days supply to infer end dates for each exposure record:
with cteDrugTarget (DRUG_EXPOSURE_ID, PERSON_ID, DRUG_CONCEPT_ID, DRUG_EXPOSURE_START_DATE, DRUG_EXPOSURE_END_DATE, DRUG_TYPE_CONCEPT_ID, REFILLS, QUANTITY, DAYS_SUPPLY, RowNum) as(-- Normalize DRUG_EXPOSURE_END_DATE to either the existing drug exposure end date, or add days supply, or add 1 day to the start dateselect d.DRUG_EXPOSURE_ID, d. PERSON_ID, c1.concept_id as DRUG_CONCEPT_ID, d.DRUG_EXPOSURE_START_DATE, COALESCE(d.DRUG_EXPOSURE_END_DATE, DATEADD(day,d.DAYS_SUPPLY,d.DRUG_EXPOSURE_START_DATE), DATEADD(day,1,DRUG_EXPOSURE_START_DATE)) as DRUG_EXPOSURE_END_DATE, d.DRUG_TYPE_CONCEPT_ID, d.REFILLS, d.QUANTITY, d.DAYS_SUPPLY,ROW_NUMBER() OVER (PARTITION BY d.PERSON_ID, c1.concept_id ORDER BY d.DRUG_EXPOSURE_START_DATE) as RowNumFROM CDM4_SIM.dbo.DRUG_EXPOSURE dinner join CONCEPT_ANCESTOR ca on d.DRUG_CONCEPT_ID = ca.DESCENDANT_CONCEPT_IDinner join concept c1 on ca.ANCESTOR_CONCEPT_ID = c1.concept_id and c1.VOCABULARY_ID = 8 and c1.concept_level = 2),cteEndDates (PERSON_ID, END_DATE) as(select PERSON_ID, DATEADD(day,-30,EVENT_DATE) as END_DATE -- unpad the end dateFROM(select PERSON_ID, DRUG_CONCEPT_ID, EVENT_DATE, EVENT_TYPE, MAX(START_ORDINAL) OVER (PARTITION BY PERSON_ID, DRUG_CONCEPT_ID ORDER BY EVENT_DATE, EVENT_TYPE ROWS UNBOUNDED PRECEDING) as START_ORDINAL, -- this pulls the current START down from the prior rows so that the NULLs from the END DATES will contain a value we can compare with ROW_NUMBER() OVER (PARTITION BY PERSON_ID, DRUG_CONCEPT_ID ORDER BY EVENT_DATE, EVENT_TYPE) AS OVERALL_ORD -- this re-numbers the inner UNION so all rows are numbered ordered by the event datefrom(-- select the start dates, assigning a row number to eachSelect PERSON_ID, DRUG_CONCEPT_ID, DRUG_EXPOSURE_START_DATE AS EVENT_DATE, 1 as EVENT_TYPE, ROW_NUMBER() OVER (PARTITION BY PERSON_ID, DRUG_CONCEPT_ID ORDER BY DRUG_EXPOSURE_START_DATE) as START_ORDINALfrom cteDrugTargetUNION ALL-- pad the end dates by 30 to allow a grace period f or overlapping ranges.select PERSON_ID, DRUG_CONCEPT_ID, DATEADD(day,30,DRUG_EXPOSURE_END_DATE), -1 as EVENT_TYPE, NULLFROM cteDrugTarget) RAWDATA) EWHERE (2 * E.START_ORDINAL) - E.OVERALL_ORD = 0),cteRawData as(select d.DRUG_EXPOSURE_ID, d.PERSON_ID, d.DRUG_CONCEPT_ID,d.DRUG_EXPOSURE_START_DATE, d.DRUG_EXPOSURE_END_DATE, MIN(e.END_DATE) as ERA_END_DATE,d.DRUG_TYPE_CONCEPT_ID, d.REFILLS, d.QUANTITY, d.DAYS_SUPPLY, d.RowNum,ROW_NUMBER() OVER (PARTITION BY d.PERSON_ID, MIN(e.END_DATE) ORDER BY d.PERSON_ID, d.DRUG_CONCEPT_ID, d.DRUG_EXPOSURE_START_DATE) as EraRowNumFROM cteDrugTarget dJOIN cteEndDates e on d.PERSON_ID = e.PERSON_ID and e.END_DATE >= d.DRUG_EXPOSURE_START_DATEGROUP BY d.DRUG_EXPOSURE_ID, d.PERSON_ID, d.DRUG_CONCEPT_ID, d.DRUG_EXPOSURE_START_DATE, d.DRUG_EXPOSURE_END_DATE, d.DRUG_TYPE_CONCEPT_ID, d.REFILLS, d.QUANTITY, d.DAYS_SUPPLY, d.RowNum) select row_number() over (order by person_id asc) as DRUG_ERA_ID,r.PERSON_ID, r.DRUG_EXPOSURE_START_DATE as DRUG_ERA_START_DAT E, r.ERA_END_DATE as DRUG_ERA_END_DATE, 38000182 as DRUG_TYPE_CONCEPT_ID,r.DRUG_CONCEPT_ID,1 as DRUG_EXPOSURE_COUNTinto CDM4_SIM.dbo.DRUG_ERA from cteRawData r where r.EraRowNum = 1 order by r.PERSON_ID, r.DRUG_CONCEPT_ID, r.DRUG_EXPOSURE_START_DATE;
Any of these approaches should work for OMOP CDMv4. For OMOP CDMv5, I'm planning to write up a SQL script approach loosely based on Chris' implementation that I think should work well for drug_era, dose_era, and condition_era, that we can run through SqlRender to make it work for Oracle, SQL Server, and PostgresQL. I need to do a bit more testing on that first though so I don't create too much re-work for those who'd like to implement it.
Of course, there's lots of ways to skin this cat, I'd be quite keen to hear how others have tackled this problem.
Cheers,
Patrick
To respond, reply to this email or visit http://forums.ohdsi.org/t/populating--era-tables/78/2 in your browser.
To unsubscribe from these emails, visit your user preferences.
![|1x1](upload://xcjzfbLDGQ9Ycfw3fLV0RdA4Bve.gif)