Making sense of date offsets in generated SQL

cce · July 14, 2019, 9:09pm

Hello. I’m trying to grok the SQL generated in the Atlas interface, and in particular, how date offsets for period inclusion is done for various kinds of queries.

For the cohort generated by Acute myocardial infarction events, the condition occurrence criteria has missing condition_end_date coalesce to the condition_start_date + 1 day.
For the cohort generated by New users of ACE inhibitors as first-line monotherapy for hypertension, coalesce of drug_exposure_end_date follows the same pattern, using drug_exposure_start_date plus 1 day. Yet, later, the coalesce includes an attempt to recover a time period from days_supply.

So, I figure the answer to the 1st is that the logic is to be “exclusive” logic, where the end_date is often the day after, as hinted at by this 2015 thread. However, I don’t see the corresponding logic I’d expect, where one would use < or a -1 adjustment in the filter, rather than an inclusive logic, <=.

In the second question, perhaps the answer has something to do with only wanting to track when the drug was prescribed rather than the whole window in which it was delivered? In this case though, why let the end_date be used since it could have been filled in via ETL as including the entire prescription period. In either case, it seems that there could be inconsistency as to how the query is interpreted depending upon how the data is loaded.

Generally, I’m still looking for formal documentation that describes exactly how dates are treated and corresponding regression test cases. Would someone who is familiar with the SQL generation please enlighten me? Thanks!