opened 01:32PM - 22 Aug 19 UTC
closed 02:19PM - 10 Dec 19 UTC
pr filed
ir calc
When creating an incidence rate calculation with multiple strata, and one of the… strata groups included the entire cohort, there is a mismatch in the summary time at risk compared to the individual strata group's time at risk, as shown below:
![image](https://user-images.githubusercontent.com/6818777/63518479-203a4c80-c4bf-11e9-823e-ea9cbb689625.png)
Note: this is on PDW. This appears to be due to how the CAST(SUM(tar/365.25) as bigint) is casting all of the inner expressions to BIGINT before summing, which leads to a small truncation of decimal values for each row aggregated. The expectation was that the SUM would sum the decimal values together and then cast the result to BIGINT, but that does not appear to be the case.
After some testing, the fix is to write it in the following form:
old:
```
select 77 as analysis_id, T.target_id, T.outcome_id, CAST(E.strata_mask AS bigint) as bitmask,
COUNT(subject_id) as person_count,
CAST(sum(1.0 * time_at_risk / 365.25) AS BIGINT) as time_at_risk,
sum(is_case) as cases
from #time_at_risk T
JOIN (
select E.event_id, E.person_id, E.start_date, E.end_date, SUM(coalesce(POWER(cast(2 as bigint), SC.strata_sequence), 0)) as strata_mask
FROM #analysis_events E
LEFT JOIN #strataCohorts SC on SC.person_id = E.person_id and SC.event_id = E.event_id
group by E.event_id, E.person_id, E.start_date, E.end_date
) E on T.subject_id = E.person_id and T.cohort_start_date = E.start_date and T.cohort_end_date = E.end_date
GROUP BY T.target_id, T.outcome_id, E.strata_mask
)
```
Fixed:
```
select 77 as analysis_id, T.target_id, T.outcome_id, CAST(E.strata_mask AS bigint) as bitmask,
COUNT(subject_id) as person_count,
CAST(ROUND(sum(1.0 * time_at_risk / 365.25), 0) AS BIGINT) as time_at_risk,
sum(is_case) as cases
from #time_at_risk T
JOIN (
select E.event_id, E.person_id, E.start_date, E.end_date, SUM(coalesce(POWER(cast(2 as bigint), SC.strata_sequence), 0)) as strata_mask
FROM #analysis_events E
LEFT JOIN #strataCohorts SC on SC.person_id = E.person_id and SC.event_id = E.event_id
group by E.event_id, E.person_id, E.start_date, E.end_date
) E on T.subject_id = E.person_id and T.cohort_start_date = E.start_date and T.cohort_end_date = E.end_date
GROUP BY T.target_id, T.outcome_id, E.strata_mask
```
In the fixed version the ROUND seems to be applied after the SUM completes, and the resulting round value is cast into bigint (to store the final result as an integer). This leads to a matching result of the ovarll cohort IR and the strata group that contains all members of the cohort.