What is "the magic" in cohort definitions

Chris_Knoll · June 8, 2018, 10:53pm

I guess I shouldn’t leave that treatment pathway statement hanging, so let me demonstrate how that works:

Given the pattern:
A-B-B-B-C-B-B-C-C-C-D
and you need to remove repeats, and reduce it to:
A-B-C-B-C-D

Step 1: rownumber the events:

element, rn
A, 1
B, 2
B, 3
B, 4
C, 5
B, 6
B, 7
C, 8
C, 9
C, 10
D, 11

Next, add a second row number to the result, this time partitioned by element:

element, rn, p_rn
A, 1, 1
B, 2, 1
B, 3, 2
B, 4, 3
C, 5, 1
B, 6, 4
B, 7, 5
C, 8, 2
C, 9, 3
C, 10, 4
D, 11, 1

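Notice that consecutive duplicates get grouped when you subtract the partitioned row number from the row number (rn - p_rn). Within an unbroken run of the same element both numbers go up by 1 per row, so the difference stays constant: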

element, rn, p_rn, rn-p_rn
A, 1, 1, 0
B, 2, 1, 1
B, 3, 2, 1
B, 4, 3, 1
C, 5, 1, 4
B, 6, 4, 2
B, 7, 5, 2
C, 8, 2, 6
C, 9, 3, 6
C, 10, 4, 6
D, 11, 1, 10
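
To make the table above easy to reproduce, here is a minimal Python sketch that computes the same three columns for the example pattern. The variable names (`events`, `seen`, `rows`) are my own and purely illustrative; a running count per element stands in for the partitioned row number.

```python
from collections import defaultdict

# The example pattern A-B-B-B-C-B-B-C-C-C-D, one character per event.
events = list("ABBBCBBCCCD")

seen = defaultdict(int)  # running count per element = partitioned row number
rows = []
for rn, element in enumerate(events, start=1):
    seen[element] += 1
    p_rn = seen[element]
    # Store element, overall row number, per-element row number, difference.
    rows.append((element, rn, p_rn, rn - p_rn))

for row in rows:
    print(row)
```

Each printed tuple corresponds to one line of the element, rn, p_rn, rn-p_rn table.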

Now if you select element, min(rn) FROM … group by element, rn-p_rn you get:

A, 1
B, 2
C, 5
B, 6
C, 8
D, 11

Ordering by the initial rn, you get:
A-B-C-B-C-D
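
For anyone who wants to try the whole thing end to end, the steps above can be sketched as a single query. This is only an illustration, run here through Python's built-in sqlite3 module: the table `events`, its ordering column `seq`, the CTE name `numbered` and the alias `first_rn` are all assumptions of mine, and the window functions need SQLite 3.25 or later. On another database the same idea should carry over, but the syntax may need adjusting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: seq gives the event order, element is the event itself.
conn.execute("CREATE TABLE events (seq INTEGER, element TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    list(enumerate("ABBBCBBCCCD", start=1)),
)

query = """
WITH numbered AS (
    SELECT element,
           ROW_NUMBER() OVER (ORDER BY seq) AS rn,
           ROW_NUMBER() OVER (PARTITION BY element ORDER BY seq) AS p_rn
    FROM events
)
SELECT element, MIN(rn) AS first_rn
FROM numbered
GROUP BY element, rn - p_rn   -- one group per unbroken run of an element
ORDER BY first_rn             -- restore the original event order
"""

collapsed = conn.execute(query).fetchall()
pathway = "-".join(element for element, _ in collapsed)
print(pathway)  # A-B-C-B-C-D
```

The ORDER BY matters: GROUP BY alone does not guarantee the rows come back in event order.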

-Chris