Phenotype Phebruary Day 8 - Kidney Stones

Team:

For those who missed any of the fun in Phenotype Phebruary Week 1, here’s a page with a running inventory of the phenotype conversations we’ve started: Phenotype Phebruary Daily Updates – OHDSI. As we embark on Week 2 of Phenotype Phebruary, I decided to select a phenotype that was highly voted on by our community and that was more surprising to me. I don’t know why kidney stones were of such interest to so many of you, but the community has spoken and I like the challenge of coming up with a fun and compelling story to promote phenotyping, so here goes…

Clinical description:

Kidney stone disease occurs when a calculus develops in the urinary tract, often starting in the kidney and passing through the ureters, bladder and urethra. This phenomenon, also known as nephrolithiasis or urolithiasis, can be asymptomatic if the kidney stones are small enough. However, the size, shape and composition of calculi can vary substantially, and larger stones can create obstructions at any point along the urinary tract. Such blockages can cause acute pain, typically presenting in the lower back or abdomen, and may also cause painful urination or hematuria. Kidney stones are typically diagnosed by symptoms, urine tests and imaging. Treatment is often based on patient symptoms. Pain management and hydration can allow some stones to pass spontaneously. Drugs to expedite passage, such as alpha blockers and calcium channel blockers, can be considered. Shock wave lithotripsy can be used in some circumstances to break stones into smaller pieces. Surgical removal via nephrolithotomy or ureteroscopy may be indicated depending on stone size, patient comorbidities and pain intensity.

Phenotype development:

One of the surprising statistics that I read across many different references was the notion that half of the people who have had a kidney stone will have another within 10 years. This number was thrown around in many publications, and yet most cited the same seminal paper: Uribarri et al, “The first kidney stone”, from Annals of Internal Medicine in 1989. That paper provides a review of past studies that examined recurrence of kidney stones, providing their Table 1 compilation of the identified studies:

[Image: Table 1 from Uribarri et al, compiling the identified recurrence studies]

There were six papers cited as sources for this meta-analysis. Here are the specific citations:

[Image: citations for the six source papers]

Now, I don’t know about you, but I’m not fully comfortable generalizing from the experience of sailors in the Royal Navy prior to 1969 to my cushy sedentary lifestyle here in 2022. But the pattern across these studies shows some notion of an expectation that ~10-20% of patients with a kidney stone can expect a recurrence in 1 year, 25-50% will have a recurrence in 5 years, and 37-67% may have a recurrence in 10 years.

So, with this data point in context, I was surprised to read Alexander et al, “Kidney stones and kidney function loss: a cohort study” in BMJ in 2012. In their paper, they described their phenotype for kidney stones, to be applied to their Alberta Kidney Disease Network database:

“We used physician claims, data on use of hospitalisation and ambulatory care, and ICD-9 codes (592, 594, 274.11) and ICD-10 codes (N20.0, N20.1, N20.2, N20.9, N21.0, N21.1, N21.8, N21.9, N22.0, N22.8) to identify presentations of kidney stones. The accuracy of these codes in defining a kidney stone episode has been validated.12 One or more of these codes in any position for any inpatient or outpatient claim was taken to represent an episode of kidney stones. We assumed that an interval of a year or more between claims represented separate kidney stone episodes; claims occurring within a year of each other were classified as a result of a single stone episode.16 17”.

What I particularly took note of is “We assumed that an interval of a year or more between claims represented separate kidney stone episodes”, because Uribarri’s results suggest that this assumption would cause substantial event misclassification, with recurrent events being bundled with the original event.

Which brings us to the phenotype development methodological question: how do we model recurrent events, balancing the potential errors of falsely combining together separate events into a single episode vs. the risk of falsely counting events as independent episodes when they are in fact markers of continuation of care for the same health experience?

To answer that question: Let’s create some phenotypes and empirically evaluate the consequences!

First, the conceptset for kidney stones. Turns out that it’s pretty straightforward to build, using the literature as a starting reference along with the vocabulary hierarchies and PHOEBE recommendations:

This simple conceptset resolves to 60 concepts, but <10 are driving the cohort based on record count, the main one being ‘Kidney stone’:

The logic for the cohort is also straightforward: we take all events of a condition occurrence of the ‘Kidney stone’ conceptset:

But I’ll draw your attention to the last line of that screenshot: Cohort Eras → Specify era collapse gap size = 30 days. This means that if successive observed events are within 30 days of one another, they will be collapsed together into a single episode. There must be a gap of >30 days in order for two events to be identified as separate episodes.
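To make the collapse rule concrete, here is a minimal Python sketch of the idea. It is not the ATLAS implementation: it assumes each event is a single day (so an era ends on its last event date), and the function name `collapse_eras` and the dates are made up for illustration.

```python
from datetime import date

def collapse_eras(event_dates, gap_days):
    """Collapse one person's event dates into episodes (eras).

    Assumption: every event lasts a single day, so an era ends on the date
    of its last event. A new era starts only when the gap since the current
    era's end is greater than gap_days.
    """
    eras = []
    for d in sorted(event_dates):
        if eras and (d - eras[-1][1]).days <= gap_days:
            # Within the gap: extend the current era to this event
            eras[-1] = (eras[-1][0], d)
        else:
            # Gap exceeded (or first event): start a new era
            eras.append((d, d))
    return eras

# Hypothetical person with four kidney stone records
events = [date(2020, 1, 1), date(2020, 1, 20), date(2020, 3, 15), date(2021, 6, 1)]
eras_30d = collapse_eras(events, 30)
for start, end in eras_30d:
    print(start, end)
```

With a 30-day gap, the first two records (19 days apart) merge into one era, while the other two records each start their own era.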

But how do we decide on the era collapse gap size? This is where clinical knowledge and understanding of the biological phenomenon are useful, but one also needs to understand how healthcare data may be captured for the clinical event of interest. Small stones may pass spontaneously in 4 weeks, so 30 days would cover this duration, but may not necessarily cover the duration from first presenting symptom to complete resolution. On the other hand, Alexander et al’s assumption that the gap size should be 365d means that recurrences within 1 year are not possible (in spite of evidence to the contrary).

What can we do? Test the impact of multiple alternative gap size windows on the number of events that are identified. The number of persons with at least one event will remain the same, but the number of events per person can shift substantially.
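As a toy illustration of that point, the sketch below counts episodes for one hypothetical person under the three gap sizes. It assumes single-day events, so the number of episodes is one plus the number of gaps between consecutive records that exceed the gap size; `count_episodes` and the dates are invented for this example.

```python
from datetime import date

def count_episodes(event_dates, gap_days):
    """Count episodes, assuming single-day events.

    Episodes = 1 + number of gaps between consecutive events that are
    longer than gap_days (0 if the person has no events).
    """
    ds = sorted(event_dates)
    if not ds:
        return 0
    return 1 + sum((b - a).days > gap_days for a, b in zip(ds, ds[1:]))

# Hypothetical person: consecutive records are 46, 126 and 442 days apart
events = [date(2019, 1, 10), date(2019, 2, 25), date(2019, 7, 1), date(2020, 9, 15)]

counts = {gap: count_episodes(events, gap) for gap in (30, 90, 365)}
print(counts)  # {30: 4, 90: 3, 365: 2}
```

The person is counted once in every definition, but contributes 4, 3 or 2 events depending on the gap size.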

For this exercise, I created three phenotypes for kidney stones, the links to ATLAS-phenotype are below:

  1. [Phenotype Phebruary][Kidney stones] Kidney stone events with 30d era gap
  2. [Phenotype Phebruary][Kidney stones] Kidney stone events with 90d era gap
  3. [Phenotype Phebruary][Kidney stones] Kidney stone events with 365d era gap

I evaluated these 3 phenotypes against the IBM MarketScan CCAE database.

First, I calculated how many events each person in the cohort had, across each definition, and then summarized the number of persons by number of events. I did this via direct SQL on the COHORT table that is populated by ATLAS in the RESULTS schema:

--stage the three kidney stone cohorts in a temp table, with a readable name per era gap
create table #kidney_stone_events as
select *, case when cohort_definition_id = 5291 then 'Kidney stone with 30d era'
              when cohort_definition_id = 5292 then 'Kidney stone with 90d era'
              when cohort_definition_id = 5293 then 'Kidney stone with 365d era'
              else 'Other' end as cohort_name
from cohort
where cohort_definition_id in (5291, 5292, 5293)
;

--find number of persons, by number of unique events, within each cohort definition
select num_events,
    max(case when cohort_name = 'Kidney stone with 30d era' then num_persons else 0 end) as num_persons_30d,
    max(case when cohort_name = 'Kidney stone with 90d era' then num_persons else 0 end) as num_persons_90d,
    max(case when cohort_name = 'Kidney stone with 365d era' then num_persons else 0 end) as num_persons_365d
from
(
select cohort_name, num_events, count(subject_id) as num_persons
from
(
select cohort_name, subject_id, min(cohort_start_date) as first_event_date, count(subject_id) as num_events
from #kidney_stone_events
group by cohort_name, subject_id
) t1
group by cohort_name, num_events
) t2
group by num_events
;
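
For readers who prefer to follow the logic outside SQL, the same two-step aggregation (events per person, then persons per event count) can be sketched in Python. The rows below are made up and the short cohort labels are placeholders, not the real cohort names.

```python
from collections import Counter

# Hypothetical cohort rows: (cohort_name, subject_id, cohort_start_date)
rows = [
    ("30d", 1, "2020-01-01"), ("30d", 1, "2020-03-15"), ("30d", 2, "2020-05-01"),
    ("365d", 1, "2020-01-01"), ("365d", 2, "2020-05-01"),
]

# Step 1: number of events per (cohort, person)
events_per_person = Counter((name, sid) for name, sid, _ in rows)

# Step 2: number of persons per (cohort, number of events)
persons_by_num_events = Counter(
    (name, n) for (name, _sid), n in events_per_person.items()
)

# Pivot: one line per event count, one column per cohort
# (a Counter returns 0 for combinations that never occur)
for n in sorted({n for _name, n in persons_by_num_events}):
    print(n, persons_by_num_events[("30d", n)], persons_by_num_events[("365d", n)])
```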

And the results show that the era gap window does have a substantial impact on the number of events observed per person. Take a look at the tail of this distribution:

num_events num_persons_30d num_persons_90d num_persons_365d
1 2036018 2271273 2668675
2 558099 512921 392894
3 245649 196161 97258
4 129054 92984 30028
5 75295 49807 10426
6 47471 28971 3778
7 31048 17520 1284
8 21430 11319 456
9 14840 7574 155
10 10792 4906 67
11 8149 3451 20
12 5801 2373 8
13 4532 1648 1
14 3495 1174 0
15 2762 861 0
16 2070 585 0
17 1653 418 0
18 1344 316 0
19 1090 228 0
20 811 175 0
21 620 119 0
22 524 89 0
23 422 49 0
24 350 47 0
25 285 23 0
26 245 18 0
27 201 14 0
28 165 8 0
29 126 7 0
30 121 3 0
31 107 6 0
32 66 1 0
33 63 1 0
34 52 0 0
35 47 0 0
36 43 0 0
37 36 0 0
38 27 0 0
39 23 0 0
40 14 0 0
41 9 0 0
42 11 0 0
43 12 0 0
44 13 0 0
45 11 0 0
46 10 0 0
47 9 0 0
48 7 0 0
49 5 0 0
50 3 0 0
51 3 0 0
52 4 0 0
53 4 0 0
54 1 0 0
55 1 0 0
56 4 0 0
59 1 0 0
62 1 0 0
70 1 0 0

Given this, we can expect that the recurrence rate would vary by the gap window length. But how much? Here’s a simple query to estimate the recurrence proportion at 1-year, 5-year, and 10-year intervals (similar to the original Uribarri paper). Note that we require persons to have at least 365 days of observation before their first event and to be observed for the full time-at-risk after it, which is why the denominators shift lower as the interval is lengthened:

--estimate recurrence proportions at 1/5/10 year intervals
select '1. 1 year recurrence proportion' as time_at_risk,
    first_events.cohort_name, 
    count(distinct first_events.subject_id) as num_persons_at_risk, 
    count(distinct kse1.subject_id) as num_persons_w_event,
    1.0*count(distinct kse1.subject_id) / count(distinct first_events.subject_id) as pct_persons_w_event
from
(
select cohort_name, subject_id, min(cohort_start_date) as first_event_date
from #kidney_stone_events
group by cohort_name, subject_id
) first_events
inner join observation_period op1
on first_events.subject_id = op1.person_id
and first_events.first_event_date >= dateadd(day,365,op1.observation_period_start_date)
and first_events.first_event_date <= dateadd(day,-365,op1.observation_period_end_date)
left join #kidney_stone_events kse1
on first_events.cohort_name = kse1.cohort_name
and first_events.subject_id = kse1.subject_id
and first_events.first_event_date < kse1.cohort_start_date
and dateadd(day,365,first_events.first_event_date) >= kse1.cohort_start_date
group by first_events.cohort_name

union all 

select '2. 5 year recurrence proportion' as time_at_risk,
    first_events.cohort_name, 
    count(distinct first_events.subject_id) as num_persons_at_risk, 
    count(distinct kse1.subject_id) as num_persons_w_event,
    1.0*count(distinct kse1.subject_id) / count(distinct first_events.subject_id) as pct_persons_w_event
from
(
select cohort_name, subject_id, min(cohort_start_date) as first_event_date
from #kidney_stone_events
group by cohort_name, subject_id
) first_events
inner join observation_period op1
on first_events.subject_id = op1.person_id
and first_events.first_event_date >= dateadd(day,365,op1.observation_period_start_date)
and first_events.first_event_date <= dateadd(day,-365*5,op1.observation_period_end_date)
left join #kidney_stone_events kse1
on first_events.cohort_name = kse1.cohort_name
and first_events.subject_id = kse1.subject_id
and first_events.first_event_date < kse1.cohort_start_date
and dateadd(day,365*5,first_events.first_event_date) >= kse1.cohort_start_date
group by first_events.cohort_name


union all 

select '3. 10 year recurrence proportion' as time_at_risk,
    first_events.cohort_name, 
    count(distinct first_events.subject_id) as num_persons_at_risk, 
    count(distinct kse1.subject_id) as num_persons_w_event,
    1.0*count(distinct kse1.subject_id) / count(distinct first_events.subject_id) as pct_persons_w_event
from
(
select cohort_name, subject_id, min(cohort_start_date) as first_event_date
from #kidney_stone_events
group by cohort_name, subject_id
) first_events
inner join observation_period op1
on first_events.subject_id = op1.person_id
and first_events.first_event_date >= dateadd(day,365,op1.observation_period_start_date)
and first_events.first_event_date <= dateadd(day,-365*10,op1.observation_period_end_date)
left join #kidney_stone_events kse1
on first_events.cohort_name = kse1.cohort_name
and first_events.subject_id = kse1.subject_id
and first_events.first_event_date < kse1.cohort_start_date
and dateadd(day,365*10,first_events.first_event_date) >= kse1.cohort_start_date
group by first_events.cohort_name

;
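
The eligibility and recurrence rules in that query can be restated in Python as a rough sketch. It simplifies to one observation period per person, and the function name, dictionary keys and patient data are all hypothetical.

```python
from datetime import date, timedelta

def recurrence_proportion(persons, horizon_days, prior_days=365):
    """Return (num_at_risk, num_with_recurrence, proportion).

    A person is at risk if their first era starts at least prior_days after
    observation start and at least horizon_days before observation end.
    A recurrence is any later era starting within horizon_days of the first.
    Simplifying assumption: one observation period per person.
    """
    at_risk = with_event = 0
    for p in persons:
        first = min(p["era_starts"])
        if first < p["obs_start"] + timedelta(days=prior_days):
            continue  # not enough prior observation
        if first > p["obs_end"] - timedelta(days=horizon_days):
            continue  # not observed for the full time-at-risk
        at_risk += 1
        limit = first + timedelta(days=horizon_days)
        if any(first < s <= limit for s in p["era_starts"]):
            with_event += 1
    proportion = with_event / at_risk if at_risk else None
    return at_risk, with_event, proportion

# Made-up patients
persons = [
    {"obs_start": date(2015, 1, 1), "obs_end": date(2021, 12, 31),
     "era_starts": [date(2017, 3, 1), date(2017, 9, 1)]},
    {"obs_start": date(2015, 1, 1), "obs_end": date(2021, 12, 31),
     "era_starts": [date(2016, 6, 1), date(2019, 6, 1)]},
    {"obs_start": date(2018, 1, 1), "obs_end": date(2021, 12, 31),
     "era_starts": [date(2018, 6, 1)]},
]

one_year = recurrence_proportion(persons, 365)
five_year = recurrence_proportion(persons, 365 * 5)
print(one_year, five_year)
```

In this toy data the third person is excluded for lacking prior observation, and the first person drops out of the 5-year denominator because they are not observed long enough after their first event.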

In CCAE, this gives us kidney stone recurrence proportions as follows:

time_at_risk                      cohort_name                 num_persons_at_risk  num_persons_w_event  pct_persons_w_event
1. 1 year recurrence proportion   Kidney stone with 30d era   1276749              341312               0.27
1. 1 year recurrence proportion   Kidney stone with 90d era   1276749              213682               0.17
1. 1 year recurrence proportion   Kidney stone with 365d era  1276749              0                    0.00
2. 5 year recurrence proportion   Kidney stone with 30d era   352293               163280               0.46
2. 5 year recurrence proportion   Kidney stone with 90d era   352293               144931               0.41
2. 5 year recurrence proportion   Kidney stone with 365d era  352293               110024               0.31
3. 10 year recurrence proportion  Kidney stone with 30d era   80623                45454                0.56
3. 10 year recurrence proportion  Kidney stone with 90d era   80623                42458                0.53
3. 10 year recurrence proportion  Kidney stone with 365d era  80623                37921                0.47

So, we can see that the gap size window has a substantial impact on 1-year recurrence: by definition, a 365d gap window means that no recurrence within 1 year is possible, while the recurrence proportion is 17% for a 90d gap window and 27% for a 30d gap window. The 5-year recurrence proportions range from 31% for the 365d gap to 46% for the 30d gap. And the 10-year recurrence proportions range from 47% for the 365d gap to 56% for the 30d gap. So the impact of the gap window decreases as you expand the time-at-risk window.
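The shrinking impact can be checked directly from the counts reported above. This small sketch computes, for each time-at-risk window, the difference between the 30d and 365d recurrence proportions in percentage points; only the variable names are mine, the numbers are from the CCAE results.

```python
# (persons at risk, persons with recurrence under 30d gap, under 365d gap),
# copied from the CCAE results for each time-at-risk window
results = {
    "1y": (1276749, 341312, 0),
    "5y": (352293, 163280, 110024),
    "10y": (80623, 45454, 37921),
}

# Difference between the 30d and 365d proportions, in percentage points
spread = {
    window: round(100 * (n_30d - n_365d) / at_risk, 1)
    for window, (at_risk, n_30d, n_365d) in results.items()
}
print(spread)
```

The spread falls from roughly 27 points at 1 year to about 15 points at 5 years and about 9 points at 10 years.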

Remarkably, the real-world experience in the privately insured population of the CCAE claims dataset through 2021 is not altogether different from that of sailors in the Royal Navy pre-1969 or Minnesotans pre-1979.

My takeaways from this exercise: defining kidney stone events reinforces the need to follow the OHDSI definition of a cohort: a set of persons satisfying one or more criteria for a duration of time. While it is critically important to define the entry events and inclusion criteria correctly, it is equally important to pay attention to the cohort exit strategy, and ensure that you are appropriately modeling when episodes are expected to be over. In this case, kidney stone entry events were straightforward to implement, but the exit strategy used to collapse episodes had a major impact on estimates of recurrence.

How do you think about modeling recurrent events when phenotyping? What do you consider when weighing the tradeoff of ‘false positive’ events where a case is really a follow-up from a prior episode vs. the ‘false negative’ of collapsing together events that are truly independent events? What empirical diagnostics should we put in place to evaluate this tradeoff?

This is a very interesting perspective on condition eras, which we indeed usually do not think hard about. I would think that if we are considering recurrence as opposed to just incident/prevalent cases, we need to think about the treatment as well. E.g. if a patient had an episode, underwent lithotripsy and then had acute symptoms after it, we would call it a recurrent episode. Of course we still need to think about the era collapse time interval, but it may be somewhat easier since we now have a procedure as a divider. Alternatively, in this case acute episodes may be distinguished from chronic ones by using a combination of diagnosis+ER or diagnosis+US or diagnosis+acute symptoms like acute pain (wish they were represented better in the data!).

@aostropolets, how would we incorporate treatment? Are you suggesting it could be an exit right-censoring criterion? In your example, I would imagine we need to model some sort of tolerance for date misclassification (e.g. if a symptom diagnosis was seen within a short window after treatment (~3d?), it may be part of the same episode, but if it is more than some other interval, then it is clearly new). This would be neat to try, using the cohort start and end as the true duration of the episode, rather than just as markers of episode starts.
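
One way to picture that tolerance idea is the rough sketch below. It is purely hypothetical: the 3-day tolerance comes from the guess above, while the 30-day cutoff for "clearly new" and the function name are placeholders that would need clinical and empirical justification.

```python
from datetime import date

def classify_after_treatment(treatment_date, diagnosis_date,
                             tolerance_days=3, new_episode_days=30):
    """Label a diagnosis record relative to a treatment date.

    tolerance_days: window after treatment still treated as the same episode
                    (the ~3d guess from the discussion).
    new_episode_days: hypothetical cutoff beyond which the record is
                      treated as a new episode.
    """
    days = (diagnosis_date - treatment_date).days
    if days < 0:
        return "before treatment"
    if days <= tolerance_days:
        return "same episode"
    if days > new_episode_days:
        return "new episode"
    return "ambiguous"

# Made-up example: lithotripsy on 1 March, diagnoses recorded afterwards
treatment = date(2021, 3, 1)
print(classify_after_treatment(treatment, date(2021, 3, 3)))   # same episode
print(classify_after_treatment(treatment, date(2021, 3, 20)))  # ambiguous
print(classify_after_treatment(treatment, date(2021, 6, 1)))   # new episode
```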

Your comment about treatment made me wonder: while it wasn’t done in any of the papers I saw, why wouldn’t a procedure of lithotripsy be a qualifying entry event? Is there a reason for that procedure that isn’t a stone? (Thinking about the edge case where we see the procedure code but not the diagnosis).

I’d think it may just as well be!

Yes, I think so. Or more broadly speaking: let’s say you have a kidney stone passing down your urinary tract. You’re in pain, you probably have a fever and go straight to the ER/your doc (unless you’re from Eastern Europe, then you may try eating watermelon and sitting in a hot tub). It’s sort of hard to imagine that you would walk around with acute pain for 30 days… And if the stones are small and cause no discomfort, what are the chances you sought care for them? So I would argue any acute episode should be resolved within days and a chronic episode would be resolved with lithotripsy or other treatment. Now, back to your question about the follow-up encounters when the code just got copied from the previous records: maybe we can distinguish those by not observing treatment, e.g. pain killers, NSAIDs, antispasmodics (and yes, all of a sudden the phenotype becomes super-complex)?