Cohort design not based on medical record

I’m having trouble setting up an incidence rate analysis.
I’d like to calculate the incidence of a new occurrence of a disease across all eligible persons.
The outcome cohort raises no questions, but I have no idea how to specify the target cohort.

The rules are as follows: include all persons who, on some day on or after 2016-10-01, have at least 365 days of continuous enrollment.
The first day on or after 2016-10-01 on which the minimum prior enrollment criterion is satisfied should be the cohort start date, and I’d like to observe them until the end of enrollment.
Since the cohort start date is not based on any medical record, this is challenging for me.

Are there any ideas on how to design the cohort? Any help will be appreciated.

Here you go: http://www.ohdsi.org/web/atlas/#/cohortdefinition/1770358

I used the observation period criterion with the ‘specified period’ option to enforce that the person has continuous observation from 2016-10-01 through 2017-10-01 (did you mean to start the period in October?). Everyone with continuous observation for that entire year is included in the cohort as of 2016-10-01. Since no exit criteria were specified, they persist in the cohort until the end of their observation period.

Thanks, @Chris_Knoll.

I’ve just handled my issue using a specified period start, so I’d like to double-check that I did it correctly.

The main point is that a person can enter later than 2016-10-01 (yes, October). The only requirement is that they have 1 year of enrollment before entry.

So I did it as follows: http://www.ohdsi.org/web/atlas/#/cohortdefinition/1770360

Then, in the incidence rate analysis, I specified the time at risk from start_date + 365 to end_date + 1.

Does this make sense in your opinion?

Well, you can just specify that the start is 2016-10-01 to 2016-10-01 (then anyone with just that date will be found, and the cohort start date is 2016-10-01). And you need to specify both the start and end date for that criterion.

Next, 1 year of enrollment before: just specify 365d of continuous enrollment before.

That will find people who had observation on 2016-10-01 with 365d prior. Did you want to enforce some post-2016-10-01 days?

After you establish your cohort (which starts at 2016-10-01), you can use the incidence rate tool to determine a rate of outcome (but you will need to specify a cohort of people that have the outcome).

It should be clearer with some examples:
Case 1: enrollment started on 2014-01-01 and ended on 2016-10-02.
For such a person, I want the time at risk to start on 2016-10-01 and end on 2016-10-02.

Case 2: enrollment started on 2015-12-31 and ended on 2017-10-01.
Now the time at risk should start on 2016-12-30 (the very first date when enrollment reaches 365 days)
and end on 2017-10-01.

So the start of the time at risk is flexible and depends on the date when enrollment started
(it is 2016-10-01 only if enrollment started on or before 2015-10-02).

The logic I used:

  1. Filter to observations >=365 days (both use cases are included)
  2. Filter to observations ending on or after 2016-10-01 (both cases are included)
  3. For persons with enrollment starting before 2015-10-02 specify that cohort_start_date = 2016-10-01
    Else - set cohort_start_date to observation_period_start_date
    (so for case 1 cohort_start_date = 2015-10-02
    for case 2 cohort_start_date = 2015-12-31)
  4. In the IR settings, start time at risk from cohort_start_date + 365 days
    ( for case 1 this will be 2016-10-01
    for case 2 this will be 2016-12-30
    )
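The target rule from the two cases above (not the workaround steps) can be written as a minimal Python sketch. It is only an illustration of the date arithmetic, not how Atlas computes anything; the function name `time_at_risk` and the constants are assumptions introduced here.

```python
from datetime import date, timedelta

STUDY_START = date(2016, 10, 1)  # fixed calendar date from the thread
MIN_PRIOR_DAYS = 365             # required prior continuous enrollment


def time_at_risk(enroll_start, enroll_end):
    """Return (tar_start, tar_end), or None if the person never qualifies.

    tar_start is the first day on or after STUDY_START on which the person
    already has MIN_PRIOR_DAYS of enrollment behind them.
    """
    first_eligible = enroll_start + timedelta(days=MIN_PRIOR_DAYS)
    tar_start = max(STUDY_START, first_eligible)
    if tar_start > enroll_end:
        return None  # enrollment ends before the person becomes eligible
    return tar_start, enroll_end


# The two cases from the post
case_1 = time_at_risk(date(2014, 1, 1), date(2016, 10, 2))
case_2 = time_at_risk(date(2015, 12, 31), date(2017, 10, 1))
print(case_1)
print(case_2)
```

Case 1 yields 2016-10-01 through 2016-10-02 and case 2 yields 2016-12-30 through 2017-10-01, matching the dates given above.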

Btw, why do I necessarily need to specify the end date for that attribute? I looked through the generated SQL,
and as I understand it, if the end date in the ‘Specify Start and End dates’ attribute is not filled in, then
observation_period_end_date is used as the end date of the primary events.

The outcome cohort is now the only thing about which I have no doubts. :grin:

You are correct! If the end date isn’t specified, it will use the observation_period end, and if the start isn’t specified, it will use the observation_period start.

Case 1 is handled properly in the tool. You specify a start of 2016-10-01, and if the person has an observation period that starts before 2016-10-01 and ends after 2016-10-01, the person will be included in the cohort with a start date of 2016-10-01 and an end date of their observation period end date. You further require 365d of continuous observation, so you can specify ‘with 365 days prior continuous observation’; based on this rule, case 1 is included, but case 2 is not (described next).

Case 2 is something the tool doesn’t handle. Other than the ‘user specified dates’ in observation periods, the cohort start date is always based on some date retrieved from the patient record. I.e., the tool doesn’t let you evaluate a complicated mathematical expression like ‘add 365 days to the observation period start date and if it is before 2016-10-01, use 2016-10-01 else use observation period start date + 365d’. For this, I would suggest you use ‘visit events’ as a fallback if you don’t find the user-defined date with 365d prior.

You can use multiple cohort entry events to pick the user-defined start date (via observation period) OR the earliest visit occurrence after 2016-10-01. Both of these dates must have 365d continuous observation prior.

Cohort Start Dates:
Observation period with user-defined start date of 2016-10-01
OR Visit Occurrence starting after 2016-10-01
Having 365d of continuous observation
Limit to earliest event per person

Case 1 will enter on 2016-10-01. Case 2 will enter at their earliest visit after 2016-10-01 that has 365d of prior continuous observation.
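To make the entry-event logic above concrete, here is a rough Python sketch of how the two entry events combine. It is not Atlas or Circe code; `cohort_start` is a hypothetical helper, the visit dates for case 2 are invented for illustration, and treating ‘after 2016-10-01’ as strictly after is an assumption.

```python
from datetime import date, timedelta

FIXED_START = date(2016, 10, 1)  # user-defined observation period start
PRIOR_DAYS = 365                 # required prior continuous observation


def cohort_start(obs_start, obs_end, visit_dates):
    """Return the earliest qualifying entry event date, or None."""
    candidates = []
    # Entry event A: the user-defined date, if the observation period contains it.
    if obs_start <= FIXED_START <= obs_end:
        candidates.append(FIXED_START)
    # Entry event B: visits after the fixed date, within the observation period.
    candidates.extend(v for v in visit_dates if FIXED_START < v <= obs_end)
    # Either kind of event needs 365d of observation before it.
    earliest_allowed = obs_start + timedelta(days=PRIOR_DAYS)
    qualifying = [d for d in candidates if d >= earliest_allowed]
    # Limit to the earliest event per person.
    return min(qualifying) if qualifying else None


# Case 1: no visits needed, the fixed date qualifies.
start_1 = cohort_start(date(2014, 1, 1), date(2016, 10, 2), [])

# Case 2: hypothetical visit dates, made up for this example.
visits_2 = [date(2016, 11, 15), date(2017, 1, 10), date(2017, 3, 5)]
start_2 = cohort_start(date(2015, 12, 31), date(2017, 10, 1), visits_2)
print(start_1, start_2)
```

With these made-up visits, case 2 enters on 2017-01-10 rather than 2016-12-30: the visit fallback approximates the desired start date but depends on when a visit actually occurs.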

Checking your math here: if case 1 has a cohort_start_date of 2016-10-01, why would a time at risk starting at cohort_start_date + 365 days result in a value of ‘2016-10-01’? Wouldn’t you just start the time at risk at cohort_start_date + 0 days?

@Chris_Knoll Is this the “Add Study Window” you’re referring to? Does that work in similar fashion as the trimming options in Cohort Definitions (ex. Cohort Censor Window)?

Is this the default function of Incidence Rates or is it determined by the cohort exit strategy of the Target cohort?

Hi, @George_Argyriou,
No, the function I was referring to was using ‘Observation Period Criteria’ to allow a user to specify an arbitrary start/end date that will be used in the cohort entry events that will result in a user-defined cohort_start_date.

The ‘Add Study Window’ is a function in Incidence Rates that limits the people included in the calculation to those whose cohort_start_date is between the study_start and study_end (or after the study start, if only the study_start date is specified).

This is going to sound confusing, but the ‘Study Window’ does ‘right-side censoring’ and only ‘left-side filtering’, meaning: only people whose cohort_start_date is between the study_start_date and study_end_date will be included in the analysis. The Time At Risk does not start based on the study window; it is always based on the cohort_start_date. That’s why I call this ‘left-side filtering’. However, if you specify a study_end_date, it WILL use the study_end_date as the time at risk end date if the study_end_date occurs before the TAR end. This is why I call it ‘right-side censoring’.

I’m a little embarrassed to say that the decision to do it this way was a matter of simplicity: in order to do complete left censoring properly, you have to account for the cohort_start_date and TAR start adjustment, and it was simpler to just say ‘The only people included are those whose cohort_start is between the study start and study end’. The study end date is much simpler to handle for censoring: the TAR ends at the earliest of the observation period end date, the study window end date, or an outcome event date.
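A small Python sketch of the ‘left-side filtering, right-side censoring’ behavior described above. This is not the actual Incidence Rate SQL; `apply_study_window` and its parameters are hypothetical names, and treating the window boundaries as inclusive is an assumption.

```python
from datetime import date


def apply_study_window(cohort_start, tar_start, tar_end, obs_end,
                       study_start, study_end=None, outcome_date=None):
    """Return (tar_start, tar_end) after applying the window, or None if excluded."""
    # Left-side filtering: a person starting before the window is dropped,
    # not trimmed to the window start.
    if cohort_start < study_start:
        return None
    if study_end is not None and cohort_start > study_end:
        return None
    # Right-side censoring: the TAR ends at the earliest applicable date.
    end_candidates = [tar_end, obs_end]
    if study_end is not None:
        end_candidates.append(study_end)
    if outcome_date is not None and outcome_date >= tar_start:
        end_candidates.append(outcome_date)
    # TAR start is untouched: it always comes from the cohort dates.
    return tar_start, min(end_candidates)


window = dict(study_start=date(2016, 10, 1), study_end=date(2017, 6, 30))

# Starts before the window: filtered out entirely.
excluded = apply_study_window(date(2016, 6, 1), date(2016, 6, 1),
                              date(2017, 6, 1), date(2018, 1, 1), **window)

# Starts inside the window: TAR end is censored at the study end.
censored = apply_study_window(date(2016, 11, 1), date(2016, 11, 1),
                              date(2017, 11, 1), date(2018, 1, 1), **window)
print(excluded, censored)
```

The first person is excluded even though part of their time at risk overlaps the window, which is the behavior called ‘left-side filtering’ above.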

However, you can still get the result you want by taking care of the left censoring in the ‘cohort definition’ (that will ‘trim’ the cohort_start dates to a fixed start date), and handle the right censoring using a study window in the IR analysis. In the future, we could work to have this handled completely in the IR Analysis tool, but for now, the simple approach in the IR tool means if you want true ‘left-right censoring’, you will need to censor the cohort_start_dates in the cohort definition, and censor the TAR using the Incidence Rate Study Window setting.

That’s the default behavior of the Cohort Definition: if you don’t specify how a cohort exit date is calculated, by default it uses the containing observation period end date.

The incidence rate doesn’t have a ‘default function’ in this way: part of setting up the Incidence Rate is defining the Time At Risk as offsets from the cohort_start_date or cohort_end_date. These are not optional: you must specify these settings. However, if you also add a study window, it will do the ‘left-filter, right-censor’ described above.

Here are some use cases that I’ve seen:
Case 1: 1 year follow up after cohort entry:
TAR Start: cohort_start + 0d, TAR End: cohort_start + 365d

Case 2: During exposure (this assumes cohort_start/end represent the period of continuous exposure):
TAR Start: cohort_start + 0d, TAR End: cohort_end + 0d

Case 3: During Exposure with a 30d surveillance window:
TAR Start: cohort_start + 0d, TAR End: cohort_end + 30d

Case 4: 6 months after treatment ends:
TAR Start: cohort_end + 0d, TAR End: cohort_end + 183d

Hopefully you see how this all comes together, and how you have to take into account the meaning of the cohort_start / cohort_end when defining your TAR in the Incidence Rate tool.
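As a rough illustration of the four use cases above, the Python sketch below turns each anchor-plus-offset pair into concrete dates. The `TAR_SETTINGS` table, the `tar_window` helper, and the example cohort dates are assumptions made up for this example, not part of the Incidence Rate tool.

```python
from datetime import date, timedelta

# (anchor, offset in days) for TAR start and TAR end, per use case
TAR_SETTINGS = {
    "1 year follow-up": (("start", 0), ("start", 365)),
    "during exposure": (("start", 0), ("end", 0)),
    "exposure + 30d surveillance": (("start", 0), ("end", 30)),
    "6 months after treatment ends": (("end", 0), ("end", 183)),
}


def tar_bound(cohort_start, cohort_end, anchor, offset_days):
    """Resolve one TAR bound from a cohort date anchor and a day offset."""
    if anchor not in ("start", "end"):
        raise ValueError(f"unknown anchor: {anchor}")
    base = cohort_start if anchor == "start" else cohort_end
    return base + timedelta(days=offset_days)


def tar_window(cohort_start, cohort_end, setting):
    """Return (tar_start, tar_end) for one entry in TAR_SETTINGS."""
    start_rule, end_rule = TAR_SETTINGS[setting]
    return (tar_bound(cohort_start, cohort_end, *start_rule),
            tar_bound(cohort_start, cohort_end, *end_rule))


# Hypothetical exposure cohort record: 2017-01-01 through 2017-03-31
windows = {name: tar_window(date(2017, 1, 1), date(2017, 3, 31), name)
           for name in TAR_SETTINGS}
for name, (start, end) in windows.items():
    print(f"{name}: {start} to {end}")
```

The same offsets give very different windows depending on what cohort_start and cohort_end mean in the target cohort, which is why the cohort exit strategy matters when choosing TAR settings.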

-Chris

Thanks a lot for these clarifications. It makes much more sense now.

Two questions:

  1. Where can I find the SQL code behind Incidence Rates?

  2. Are all the above going to be included in the upcoming Book of OHDSI?
