Cohort table questions

MPhilofsky · April 4, 2023, 9:36pm

We are using the Cohort table to identify cohorts derived from our source + OMOP data. We will mostly be using these data for feasibility studies, e.g., of the persons with Condition X, how many are part of Cohort Y?
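A minimal sketch of that kind of feasibility count, using Python's built-in sqlite3 module and toy data. The concept id 1001 and the cohort ids 7 and 8 are made up for illustration, subject_id is assumed to hold a person_id, and the query ignores timing (it does not check that the condition occurred during the cohort period).

```python
import sqlite3

# Toy in-memory tables; only the columns needed for the count are included.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);
CREATE TABLE cohort (
    cohort_definition_id INTEGER,
    subject_id INTEGER,
    cohort_start_date TEXT,
    cohort_end_date TEXT
);
INSERT INTO condition_occurrence VALUES (1, 1001), (1, 1001), (2, 1001), (3, 2002);
INSERT INTO cohort VALUES
    (7, 1, '2020-01-01', '2020-06-30'),
    (7, 3, '2020-02-01', '2020-03-31'),
    (8, 2, '2021-01-01', '2021-12-31');
""")


def count_condition_in_cohort(conn, condition_concept_id, cohort_definition_id):
    """Count distinct persons who have the condition and appear in the cohort."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT co.person_id)
        FROM condition_occurrence AS co
        JOIN cohort AS c ON c.subject_id = co.person_id
        WHERE co.condition_concept_id = ?
          AND c.cohort_definition_id = ?
        """,
        (condition_concept_id, cohort_definition_id),
    ).fetchone()
    return row[0]


# Denominator: everyone with the (hypothetical) condition concept 1001.
persons_with_condition = conn.execute(
    "SELECT COUNT(DISTINCT person_id) FROM condition_occurrence"
    " WHERE condition_concept_id = ?",
    (1001,),
).fetchone()[0]

# Numerator: those persons who also have a record in (hypothetical) cohort 7.
in_cohort = count_condition_in_cohort(conn, 1001, 7)
print(f"{in_cohort} of {persons_with_condition} persons with the condition are in cohort 7")
```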

The Cohort table does not have a primary key. Shouldn’t the table have a PK? The Cohort Definition table has a PK, as do the clinical event tables and the derived tables.

Also, I found this blurb in the conventions, and it doesn’t make sense to me: “A subject can only have one record in the cohort table for any moment of time, i.e. it is not possible for a person to contain multiple records indicating cohort membership that are overlapping in time”. A person can have overlapping cohort records when they are a member of more than one cohort. Does the conventions statement need to be reworded? Or am I misinterpreting it?

Chris_Knoll · April 5, 2023, 1:56am

A primary key isn’t really necessary: the records here don’t have anything uniquely identifying that you’d use as a foreign key in some other context. If there were a primary key, it would be cohort_definition_id, subject_id and cohort_start_date.
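If you want to confirm that this combination really is unique in your own data, a rough check could look like the sketch below. The helper name duplicate_keys and the sample rows are hypothetical, and the start date is written with its CDM column name, cohort_start_date.

```python
from collections import Counter


def duplicate_keys(rows):
    """Return candidate-key tuples that occur more than once."""
    counts = Counter(
        (r["cohort_definition_id"], r["subject_id"], r["cohort_start_date"])
        for r in rows
    )
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up cohort records; dates are ISO strings.
rows = [
    {"cohort_definition_id": 1, "subject_id": 100, "cohort_start_date": "2020-01-01"},
    {"cohort_definition_id": 1, "subject_id": 100, "cohort_start_date": "2020-07-01"},
    {"cohort_definition_id": 2, "subject_id": 100, "cohort_start_date": "2020-01-01"},
    {"cohort_definition_id": 1, "subject_id": 100, "cohort_start_date": "2020-01-01"},
]

# The first and last rows share the same candidate key, so that key is reported.
print(duplicate_keys(rows))
```

An empty list means the three columns would work as a composite key for the records checked.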

The language should be cleaned up to be more precise: a subject can only have one record in the cohort table that represents a period of continuous time for a given cohort_definition_id, such that this period of time cannot overlap with any other record for the person in the same cohort. It’s a complicated context to describe in words, but the reason for the rule is that when measuring follow-up time for a person using a cohort, you do not want to double-count time for that person, so it’s important that no records for a person (within a cohort) overlap.
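To make the rule concrete, here is a small sketch that flags records violating it. The function name find_overlaps and the sample rows are hypothetical, and it assumes start and end dates are inclusive, so two records that share a single day count as overlapping.

```python
from collections import defaultdict
from datetime import date


def find_overlaps(rows):
    """Return records that overlap an earlier record of the same subject and cohort.

    Each row is (cohort_definition_id, subject_id, start_date, end_date).
    """
    groups = defaultdict(list)
    for cohort_definition_id, subject_id, start, end in rows:
        groups[(cohort_definition_id, subject_id)].append((start, end))

    overlaps = []
    for key, periods in sorted(groups.items()):
        periods.sort()
        # Track the latest end date seen so far, so a long early period
        # is still compared against every later record.
        latest_end = periods[0][1]
        for start, end in periods[1:]:
            if start <= latest_end:  # inclusive dates: a shared day is an overlap
                overlaps.append((key, start, end))
            latest_end = max(latest_end, end)
    return overlaps


rows = [
    (1, 100, date(2020, 1, 1), date(2020, 3, 31)),
    (1, 100, date(2020, 3, 15), date(2020, 6, 30)),  # overlaps the row above
    (2, 100, date(2020, 2, 1), date(2020, 4, 30)),   # other cohort: allowed
    (1, 200, date(2020, 1, 1), date(2020, 1, 31)),
    (1, 200, date(2020, 2, 1), date(2020, 2, 28)),   # back to back, no overlap
]

print(find_overlaps(rows))
```

Only the second record for subject 100 in cohort 1 is flagged. The same subject's record in cohort 2 covers some of the same dates but is fine, because the rule applies per cohort definition.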

MPhilofsky · April 6, 2023, 5:12pm

Right, the Cohort table doesn’t have a PK-FK relationship with any other table, so it’s not necessary.

I like your suggested wording and explanation. I’ve tagged it with ‘Themis’ for review by the Themis WG before adding to the CDM conventions.