We are using the Cohort table to identify cohorts derived from our source + OMOP data. We will mostly be using these data for feasibility studies, for example: of the persons with Condition X, how many are part of Cohort Y?
The Cohort table does not have a primary key. Shouldn’t the table have a PK? The Cohort Definition table has a PK, as do the clinical event tables and the derived tables.
Also, I found this statement in the conventions, and it doesn’t make sense to me: “A subject can only have one record in the cohort table for any moment of time, i.e. it is not possible for a person to contain multiple records indicating cohort membership that are overlapping in time”. A person can have overlapping cohort records when they are a member of more than one cohort. Does the conventions statement need to be reworded? Or am I misinterpreting it?
A primary key isn’t really necessary here: the records don’t have anything uniquely identifying that you’d use as a foreign key in some other context. If there were a primary key, it would be the combination of cohort_definition_id, subject_id and cohort_start_date.
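To make the composite key concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table definition is illustrative only and is an assumption on my part: the CDM as described above defines no primary key on this table, and the helper name try_insert is hypothetical.

```python
import sqlite3

# Illustrative schema only: the primary key below is the hypothetical
# composite key discussed above, not part of the published table definition.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cohort (
        cohort_definition_id INTEGER NOT NULL,
        subject_id           INTEGER NOT NULL,
        cohort_start_date    TEXT    NOT NULL,
        cohort_end_date      TEXT    NOT NULL,
        PRIMARY KEY (cohort_definition_id, subject_id, cohort_start_date)
    )
""")

def try_insert(row):
    """Insert one cohort row; return False if the composite key rejects it."""
    try:
        conn.execute("INSERT INTO cohort VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Same person, two different cohorts, same dates: both rows are accepted.
first = try_insert((1, 100, "2020-01-01", "2020-06-30"))
other_cohort = try_insert((2, 100, "2020-01-01", "2020-06-30"))

# Same person, same cohort, same start date: rejected by the composite key.
duplicate = try_insert((1, 100, "2020-01-01", "2020-12-31"))
```

Note that a key like this only blocks two records with the same start date. It would not stop two records in the same cohort from overlapping if they start on different days.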
The language should be cleaned up to be more precise: for a given cohort definition id, each record for a subject represents a period of continuous time, and that period cannot overlap with any other record for the same person in the same cohort. It’s a complicated context to describe in words, but the reason for the rule is that when measuring follow-up time for a person using a cohort, you do not want to double-count time for a person, so it’s important that no records for a person (within a cohort) overlap.
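As a rough illustration of the rule, the sketch below flags any (cohort_definition_id, subject_id) pair that has overlapping periods. The function name find_overlaps and the sample rows are made up for the example, and it assumes end dates are inclusive, so a record starting on the day another one ends counts as an overlap.

```python
from collections import defaultdict
from datetime import date

def find_overlaps(rows):
    """Return sorted (cohort_definition_id, subject_id) pairs whose
    periods overlap within the same cohort."""
    groups = defaultdict(list)
    for cohort_id, subject_id, start, end in rows:
        groups[(cohort_id, subject_id)].append((start, end))

    overlapping = []
    for key, periods in groups.items():
        periods.sort()  # order by start date, then end date
        latest_end = periods[0][1]
        for start, end in periods[1:]:
            if start <= latest_end:  # assumption: end dates are inclusive
                overlapping.append(key)
                break
            latest_end = max(latest_end, end)
    return sorted(overlapping)

# Hypothetical sample rows: (cohort_definition_id, subject_id, start, end)
rows = [
    # Person 100 has two overlapping records in cohort 1: violates the rule.
    (1, 100, date(2020, 1, 1), date(2020, 6, 30)),
    (1, 100, date(2020, 3, 1), date(2020, 9, 30)),
    # Person 101 overlaps across cohorts 1 and 2: allowed.
    (1, 101, date(2020, 1, 1), date(2020, 3, 31)),
    (2, 101, date(2020, 2, 1), date(2020, 5, 31)),
    # Person 100 has two separate periods in cohort 2: allowed.
    (2, 100, date(2020, 1, 1), date(2020, 2, 1)),
    (2, 100, date(2020, 3, 1), date(2020, 4, 1)),
]

violations = find_overlaps(rows)
```

Only the pair (1, 100) is flagged. Membership in more than one cohort at the same time, as for person 101, is not a violation, because the check is done per cohort definition.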