
Cohort definition for general population

Hello everyone!

I was wondering what the best way is to construct a general population cohort in the data after a certain year (e.g., the number of adults in the database after 2010). I am looking into this in order to get prevalence estimates. I have tried defining the cohort based on observation period (since I understand that this is supposed to specify entry into the database) and based on a condition occurrence or observation of any concept; however, these led to very different results. Please also keep in mind that I will later add inclusion criteria using the disease-specific concept sets.

Related to that, I am also wondering why defining the age in the inclusion criteria versus in the restriction of initial events leads to different results when I limit the initial events to the earliest event. I guess there is something I do not understand conceptually about these criteria, so any further insight would be appreciated.

What is the provenance of the data you’re using (e.g., EHR, claims)? The fact that you get different estimates makes total sense, since there are many people who made it into the database but don’t actively use healthcare services. If you have claims data, you can generally trust your observation period and just use it as your inclusion criterion. For EHR data you would probably also use observation period, although ensuring that you have sufficient coverage in the EHR (i.e., that the people you include come only to your institution, so that your prevalence estimates are accurate) is a separate, hard question. Also, if you don’t care about specific years, you can simply use the Data Sources tab in Atlas to get your prevalence estimates.
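To make the difference between the two denominators concrete, here is a minimal Python sketch on made-up rows. The field names loosely follow the OMOP CDM tables (`person`, `observation_period`, `condition_occurrence`), but the data, the helper names, and the simplification of computing age from year of birth at the index date are all assumptions for illustration. This is not how Atlas generates cohorts.

```python
from datetime import date

# Toy rows loosely mirroring OMOP CDM tables; every value here is made up.
person = {1: 1970, 2: 1985, 3: 2005, 4: 1960}  # person_id -> year_of_birth
observation_period = [  # (person_id, period start, period end)
    (1, date(2008, 1, 1), date(2015, 12, 31)),
    (2, date(2011, 6, 1), date(2013, 5, 31)),
    (3, date(2010, 1, 1), date(2016, 12, 31)),
    (4, date(2005, 1, 1), date(2009, 12, 31)),
]
condition_occurrence = [  # (person_id, condition_start_date)
    (1, date(2012, 3, 14)),
    (3, date(2011, 7, 2)),
    (4, date(2007, 9, 9)),
]

CUTOFF = date(2010, 1, 1)  # "after 2010" from the question
ADULT_AGE = 18


def is_adult_on(person_id, on_date):
    """Approximate age from year_of_birth only (assumption for brevity)."""
    return on_date.year - person[person_id] >= ADULT_AGE


def observed_adults():
    """Adults with any observation time on or after the cutoff."""
    result = set()
    for pid, start, end in observation_period:
        if end < CUTOFF:
            continue  # the whole period lies before the cutoff
        index_date = max(start, CUTOFF)  # first observed day in the window
        if is_adult_on(pid, index_date):
            result.add(pid)
    return result


def utilizing_adults():
    """Adults with at least one recorded condition on or after the cutoff."""
    return {
        pid
        for pid, start in condition_occurrence
        if start >= CUTOFF and is_adult_on(pid, start)
    }


print("observation-period denominator:", sorted(observed_adults()))
print("condition-based denominator:", sorted(utilizing_adults()))
print("observed but no recorded condition:",
      sorted(observed_adults() - utilizing_adults()))
```

In this toy data person 2 is under observation after the cutoff but never has a condition recorded, so they count toward the observation-period denominator and not toward the condition-based one. That is the kind of person who makes the two estimates diverge.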


Thank you very much for your comprehensive reply. I was indeed looking for an answer for both EHR and claims data, and this was very helpful for understanding what the numbers really correspond to. Regarding my other question: when setting age limits for the population, do you recommend using the inclusion criteria approach or the restriction of initial events?
