On 'phenotype flavours'

I think you answered your own question. As the researcher, you decided that sensitivity errors are a bigger problem than index date misclassification errors, so you may want to use a cohort definition that has the h/o code in the entry event definition.

The problem with a history of migraine code is that all it tells us is that the condition exists; we do not know the onset date, i.e. cohort_start_date. You are getting a date sometime after the disease started. That becomes your study limitation.

Mm… I disagree. My whole motivation for “phenotype flavours” at the top of this thread is that there are use cases where the inclusion of certain codes (here the “h/o” toy example) does in fact reduce error

E.g. In a prevalence study, as your characterization demonstrates, the exclusion of H/O migraine underestimates prevalence. I do not see how this is an error, because this use case does not care about the date when you were diagnosed with migraine

And this is just an example. There are others like drug utilisation studies where you want to assess indications, or the longitudinal modelling of comorbidity or frailty

@Daniel_Prieto The motivation behind my question is that in some healthcare systems people are asked to report all their health problems, etc upon registration

This was addressed in a 2005 paper by Jim Lewis at Penn. Differences in health care delivery systems may/will affect how information is recorded. If you develop a comorbidity score that only looks back 1 year to capture diagnoses and apply it to US claims data, you get good capture, since every time you come in we bill for diabetes. If you apply the same restrictive one-year look-back to an EHR system that might have a single chronic diagnosis code recorded once on a problem list, you would miss this comorbidity. Underlying data and underlying questions do matter in the application of phenotypes.
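To make the look-back point concrete, here is a minimal Python sketch. The two patients, their dates, and the helper `has_comorbidity` are hypothetical illustrations, not taken from the paper or from any OHDSI tool.

```python
from datetime import date

def has_comorbidity(diagnosis_dates, index_date, lookback_days=None):
    """True if any diagnosis falls on or before index_date and inside the look-back window.

    lookback_days=None means "use all available history".
    """
    for dx_date in diagnosis_dates:
        if dx_date > index_date:
            continue  # ignore anything recorded after the index date
        if lookback_days is None or (index_date - dx_date).days <= lookback_days:
            return True
    return False

index_date = date(2015, 6, 1)

# Hypothetical claims-style patient: diabetes is billed at every visit.
claims_dx = [date(2013, 3, 1), date(2014, 2, 10), date(2015, 1, 20)]
# Hypothetical EHR-style patient: one problem-list entry recorded years earlier.
ehr_dx = [date(2009, 5, 4)]

claims_1y = has_comorbidity(claims_dx, index_date, lookback_days=365)
ehr_1y = has_comorbidity(ehr_dx, index_date, lookback_days=365)
ehr_all = has_comorbidity(ehr_dx, index_date)  # all-time look-back

print(claims_1y, ehr_1y, ehr_all)
```

Under these assumptions the one-year window finds the claims patient but misses the EHR patient, who is only picked up with an all-time look-back.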

So, I am thinking in terms of how OHDSI defines an instantiated cohort table. It would look something like this:

cohort_definition_id  subject_id  cohort_start_date  cohort_end_date
1                     1           1/1/2015           9/15/2016
1                     2           2/15/2015          8/15/2015
1                     3           2/28/2015          10/15/2016
1                     4           3/15/2015          9/16/2015

Now we have to use this for calculating prevalence or incidence. If we are calculating incidence, we take cohort_start_date; if we are doing prevalence, we can take any date.

So, using this definition of prevalence (NIMH, “What is Prevalence?”):

  • Prevalence may be reported as a percentage (5%, or 5 people out of 100), or as the number of cases per 10,000 or 100,000 people. The way prevalence is reported depends on how common the characteristic is in the population.
  • There are several ways to measure and report prevalence depending on the timeframe of the estimate.
    • Point prevalence is the proportion of a population that has the characteristic at a specific point in time.
    • Period prevalence is the proportion of a population that has the characteristic at any point during a given time period of interest. “Past 12 months” is a commonly used period.
    • Lifetime prevalence is the proportion of a population who, at some point in life has ever had the characteristic.

If we calculate period prevalence and the period is the 12 months of calendar year 2015, all of the persons above would be eligible for the numerator. If we wanted 2014, none of them would be eligible.

If we have index date misclassification, there would be no problem for the 2015 calendar year. We will have a problem for the 2014 calendar year.

If a person enters the cohort because they had ‘history of migraine’ on 1/1/2015, their cohort_start_date is still 1/1/2015. So they do not contribute to the 2014 computation, even though semantically we don’t know when they truly started to have migraine; all we know on 1/1/2015 is that they have a history of migraine. So we have a sensitivity error for 2014.
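The 2015 vs 2014 reasoning can be sketched in a few lines of Python. This is only an illustration on the four toy rows from the table above; the function name `period_prevalence_numerator` is made up, and the denominator is left out because the thread does not define one.

```python
from datetime import date

# The four toy rows from the cohort table above:
# (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date)
cohort = [
    (1, 1, date(2015, 1, 1), date(2016, 9, 15)),
    (1, 2, date(2015, 2, 15), date(2015, 8, 15)),
    (1, 3, date(2015, 2, 28), date(2016, 10, 15)),
    (1, 4, date(2015, 3, 15), date(2015, 9, 16)),
]

def period_prevalence_numerator(rows, period_start, period_end):
    """Subjects whose cohort era overlaps the period [period_start, period_end]."""
    return sorted({
        subject_id
        for _, subject_id, start, end in rows
        if start <= period_end and end >= period_start
    })

numerator_2015 = period_prevalence_numerator(cohort, date(2015, 1, 1), date(2015, 12, 31))
numerator_2014 = period_prevalence_numerator(cohort, date(2014, 1, 1), date(2014, 12, 31))

print(numerator_2015)  # all four subjects overlap 2015
print(numerator_2014)  # nobody overlaps 2014
```

A person who entered on 1/1/2015 through a history code but truly had migraine in 2014 is invisible to the 2014 calculation, which is the sensitivity error described above.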

So maybe I don’t understand how the use case does not care about dates.

My argument above is based on using the construct of a cohort/phenotype. Maybe you are talking about a different construct, i.e. not phenotyping?

That yet again reminded me… do you guys still distinguish between target/outcome/feature cohorts? For example, if the migraine cohort is a feature cohort that will be used for, say, Table 1, wouldn’t we include h/o migraine, since we are tolerant of index event misspecification and care more about sensitivity?

A side note: if you’re still working on that big phenotype paper, it would be nice to include all of the above in it (nice figures, yesss!). And if any help is needed, I’d love to contribute

The way I think about it in your specific example is that I would exclude the patients with a prior outcome at the cohort or analysis level. Then E11.2 would go into the outcome list, and everybody with those codes prior to the index date would be excluded. Now, if that’s the only diabetes-related code a person has, I’d explore it more, as this patient is suspicious to begin with.

In other words, a potential causal relationship should be handled either at the analysis stage or manually (e.g. full-blown chart adjudication). That’s my take, which may differ from the party line :)

Fun discussion so far.

No - I don’t think that was a valid idea.

Given a target (clinical description of migraine), we develop one or more cohort definitions (like a case definition, but at the population level). Each of these cohort definitions is a model with different errors, which we currently describe mostly in terms of sensitivity, specificity, and date misclassification. The cohort definition tries to give us a person’s best cohort_start_date to cohort_end_date - if it is off, it’s an error.

If you are tolerant of index date misclassification, use the H/O code. If you are not tolerant and don’t use the H/O code, then you have sensitivity errors.

you are always welcome.

I think the usage of exposure vs outcome vs feature cohort came about when there were discussions that certain types of errors are OK for exposure cohorts but not OK for outcome cohorts, e.g. I have heard arguments that sensitivity errors are less problematic for exposure cohorts but more problematic for outcome cohorts.

Does anyone know if these ideas have been clearly articulated and, hopefully, empirically tested? If not, then promoting those ideas (i.e. feature cohort vs outcome cohort vs exposure cohort) may not be valid.

These clearly warrant different phenotypes: those where we care about the specificity (yeah, I used the word to be confusing!) of the index date and those where we do not. If I need a phenotype to define a baseline condition, either for exclusion (or even establishing inclusion/exclusion criteria for trial screening) or for confounding adjustment, I don’t care when you had it prior to some index exposure date of interest. As soon as my data knows this condition exists, that’s good enough for me (US data sources for chronic conditions will endlessly diagnose me with this condition in annual 365d increments; UK/NHS/EMR systems might record it once and move on). If entering this diseased cohort is my exposure of interest, then it is my index date and I’m quite sensitive (again, yeah, I chose the word to be confusing!) to the index date. The same goes for an outcome, where I really do need to know the date on which you officially meet the phenotype.

When I change insurance carriers at open enrollment my first phone call is to member services to record that my appendix has been removed so that any observational study conducted within the resource can appropriately remove me from the at risk population for appendicitis which is why you’ll see a Z90.89 “Acquired absence of other organs” with a free text comment ‘appendix’. I’ll probably instruct them to record it as a telehealth visit to make it look like I took out my appendix over the phone.

Short of complete data linkage from birth to Medicare and death, I think there is value in the nuggets of information left by H/O codes, or by what appear to be train-wreck patients in their first six months within a GP office but are really just real-world data collecting a patient history.

I have heard it well-articulated from You-Know-Who. Tested? No. We have a problem with phenotype evaluation, as you know :) Should do more stuff with Joel.


It could be a fun project to look at the h/o codes in the systems with better coverage vs fragmented care.

Just making sure these ideas are linked.

In your example, the ‘history of’ code would be useful for December 2010 prevalence but bad for incidence. For incidence we need a precise start date. For prevalence we need to know whether the condition started prior to the date of interest and whether the person still has the phenotype.


This is exactly my point, @aostropolets

The only piece of data on this thread (characterization proposed above) shows that by dismissing ‘H/O’ codes you will be potentially underestimating prevalence of a feature (by not counting thousands of people with a history of disease) as well as overestimating incidence of a target (by counting as new events that were already known/prevalent) in many data sources (European primary care data), whilst not in others.
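The incidence side of this argument can be illustrated with a small Python sketch. The patient, the dates, and the helper `first_dx_counts_as_incident` are hypothetical; this is not how any particular OHDSI tool decides incidence, only one way to show the effect.

```python
from datetime import date

def first_dx_counts_as_incident(records, use_history_codes):
    """records: list of (code_kind, record_date), code_kind in {"dx", "h/o"}.

    Returns True if the first "dx" record would be counted as a new event.
    """
    dx_dates = [d for kind, d in records if kind == "dx"]
    if not dx_dates:
        return False
    first_dx = min(dx_dates)
    if use_history_codes:
        # A history code on or before the first diagnosis marks the condition
        # as already known, so the diagnosis is not a new event.
        if any(kind == "h/o" and d <= first_dx for kind, d in records):
            return False
    return True

# Hypothetical primary-care patient: history recorded at registration,
# then a regular diagnosis code at a later visit.
patient = [("h/o", date(2012, 1, 10)), ("dx", date(2014, 5, 2))]

incident_if_ignored = first_dx_counts_as_incident(patient, use_history_codes=False)
incident_if_used = first_dx_counts_as_incident(patient, use_history_codes=True)

print(incident_if_ignored, incident_if_used)
```

Ignoring the history code turns a known, prevalent case into an apparent new event in 2014; using it removes that false incident case.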

Of course I care about dates, that is precisely why I want to use ‘H/O’ codes differently depending on the use of a cohort/phenotype as a ‘feature’ vs ‘outcome’. Your solution to simply not use these codes results in the dismissal of thousands of cases, which leads to systematic bias due to incompleteness of prevalent and overestimation of incident events.

This has nothing to do with how OHDSI constructs a cohort/phenotype. Let us not forget that people have been creating cohorts from RWD for a while, also outside the OHDSI community. And this ‘issue’ is well known to many of us who use European primary care data for research, both in and beyond OHDSI.

So based on published data (thanks @Kevin_Haynes for the reference), internal data, and the characterisation above, it is clear that having phenotype flavours is useful to minimise incompleteness if you care about having a fuller picture of prevalence and/or incidence. Of course, this does not apply to all study types, but it does apply to a lot of the work we are currently doing in my team.

The example of estimating prevalence with ‘H/O’ codes is just one use case, as I said in my first post. Other ‘flavours’ of phenotypes will be needed if one is to reuse these phenotypes for future studies. Target, outcome, and feature flavours are a good starting point.

thank you all!

No, I did not say dismiss it. I am saying that when we use it we will not know the incident date precisely. It causes an error.

If your use case is prevalence and the study period is after the date the code was used - then it’s fine. If your use case is prevalence before that then we have a problem of understanding prevalence.

Example: if the ‘history of’ code was found on April 1st 2000, does that mean the person had prevalent disease in February 2000? How about a prevalence calculation for March 1999?

In reality the person may have had migraine since 1995; then they changed systems, and the new system captured the history code on April 1st 2000.

That’s all it is. If the ‘history of’ codes are found to be recorded AFTER the period you calculated prevalence for, then you have a source of error.
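Here is the April 2000 example as a minimal Python sketch. The exact days are assumptions (the post only gives months and the year 1995), the cohort is treated as open-ended, and `counted_as_prevalent` is a made-up helper.

```python
from datetime import date

# Assumed exact dates; the example above only gives months and the year 1995.
true_onset = date(1995, 1, 1)       # unobserved in the new system
recorded_start = date(2000, 4, 1)   # h/o code captured -> cohort_start_date

def counted_as_prevalent(cohort_start, on_date):
    """A person counts toward point prevalence only from cohort_start onwards."""
    return cohort_start <= on_date

check_dates = [date(1999, 3, 1), date(2000, 2, 1), date(2000, 6, 1)]

# Dates on which the person truly had the condition but the cohort does not count them.
missed = [
    d for d in check_dates
    if counted_as_prevalent(true_onset, d) and not counted_as_prevalent(recorded_start, d)
]

print(missed)
```

The person is missed for March 1999 and February 2000 and only counted from April 2000 onwards, which is the source of error for any prevalence period before the code was recorded.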

OK, now I am confused with this:

Because if I scroll up …

I think this reads to me like “do not use”. so yes, you said to dismiss it

Now, on this other topic:

I cannot think of any scenario where I would use any code (h/o or otherwise) recorded AFTER the period for which we calculate prevalence. That would no longer be a prevalence. That is true for any and all codes, irrespective of their ‘h/o’ nature. So I am not worried, because this is not a problem for prevalence analyses if the calculation is done correctly, i.e. using data from on or before the date for which the prevalence is obtained.

Talking about error: what about the underestimation of prevalence and potential overestimation of incidence observed when ‘dismissing’ the “H/O” code for migraine, as seen in the characterisation discussed above? This DOES worry me.

So - I am still thinking that we should not use the H/O code in the entry event criteria because it causes an imprecise index date (cohort start date). In most data sources, like you said, these codes are practically non-existent, so they did not matter. You are bringing up a scenario where the codes are used in sufficient numbers that it matters. So we have to decide how to handle them. If we use them, we get an imprecise start date; if we don’t use them, we have lower sensitivity.

Errors: there are errors in everything we do. The science is to identify these errors and reduce them as much as we can, picking the error that causes the least bias.

To clarify further - a clinical idea (e.g. migraine) may have more than one cohort definition, i.e. we can model the target (clinical description) in many valid ways. Each model may have different performance characteristics (I am using the word error to describe performance characteristics) such as sensitivity, specificity, and index date misclassification. I think we can define these errors/performance characteristics during phenotyping, independent of use case.

E.g. a cohort with the H/O migraine code as an entry event criterion may have a higher person count but a poor index date, while a cohort with the H/O migraine code excluded (in an inclusion rule) may have a lower person count (lower sensitivity) but a better index date. It’s a trade-off.
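A toy Python sketch of this trade-off follows. Everything in it is hypothetical: the three patients, the "gold standard" onset dates, and the helpers `build_cohort` and `performance`. Real phenotype evaluation rarely has true onset dates available, so this only shows the shape of the comparison.

```python
from datetime import date

# Hypothetical gold standard: subject_id -> true onset date.
truth = {1: date(2010, 1, 1), 2: date(2005, 6, 1), 3: date(2012, 3, 1)}

# Hypothetical records: subject_id -> list of (code_kind, record_date).
records = {
    1: [("dx", date(2010, 1, 1))],
    2: [("h/o", date(2015, 1, 1))],   # only a history code, years after onset
    3: [("dx", date(2012, 3, 11))],
}

def build_cohort(records, entry_kinds):
    """cohort_start_date = earliest record whose code kind is an allowed entry event."""
    cohort = {}
    for subject_id, recs in records.items():
        dates = [d for kind, d in recs if kind in entry_kinds]
        if dates:
            cohort[subject_id] = min(dates)
    return cohort

def performance(cohort, truth):
    """Return (sensitivity, mean absolute index date error in days)."""
    found = [sid for sid in truth if sid in cohort]
    sensitivity = len(found) / len(truth)
    errors = [abs((cohort[sid] - truth[sid]).days) for sid in found]
    mean_error = sum(errors) / len(errors) if errors else None
    return sensitivity, mean_error

sens_with, err_with = performance(build_cohort(records, {"dx", "h/o"}), truth)
sens_without, err_without = performance(build_cohort(records, {"dx"}), truth)

print(sens_with, err_with)
print(sens_without, err_without)
```

In this toy data the definition with the H/O code finds everyone but has a much larger average index date error, while the definition without it has a tighter index date and misses one person.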

If the ‘history of’ code is not present in a data source, then the two cohorts would probably have the same performance characteristics in that data source, and this code makes no difference. But if the code is present in large numbers, then it has an impact (as in @Daniel_Prieto’s data source).

So if we are doing a network study on 10 data sources, and in 9 of them the code is practically not present, and in the 10th it is present AND sensitivity error is the error we most want to reduce, then it is reasonable to use the cohort definition with the H/O code.


  1. We can have more than one cohort definition per clinical idea (we can call them phenotype flavors).
  2. But we need to describe the errors/performance characteristics over a network of data sources during phenotyping.
  3. Then, during the study, we pick the definition whose errors will have the least negative impact on our study (i.e. least bias).
  4. I don’t like the terms ‘prevalent cohort’, ‘incident cohort’, ‘exposure cohort’, ‘outcome cohort’, ‘feature cohort’, because they are more ambiguous than talking in terms of errors. These words are being used, but not everyone has a clear shared understanding of what we mean by them.