Phenotype Phebruary Day 14 - Hypertension (emphasis on clinical description)

Using the concept set expression, I like to build cohorts that I expect to have varying sensitivity and specificity for the phenotype:

| Id | Name | Estimated sensitivity | Estimated specificity |
|---|---|---|---|
| 135 | [Phenotype Phebruary][HTN] Essential Hypertension exclude secondary causes | *** | *** |
| 136 | [Phenotype Phebruary][HTN] Essential Hypertension no exclusions | **** | ** |
| 138 | [Phenotype Phebruary][HTN] Hypertensive disorder not limited to essential | ***** | * |

Now I am running Cohort Diagnostics and will come back in a few hours to review the results.

May I suggest that we incorporate the population-level scope of observational research? I totally agree that the anchor point for phenotype development is the clinical definition, but clinical definitions are meant to identify individual patients during one or a series of related healthcare encounters. For population ascertainment for research purposes, the definition must be based on the clinical presentation while considering how it is operationalized at the population level. For example, asthma is defined in a doctor’s office by means of anamnesis, family history, occupational history, bronchial challenge tests, lung function parameters, etc. One of the most impactful findings of the historic ECRHS cohort studies was that there were many undiagnosed persons with asthma in Europe (subclinical presentations, lack of awareness, etc.). That finding was possible because ECRHS developed and validated simple “phenotypes” based on symptoms that allowed higher sensitivity at the population level. Those “phenotypes” were hardly transferable back to GP offices, but the message of increasing asthma awareness was. This difference between how a clinical definition is (or needs to be) implemented in a population study and in a doctor’s office applies to many other conditions, starting with kidney diseases such as AKI or CKD.

@david_vizcaya I think it is reasonable to build phenotype definitions that use signs/symptoms. In the case of this phenotype, hypertension, people most commonly should NOT have signs/symptoms that are directly attributable to hypertension. For example, a hypertensive crisis may present with headache or blurry vision, but the presence of a hypertensive crisis would exclude persons from the phenotype (in this case).

The counts decreased as more restrictions were applied. That makes sense and is in line with prior expectations that sensitivity would go down and specificity may go up.

But the relative impact is pretty small

Most diagnostics appear to show that the three cohorts are identical. This is because the vast majority of the people in the cohorts are the same (i.e. almost 98% are the same; this was expected).
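
One way to put a number on how much two cohorts overlap is sketched below in Python. It assumes each cohort is available as a collection of person ids and uses the Jaccard index (persons in both cohorts divided by persons in either cohort). The function name and the ids are made up for illustration, and Cohort Diagnostics may compute overlap differently.

```python
def overlap_fraction(cohort_a, cohort_b):
    """Share of persons in either cohort who appear in both (Jaccard index)."""
    a, b = set(cohort_a), set(cohort_b)
    union = a | b
    if not union:
        return 0.0  # two empty cohorts: define the overlap as zero
    return len(a & b) / len(union)


# Hypothetical person ids, for illustration only.
strict_cohort = {1, 2, 3, 4}
broad_cohort = {1, 2, 3, 4, 5}
overlap = overlap_fraction(strict_cohort, broad_cohort)
```

A value close to 1 means the two definitions select almost the same people, which is why most diagnostics look alike.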

Incidence rate diagnostic

Compare Cohort Characterization

But - when the reason why we know our specificity has gone up - is because when we compare the

We observe the cohorts to be very different. In fact the vast majority of people we are excluding had drug treatments at baseline

We see differences in temporal characterization

The point here is –

The people who we excluded ARE DIFFERENT - and based on the clinical description are NOT the people we want to study in our cohort.

By removing those people from our cohort definition, and verifying that the people we removed differ from what our clinical description says, we are improving our specificity.

At the same time - the number of people we lost (reduction in sensitivity) is very low - because the counts are pretty much the same even after applying the exclusion.

So now we know that - [Phenotype Phebruary][HTN] Essential Hypertension exclude secondary causes(135) - has better specificity to our clinical description compared to the other cohort definition without loss of sensitivity.
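
The comparison between excluded and retained persons is often summarized with a standardized difference per covariate. Below is a minimal sketch for a binary covariate (for example, drug treatment at baseline, yes/no), assuming we already have the proportion with the covariate in each group. The function name is hypothetical and the code is not taken from Cohort Diagnostics.

```python
import math


def std_diff_proportion(p1, p2):
    """Standardized difference between two proportions (binary covariate)."""
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1: no variance to scale by.
        return 0.0 if p1 == p2 else math.copysign(math.inf, p1 - p2)
    return (p1 - p2) / math.sqrt(pooled_var)
```

The further the value is from zero, the more the two groups differ on that covariate; values near zero mean the covariate does not separate them.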

More evidence of the difference

This phenotype is not complete - we haven't addressed

  1. Index date misclassification
  2. Other indicators of specificity - e.g. recent chest pain, ER utilization, drugs for hypertension etc. that could improve our specificity further.

This is an opportunity for us to collaborate and further refine the hypertension phenotype

This new cohort (id 141) is designed to approximate persons who are newly diagnosed but who, starting around 3 months AFTER the initial diagnosis, are never observed to have care for hypertension at any time in the future (i.e. during the observation period). Note: I made a mistake and set the time period to start at 120 days instead of 90 days - I will fix that in a future version (but I think we can continue to make the point here).
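
The logic of this cohort can be sketched roughly as follows. This is a simplified illustration in Python, not the actual cohort definition: it assumes we have the dates of all hypertension-related records for one person, treats the earliest one as the initial diagnosis, and ignores observation period requirements. The function name and the `gap_days` parameter are hypothetical.

```python
from datetime import date, timedelta


def no_followup_care(htn_event_dates, gap_days=90):
    """True if no hypertension record falls on or after index date + gap_days."""
    if not htn_event_dates:
        return False  # never diagnosed, so not part of this cohort
    index_date = min(htn_event_dates)
    cutoff = index_date + timedelta(days=gap_days)
    return all(d < cutoff for d in htn_event_dates)


# Hypothetical person: diagnosed in January, one more visit in February.
dropped_out = no_followup_care([date(2020, 1, 1), date(2020, 2, 1)])
```

Setting `gap_days=120` would correspond to the version that was run by mistake.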

As we can see in the counts - we have a significant number of people who meet this definition!

And we see differences in the future time windows - using compare temporal characterization

So the two cohorts are different

So, to my point earlier, I think we can learn from the systematic differences and develop an evidence-based/empirical approach to defining the cohort end date.

I have run out of steam for hypertension - see you in Acute Myocardial Infarction phenotype

The main point of this whole thread is the importance of a clinical description. By ensuring we have a shared (written) understanding, we created a target against which to develop the phenotype. We could then review the characteristics of the population (using Cohort Diagnostics) to evaluate our phenotype.

In the absence of such clinical description, written upfront, i would argue that we would have a moving target and would definitely not be able to evaluate any phenotype and adjudicate if the cohort identified honestly represents the clinical description.

Hi Gowtham, my initial reaction is that the structure in the Minimum content is very very helpful as well as the differential diagnosis. However the other sections in the Additional content might be already included in the other sections:

Patient factors that are NOT expected to occur with the phenotype of interest:
If I understand correctly, this is trying to capture non-clinical rule-out criteria, and is thus similar to the differential diagnoses; maybe it could be worded “Patient factors that are expected to NOT occur”. But it’s very difficult to define something by saying what it is not; it’s a never-ending list. There might be things that are very prominent, like gender in prostate cancer, but these will be captured in the exclusion criteria. We should not try to capture in the clinical definition all the information that’s needed for a cohort definition.

And I agree with @david_vizcaya's post that we should incorporate the operational definition of the disease at the population level in the clinical definition, and thus not focus only on the individual case. Epidemiologists have done a lot to transfer clinical knowledge into definitions of diseases/conditions at the population level; when this information is available, it should be included.

Patient factors that are expected to occur with the outcome of interest: are probably captured by the Assessment (section in the minimal definition) by the clinician trying to confirm and rule out differential diagnoses

I think this might be too broad and subjective, but it’s good to have a place to put the rest of the information that does not fit the other sections :wink:

In summary, I think the following categories would be enough for a clinical definition: Overview, Presentation, Assessment, Plan and Prognosis, differential diagnoses and other relevant information

Thanks @Marcela for adding clarity to my point! You brought up very good points about the phenotyping process.

@Gowtham_Rao I did not mean that we should incorporate symptoms or signs in any of our phenotypes. Rather, our phenotypes, although anchored in the clinical definition, must adapt it to the population-level reality, so that they are useful for observational health research even if they do not mirror the clinical definition word by word and step by step. We need to keep in mind that clinical definitions are created to help healthcare practitioners identify patients and distinguish between diseases. What we are aiming to do here is identify populations. The ECRHS example was meant to illustrate how using definitions that adapted, rather than mirrored, the clinical definition of asthma helped generate unprecedented progress in the understanding of asthma and its different phenotypes. My intent was to inspire the phenotype community to consider the population-level nature of our research when circling the phenotypes back to the clinical definition.

Good discussion! Like both @Marcela and @david_vizcaya correctly pointed out - we are trying to define common characteristics/attributes at the population level.

The way i think about it is that for

  • things we expect to see in the population of interest: we can argue that the cohort definition with the highest percentage of expected things, as seen in the characterization diagnostic, is the most specific.
  • things we DON'T expect to see in the population of interest: we can argue that the cohort definition with the highest percentage of things NOT expected is the least specific.

I think i may have to clarify this idea a little bit more. Maybe it should be written

  • characteristics when present will make the phenotype less likely i.e. we are trying to make sure we reduce the occurrence of these attributes in our population e.g. very young age is less likely to be associated with hypertension.

I’m not thinking ‘Healthy’ generically, I’m thinking about the clinical representation of hypertension as a disease state. So, up to this point in the thread, you’ve been using examples that involve diagnosis codes. But hypertension can be clinically diagnosed by blood pressure measurements, with classification based on successive measurements of systolic blood pressure (SBP) > 140 mm Hg or diastolic blood pressure (DBP) > 90 mm Hg. So, pivoting to thinking about a cohort definition, I could imagine that looking for SBP measurements may be an entry event - potentially qualifying on its own or else at a minimum to correct for index date misspecification of the first diagnosis date.
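
As a rough illustration of a measurement-based entry event, the Python sketch below looks for the first run of successive elevated readings for one person. It is only a sketch under assumptions: readings are given as (date, SBP, DBP) tuples, "successive" is taken to mean two consecutive readings, and the entry date is taken to be the reading that completes the run. The function names, the `required` parameter, and those choices are hypothetical.

```python
from datetime import date


def is_elevated(sbp, dbp, sbp_limit=140, dbp_limit=90):
    """Elevated if systolic > 140 mm Hg or diastolic > 90 mm Hg."""
    return sbp > sbp_limit or dbp > dbp_limit


def first_successive_elevated(measurements, required=2):
    """Date of the reading that completes the first run of `required`
    consecutive elevated readings, or None if no such run exists.

    measurements: iterable of (date, sbp, dbp) tuples for one person.
    """
    run = 0
    for when, sbp, dbp in sorted(measurements):
        run = run + 1 if is_elevated(sbp, dbp) else 0
        if run >= required:
            return when
    return None
```

A single high reading followed by a normal one resets the run, which is one simple way to reduce the impact of isolated high readings.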

But imagine a person is observed with SBP > 140 repeated times, is diagnosed with essential hypertension, and is treated with some antihypertensive drug. At some future point, one may observe that SBP is ‘controlled’ with medication management. I would consider that continued use of antihypertensive medications would be indicative of remaining in the hypertension disease state (so not exiting the cohort yet). But imagine at some future time we see there are no longer any antihypertensive medications and we see repeated measures with SBP < 140 mm Hg (we may or may not have observed lifestyle changes in diet, weight loss, or other factors that could potentially explain the blood pressure decrease). In this case, months after stopping antihypertensive drugs, with multiple repeated measures of SBP and DBP in the normal range, does the person belong to the ‘hypertension’ disease state? Or did their cohort era ‘end’ and the person leave the cohort (allowing for the possibility that they will re-appear if at a future time we see conditions, measurements, or drugs indicating hypertension again)?

Re : index date misspecification for hypertension:

  1. I think we should look for measurement values for systolic and diastolic blood pressure to use as entry events (and then the question is whether repeated SBP/DBP measures are sufficient to qualify a person or whether we want an HTN diagnosis as an inclusion criterion). I’m curious about others’ thoughts on this, given that there can be some white coat hypertension observed if we use measurements (but not using them seems really wrong too, since it’s the basis of the diagnosis).

  2. It would seem logical to consider antihypertensive treatment as a good marker to consider to correct for index date misspecification. But some antihypertensive drugs may have other cardiovascular indications (such as beta blockers and AMI), so I wonder what folks think about how to incorporate medication use (and whether we need to then have some extra exclusion for the alternative indications).
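
The two ideas above could be combined along these lines. This is a hypothetical Python sketch, not an agreed rule: it takes the earliest of the first diagnosis, the first elevated blood pressure measurement, and the first antihypertensive drug exposure, and it skips the drug date when the person has an alternative indication for the drug. The function and parameter names are made up for illustration.

```python
from datetime import date


def corrected_index_date(first_diagnosis, first_elevated_bp=None,
                         first_drug=None, has_alternative_indication=False):
    """Earliest available evidence of hypertension for one person."""
    candidates = [first_diagnosis]
    if first_elevated_bp is not None:
        candidates.append(first_elevated_bp)
    # Ignore drug evidence when the drug may have been given for something
    # else (e.g. a beta blocker after an AMI).
    if first_drug is not None and not has_alternative_indication:
        candidates.append(first_drug)
    return min(candidates)
```

How to detect an alternative indication is left open here, since that is exactly the question raised in point 2.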

If the successive repeated measures are high, that would be uncontrolled BP and a criterion for HTN diagnosis.

Yes, that person would exit. This is an easier scenario because we are observing normotensive BP without treatment.

So, a challenge we have as it relates to observable data is that we can define a cohort definition with an exit strategy that ‘right-censors’ after we see two successive normotensive BP measurements without treatment, but if a data source doesn’t have BP measures, we may not observe when a person leaves the disease state (I don’t think that not seeing drugs or conditions would be sufficient to declare that the hypertension has ended). This could represent a form of ‘cohort end date misspecification’ error.
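
The exit strategy described here can be sketched as follows. This is an illustrative Python sketch under assumptions: readings are (date, SBP, DBP) tuples, treatment is represented as a list of (start, end) drug eras, and a reading counts as normotensive when SBP is at most 140 and DBP is at most 90, mirroring the thresholds mentioned earlier in the thread. The function names are hypothetical.

```python
from datetime import date


def on_treatment(when, drug_eras):
    """True if the date falls inside any (start, end) antihypertensive era."""
    return any(start <= when <= end for start, end in drug_eras)


def cohort_end_date(measurements, drug_eras, required=2):
    """Date of the reading that completes `required` consecutive normotensive
    readings taken while off treatment, or None if that is never observed."""
    run = 0
    for when, sbp, dbp in sorted(measurements):
        normotensive = sbp <= 140 and dbp <= 90
        if normotensive and not on_treatment(when, drug_eras):
            run += 1
        else:
            run = 0
        if run >= required:
            return when
    return None  # exit cannot be observed from the available data
```

For a data source with no BP measurements the function returns None for everyone, which is the end date misspecification problem in concrete form.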

It may be wrong for other, more acute phenotypes, but for essential hypertension we may not need, or be able to get, precision in the start date. Most clinicians will not start treatment on the date of diagnosis (unless there is hypertensive urgency or crisis).

Sure. But if we are observing people on antihypertensive medications, they are not new essential hypertension cases.