
Phenotype Phebruary Day 2 - Type 1 diabetes mellitus

Welcome everyone to Day 2 of Phenotype Phebruary! I hope you enjoyed reading the kick-off to the discussion of phenotyping Type 2 Diabetes Mellitus on Day 1, and encourage you to join that conversation. Meanwhile, here, I hope to stimulate another discussion, this one on Type 1 Diabetes Mellitus (T1DM).

Now admittedly, I wasn’t planning to consider T1DM as a phenotype to work through during the month, because I thought it might be too close in spirit to T2DM. However, the community spoke loudly in their voting, with 20 individuals asking to explore T1DM, putting it in the top 5 of desired targets, so here we are.

And since we did T2DM yesterday, I figured today’s a good opportunity to stay in this related space, but highlight some different insights and observations that arise from going through the OHDSI phenotype development and evaluation process.

Clinical description:

As with T2DM, we can look to the American Diabetes Association (ADA) “Standards of Medical Care in Diabetes” to provide a helpful frame of reference for the disease. ADA classifies diabetes into "the following general categories:

  • Type 1 diabetes (due to autoimmune β-cell destruction, usually leading to absolute insulin deficiency, including latent autoimmune diabetes of adulthood)
  • Type 2 diabetes (due to a progressive loss of adequate β-cell insulin secretion frequently on the background of insulin resistance)
  • Specific types of diabetes due to other causes, e.g., monogenic diabetes syndromes (such as neonatal diabetes and maturity-onset diabetes of the young), diseases of the exocrine pancreas (such as cystic fibrosis and pancreatitis), and drug- or chemical-induced diabetes (such as with glucocorticoid use, in the treatment of HIV/AIDS, or after organ transplantation)
  • Gestational diabetes mellitus (diabetes diagnosed in the second or third trimester of pregnancy that was not clearly overt diabetes prior to gestation)"

Here, I find the ADA’s discussion about the evolving landscape of T1DM and T2DM quite informative:

“Type 1 diabetes and type 2 diabetes are heterogeneous diseases in which clinical presentation and disease progression may vary considerably. Classification is important for determining therapy, but some individuals cannot be clearly classified as having type 1 or type 2 diabetes at the time of diagnosis. The traditional paradigms of type 2 diabetes occurring only in adults and type 1 diabetes only in children are no longer accurate, as both diseases occur in both age-groups. Children with type 1 diabetes typically present with the hallmark symptoms of polyuria/polydipsia, and approximately one-third present with diabetic ketoacidosis (DKA). The onset of type 1 diabetes may be more variable in adults; they may not present with the classic symptoms seen in children and may experience temporary remission from the need for insulin…It is important for the provider to realize that classification of diabetes type is not always straightforward at presentation and that misdiagnosis is common (e.g., adults with type 1 diabetes misdiagnosed as having type 2 diabetes; individuals with maturity-onset diabetes of the young [MODY] misdiagnosed as having type 1 diabetes, etc.). Although difficulties in distinguishing diabetes type may occur in all age-groups at onset, the diagnosis becomes more obvious over time in people with β-cell deficiency.”

The epidemiology and natural history of T1DM are well characterized. Initial diagnosis is most common in children and young adults, and the disease is more prevalent in males than females. Most patients with Type 1 diabetes require insulin therapy. Complications associated with insufficient management can include acute hypoglycemia, diabetic ketoacidosis and hyperosmolar coma. Incidence of T1DM varies globally, with highest rates in Scandinavia and lowest rates in Asia.

Cohort definitions:

Here, I’ll present two cohorts. The main point to highlight here is that these cohorts were created in ATLAS by copying the T2DM cohorts from yesterday, and simply swapping out the conceptsets in the entry event with the conceptset in the inclusion criteria. So, total time to build these cohorts = <2 minutes. Behold the beauty of re-usable components! No wheel reinvention necessary!

#1 (which will be listed in CohortDiagnostics as C5): ‘Persons with new type 1 diabetes’, based on earliest event of a condition occurrence of a ‘Type 1 diabetes mellitus’ concept (ATLAS-phenotype link here; with a reminder that if you do not yet have access to this ATLAS instance, simply fill in this form)

#2 (which will be listed in CohortDiagnostics as C4): ‘Persons with new type 1 diabetes and no prior T2DM or secondary diabetes’, based on the earliest event of a condition occurrence of a ‘Type 1 diabetes’ concept, but also imposing two inclusion criteria that there are 0 condition occurrences of either a ‘Type 2 diabetes’ concept or a ‘secondary diabetes’ concept (ATLAS-phenotype link here).
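
To make the logic of these two definitions concrete, here is a minimal Python sketch on toy data. The record layout, the group labels (`T1DM`, `T2DM`, `SECONDARY`) and the function names are illustrative assumptions, not the ATLAS implementation. The sketch also assumes that "prior" means strictly before the index date, and it ignores observation-period requirements.

```python
from datetime import date

# Toy condition records: (person_id, concept_group, condition_start_date).
# The group labels stand in for ATLAS conceptsets and are hypothetical.
records = [
    (1, "T1DM", date(2015, 3, 1)),
    (1, "T1DM", date(2016, 1, 10)),
    (2, "T2DM", date(2012, 5, 5)),
    (2, "T1DM", date(2014, 7, 20)),
    (3, "SECONDARY", date(2018, 2, 2)),
    (3, "T1DM", date(2017, 9, 9)),
]

def earliest_t1dm(records):
    """Entry event: earliest T1DM condition occurrence per person."""
    index = {}
    for person, group, start in records:
        if group == "T1DM" and (person not in index or start < index[person]):
            index[person] = start
    return index

def no_prior_other_diabetes(records, index):
    """Keep persons with 0 T2DM or secondary diabetes records before index."""
    excluded = {
        person
        for person, group, start in records
        if group in ("T2DM", "SECONDARY")
        and person in index
        and start < index[person]
    }
    return {p: d for p, d in index.items() if p not in excluded}

cohort_c5 = earliest_t1dm(records)                       # definition #1
cohort_c4 = no_prior_other_diabetes(records, cohort_c5)  # definition #2
```

In this toy data, person 2 enters the first cohort but is dropped from the second because of the earlier T2DM record, while person 3 stays in both because the secondary diabetes record comes after the index date.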

So, I’ll use this post to take a little detour to discuss the conceptset expression for ‘Type 1 diabetes mellitus’ and how we can use PHOEBE to develop and evaluate the conceptset.

The conceptset expression is quite straightforward, using 3 concepts + their descendants: ‘Type 1 diabetes mellitus’, ‘Type 1 diabetes mellitus uncontrolled’, and ‘Disorder due to type 1 diabetes mellitus’.

That conceptset expands out to 94 standard concepts, which I show in ATLAS sorted by the record count in descending order to give a sense of which concepts appear most commonly across the OHDSI data partners who contributed to the Concept Prevalence study that @aostropolets led. We can see that, while there are 94 standard concepts to consider, the top 10 comprise the overwhelming majority of the data.

PHOEBE is a recommender system, developed by @aostropolets. It has two primary functions: 1) recommend an initial concept to start your conceptset building activity, and 2) recommend additional concepts to consider based on your current list of concepts. You can access PHOEBE at: https://data.ohdsi.org/PHOEBE/

Let’s see what PHOEBE does when we try to find an initial starting concept:

As a starting point, providing the string ‘type 1 diabetes’ found us a starter concept of ‘Type 1 diabetes mellitus’, and we can see that this concept has 82m records, was directly observed in 20 different databases, and using this concept plus its descendants would yield >102m records across 21 databases. Sounds like a great place to start.

If we move to the second function in PHOEBE, we can see what is recommended to us if we use only that starter concept:

PHOEBE suggests that this concept alone isn’t sufficient. It identified concepts that are not included but recommended because of lexical similarity to the standard concept name or source code name (ex: ‘Type 1 diabetes mellitus uncontrolled’), not included but recommended because they are descendants of a concept in your list (ex: ‘Type 1 diabetes mellitus without complication’), not included but recommended because they are parents of a concept in your list (ex: ‘Diabetes mellitus’). I review this list of recommendations to see what I missed, and iteratively add those concepts to my conceptset until I do not find any additional recommendations useful for my clinical idea of interest.

Cohort evaluations:

I’ll again use CohortDiagnostics to evaluate these two definitions. The results are publicly available at: https://data.ohdsi.org/phenotypePhebruary/

When we look at the Concept Counts, we can see that the additional inclusion criteria restricting prior T2DM or secondary diabetes codes have a substantial impact on the number of qualifying persons in each cohort. In CCAE/MDCD/MDCR, we see >80% of patients were lost, while in CPRD and Iqvia France attrition was ~66%, in Iqvia Germany it was >50%, and in Iqvia Australia it was >30%.

Now, let’s look at the Incidence Rate analysis:

Here is where we should start getting worried, at least for some databases. In the first row, we see CPRD, and we see that the incidence is highest in the 0-9 and 10-19 age groups, with a higher incidence in males vs females, so that seems consistent with what we read from ADA. But the second row is Iqvia France, the third row is Iqvia Germany, the fourth is CCAE, and in these cases, you see the ‘incidence’ is relatively constant across age groups, which defies what we expect.

Let’s think about what could cause this. A couple of possibilities: 1) Perhaps these are persons with an initial misdiagnosis of T1DM followed by T2DM diagnosis and management. 2) Maybe we have misspecification of ‘incidence’ because a 365d prior observation period requirement is too short (since T1DM onset could have been much earlier in a person’s life and may not have been re-coded recently), so we are really looking at prevalent cases with index date misspecification.

One of the features of CohortDiagnostics that I didn’t showcase yesterday, but which comes in handy in this circumstance, is Cohort Overlap. When you have two cohorts and wonder if persons belong to one group or the other or both, this analysis provides useful insight. Recalling the ADA description, misdiagnosis between T1DM and T2DM is common. So, overlap can help us understand the extent to which qualifying patients in each of the broad cohorts intersect:

Here, we see that across the 5 databases shown (from the UK, Germany, and three US databases), the majority of persons are T2DM only (blue), and some persons are T1DM only (red), but all databases have a reasonable fraction of persons belonging to both (purple). The hover-over on MDCD specifically shows that 74% were T2DM only, 17% were T1DM only, and 8% were in both T2DM and T1DM cohorts. This is useful information, because if you recall, our alternative definitions impose inclusion criteria to remove patients with prior T1DM from the T2DM definition, and prior T2DM from the T1DM definition, and that means these patients with both diagnoses are not represented in either definition. So some diagnosis mixture definitely exists, but not enough to explain the incidence patterns above.
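
The overlap figures are, at heart, set arithmetic on person identifiers. Here is a minimal sketch, assuming each cohort is available as a set of person IDs; the function name and the toy IDs are hypothetical, and this is not how CohortDiagnostics itself is implemented.

```python
def cohort_overlap(cohort_a, cohort_b):
    """Return counts and fractions of persons in A only, B only, and both."""
    a, b = set(cohort_a), set(cohort_b)
    counts = {
        "a_only": len(a - b),
        "b_only": len(b - a),
        "both": len(a & b),
    }
    total = sum(counts.values())
    # Fractions are relative to everyone who is in at least one cohort.
    fractions = {k: (v / total if total else 0.0) for k, v in counts.items()}
    return counts, fractions

# Toy person IDs, for illustration only.
t2dm_persons = {1, 2, 3, 4, 5, 6}
t1dm_persons = {5, 6, 7}
counts, fractions = cohort_overlap(t2dm_persons, t1dm_persons)
```

The "both" group is exactly the set of persons that the restricted definitions remove from each side.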

So, let’s turn to Temporal Characterizations to see if we can find any insights of specificity errors or index date misspecification. I’ll start by looking at diabetes-related conditions occurring in our C4 cohort:

Yikes! In CCAE, 7.8% of our T1DM cohort gets a ‘Type 2 diabetes mellitus without complication’ diagnosis within a month of the T1DM diagnosis, and 20% have this code within the year. Another 5% have a ‘Diabetic-poor control’ or ‘Diabetes mellitus without complication’ concept (which are not T1DM-specific). So, this is some evidence that maybe not all of our T1DM patients actually have T1DM.

Let’s look at insulin use; after all, if they are real T1DM patients, most should have insulin therapy after diagnosis.

Yikes #2! If most T1DM patients should have insulin, then why do we see in CCAE that each insulin product is observed in <10% of persons within a year? Even if I ignore double-counting and add up these percentages, it’s <50%. In fact, the prevalence of metformin (recommended first-line treatment for T2DM) is in line with insulin glargine. So, either our dataset is missing insulin exposure (in CCAE, probably unlikely, since it’ll be captured well from retail pharmacy as prescription dispensing covered by private insurance) or these persons aren’t on insulin because they don’t have T1DM. Either way, I wouldn’t be comfortable using this definition in the US claims data without further evaluation.

To contrast this, CPRD has an incidence trend that made more sense, so what does the insulin exposure pattern look like there?

Well, that looks much better! We’re seeing >50% with an exposure to ‘insulin aspart, human’, with considerable exposure to other insulin products. This gives me much greater confidence that our T1DM diagnosis codes can find T1DM patients in our UK EHR. However, before I get too excited, this same table also shows me that our phenotype may not be done yet. Notice that there’s ~5% of persons with insulin exposure to a few products in the year PRIOR to the T1DM index date. That’s screaming of index date misspecification. We might want our cohort to look for the earliest of a T1DM diagnosis OR insulin treatment, with an inclusion criterion requiring T1DM diagnosis on or within some interval after the entry event (similar to the extra variant we built for T2DM that looked at drug use and measurement values).
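
The check for index date misspecification boils down to asking what share of the cohort has an exposure in a window relative to the index date. Here is a minimal sketch of that calculation, assuming toy dictionaries of index dates and insulin exposure dates; the function name, data layout and window boundaries are assumptions for illustration, not the CohortDiagnostics internals.

```python
from datetime import date, timedelta

def share_with_exposure(index_dates, exposures, start_offset, end_offset):
    """Fraction of cohort members with at least one exposure dated within
    [index + start_offset, index + end_offset] days (inclusive)."""
    if not index_dates:
        return 0.0
    hits = 0
    for person, index in index_dates.items():
        window_start = index + timedelta(days=start_offset)
        window_end = index + timedelta(days=end_offset)
        if any(window_start <= d <= window_end for d in exposures.get(person, [])):
            hits += 1
    return hits / len(index_dates)

# Toy data: T1DM index date per person, and insulin exposure dates.
index_dates = {p: date(2020, 6, 1) for p in (1, 2, 3, 4)}
insulin = {
    1: [date(2020, 5, 2)],   # 30 days BEFORE index: suspicious
    2: [date(2020, 6, 11)],  # 10 days after index
    4: [date(2020, 6, 1)],   # on the index date
}

prior_year = share_with_exposure(index_dates, insulin, -365, -1)
post_year = share_with_exposure(index_dates, insulin, 0, 365)
```

A non-trivial value for the prior window is the signal that the true disease onset may precede the first recorded diagnosis.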

What have other people found when trying to phenotype T1DM? There was a nice paper in BMC last year that attempted to develop and evaluate alternative algorithms in Japan claims data. The authors cite emulating PheValuator (an OHDSI tool developed by @jswerdel that will likely be a subject of a future Phenotype Phebruary post) in addition to doing chart abstraction, and they found in their data that a diagnosis code alone had low sensitivity and poor positive predictive value. They were able to increase PPV by imposing additional requirements, but that comes with further risk of decreased sensitivity. Their bottom-line conclusion: "As a result of the performance evaluation of the case definitions for T1D, it was suggested that the ICD10 code of T1D should not be used for assessing the true patients with T1D."

So, this is a nice problem for our community to chew on some more. What would be your next step to develop alternative definitions that may overcome the issues we’ve identified?

This is so useful @Patrick_Ryan! thanks so much!
One thing that comes to my mind (which, honestly, I have never implemented in a cohort definition in Atlas) is using a sequence of treatments as an inclusion criterion. It is not so rare to observe use of insulin in patients with other types of diabetes besides T1; the difference is that in T2D, for example, insulin should ALWAYS come as quite a last resort (3rd or 4th line). I guess that this may help resolve some of the unspecific CONDITION_OCCURRENCE codes into a specific type of diabetes. Still, there is a problem with persons with T1D who get no insulin prescription recorded. One question that comes to my mind is why don’t you use the T2D phenotype definition to rule out T2D in your C4 cohort here? Specifically, what I am missing is the use of T2D medications in the ruling out of T2D.


Hey @david_vizcaya ! Yes, we could absolutely consider using treatments as a mechanism to distinguish T1DM and T2DM, but we need to recognize that this itself will likely come with some misclassification.

Candidate alternatives we could consider along this vein, in order of increased specificity and decreased sensitivity:

  1. (entry event: T1DM or insulin) with at least 1 occurrence of insulin after and at least 1 T1DM diagnosis after
  2. (entry event: T1DM or insulin) with at least 1 occurrence of insulin and at least 1 T1DM diagnosis AND 0 T2DM diagnosis before
  3. (entry event: T1DM or insulin) with at least 1 occurrence of insulin and at least 1 T1DM diagnosis AND 0 T2DM diagnosis before or after
  4. (entry event: T1DM or insulin) with at least 1 occurrence of insulin and at least 1 T1DM diagnosis AND 0 T2DM diagnosis before or after AND 0 T2DM medications before or after
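
The first three candidates can be sketched as one function with two switches. This is a minimal illustration on toy events, assuming a per-person list of `(kind, date)` tuples; the event labels and the function name are hypothetical, and the fourth candidate would add an analogous check for T2DM medications.

```python
from datetime import date

def qualifies(events, exclude_t2dm_before=False, exclude_t2dm_after=False):
    """events: list of (kind, date), kind in {"T1DM_DX", "INSULIN", "T2DM_DX"}.
    Returns the index date if the person qualifies, else None."""
    entry_dates = [d for kind, d in events if kind in ("T1DM_DX", "INSULIN")]
    if not entry_dates:
        return None
    # Entry event: earliest of a T1DM diagnosis or an insulin exposure.
    index = min(entry_dates)
    # Because the index is the earliest such event, any insulin record and
    # any T1DM diagnosis necessarily falls on or after the index date.
    kinds = {kind for kind, _ in events}
    if not {"T1DM_DX", "INSULIN"} <= kinds:
        return None
    t2dm_dates = [d for kind, d in events if kind == "T2DM_DX"]
    if exclude_t2dm_before and any(d < index for d in t2dm_dates):
        return None
    if exclude_t2dm_after and any(d >= index for d in t2dm_dates):
        return None
    return index

person = [
    ("INSULIN", date(2019, 1, 1)),
    ("T1DM_DX", date(2019, 2, 1)),
    ("T2DM_DX", date(2020, 1, 1)),
]
variant_1 = qualifies(person)
variant_2 = qualifies(person, exclude_t2dm_before=True)
variant_3 = qualifies(person, exclude_t2dm_before=True, exclude_t2dm_after=True)
```

This toy person qualifies under the first two variants, with the insulin date as index, and is dropped by the third because of the later T2DM diagnosis, which is the sensitivity cost of each added restriction.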

I’ll also comment, there are a wide array of algorithms that have been applied in the literature. One approach that’s been picked up in a couple subsequent studies looked at counting the number of T1DM diagnoses and the number of T2DM diagnoses, and classifying as T1DM if the diagnosis count was higher. I struggle with thinking about the generalizability of an approach like that, because it is largely tied to coding practice. So, your idea of trying to use treatment as a guide can be very helpful (as long as we are careful to consider the contamination that will introduce).
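
For reference, the count-comparison rule described above is easy to state in code, which also makes its dependence on coding practice obvious. This is a minimal sketch, assuming each person’s diagnoses have already been labelled `T1DM` or `T2DM`; the labels, the function name and the handling of ties (ties are not classified as T1DM) are assumptions.

```python
from collections import Counter

def classify_by_code_count(diagnoses):
    """Classify as T1DM only if T1DM codes strictly outnumber T2DM codes."""
    counts = Counter(diagnoses)  # missing labels count as 0
    if counts["T1DM"] > counts["T2DM"]:
        return "T1DM"
    return "not T1DM"

mostly_t1 = classify_by_code_count(["T1DM", "T1DM", "T2DM"])
mostly_t2 = classify_by_code_count(["T1DM", "T2DM", "T2DM"])
tie = classify_by_code_count(["T1DM", "T2DM"])
```

A single extra visit coded one way or the other can flip the label, which is why the result is so tied to how often and how consistently a site codes.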


This is really great! I applied the concept set criteria to our data (Astellas Pharma US) and I have a quick question. I noticed that although there are no standard concepts for type 1 or secondary DM in the included concept list, there are codes (e.g., ICD10CM) for these types of DM in my source code list. I am still learning how to navigate Atlas and create concept sets, so I am just wondering whether the source codes are integral to phenotype creation or affect results. And if so, is there a way to address this in concept set building? Thanks so much!


One word of caution in using treatments in algorithm development is the potential for immortal time bias, as one then needs to have something observed in the future to tell you about their past conditioned state. This sets up a new index date for ‘treated T1DM’ or ‘treated T2DM’. That said, treatment is sometimes the best observable ‘validator’ of a condition, and for us pharmacists it is often the only way we know you are a patient. I have dispensed insulin for cash, contributing to treatment misclassification.


Hey @Ryan_Kenny , good question. Yes, understanding the source codes that map into the standard concepts is an important and necessary step of the phenotype development process. And you are absolutely right that sometimes, source codes that may seem appropriate for a given clinical idea may map into a standard concept that seems inconsistent.

In ATLAS, you can navigate using the ‘Search’ tab to any concept of interest. Here I searched for ‘Diabetic ketoacidosis’ and selected the standard concept 4009303. From the ‘Related concepts’ tab, I can filter to ICD10CM and relationship = ‘Standard to Non-standard map (OMOP)’, and we can see there are 5 ICD10CM codes that map directly to Diabetic ketoacidosis without coma, as shown below:

We can see there is one source code, ‘E10.10 - Type 1 diabetes mellitus with ketoacidosis without coma’, which we would probably want to consider under T1DM, and 4 codes (E13.10, E11.10, E08.10, E09.10) which we’d probably all agree are not in T1DM. So, in this case, where we have multiple source codes mapping into the same standard concept, we have a decision to make: if we want to maintain a standardized analysis, do we include this concept in our definition of T1DM (which may increase sensitivity but decrease specificity) or do we exclude it (potentially decreasing sensitivity but increasing specificity)? Or do we decide to create a source-code-based definition to pick and choose amongst these ICD10 codes, recognizing that devolving to an ICD10CM-specific definition will not be applicable to other datasets that use other source codes (including the very same US claims dataset you are working on, which has ICD9CM codes pre-Oct 2015)?
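
The many-to-one mapping problem can be illustrated with a small lookup. This is a minimal sketch using only the five ICD10CM codes named above; the dictionary layout and function name are hypothetical, and in practice the mappings live in the OMOP vocabulary tables rather than in a hand-written dictionary.

```python
# Source codes that, per the discussion above, map to the same standard concept.
standard_to_source = {
    "Diabetic ketoacidosis without coma": [
        "E10.10", "E13.10", "E11.10", "E08.10", "E09.10",
    ],
}

# ICD10CM codes we would want to count under T1DM (from the discussion above).
t1dm_source_codes = {"E10.10"}

def source_code_mix(standard_concept):
    """Split the source codes behind a standard concept into wanted/unwanted."""
    codes = standard_to_source[standard_concept]
    wanted = [c for c in codes if c in t1dm_source_codes]
    unwanted = [c for c in codes if c not in t1dm_source_codes]
    return wanted, unwanted

wanted, unwanted = source_code_mix("Diabetic ketoacidosis without coma")
```

Including this standard concept in a T1DM conceptset would bring along all four unwanted source codes, which is the specificity cost described above.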

In this specific case, you have to dig a bit further. If you track the mapping of source code E10.10, you’ll see that it has two mappings to standard concepts, and lo and behold, the other mapping is directly to ‘Type 1 diabetes mellitus’, meaning that you will get what you want with the entry event definition that we originally specified:

Thanks @Kevin_Haynes , this is a very good topic to discuss.

When I usually think about ‘immortal time bias’, it is in the context of either an incidence rate characterization or a population-level effect estimation study, where I have some ‘time-at-risk’ following some target cohort, and I’m looking for outcomes. Here, the potential bias arises if I count time-at-risk for a given person even though they are ineligible to be observed with the outcome. So, as a concrete example, if I wanted to estimate incidence of T2DM amongst people entering the database after 2020, any persons who already have T2DM (with a diagnosis code in 2019 or prior) need to be excluded from my characterization, because they cannot have a ‘new’ T2DM post-2020 and therefore have no ‘time-at-risk’.

So, in the case of chronic diseases, like T1DM and T2DM, whereby the general expectation is that once a person enters a cohort, they remain in the cohort through the end of their observation (almost definitely true for T1DM, and mostly true for T2DM, with exceptions of those who change lifestyle behaviors, drop the pounds, and get their HbA1c under 6.5% without medication support for some extended time), I’m struggling to see the concern of ‘immortal time bias’ by using forward-looking events like medication use as part of the outcome definition.

If we were dealing with an outcome that allows for recurrent events, like deep vein thrombosis or acute myocardial infarction or COVID-19, then I definitely can see a legitimate risk for ‘immortal time bias’, because the cohort logic will require specifying not only how a person enters the cohort, but also how they exit it, and if that logic involves some ‘clean window’ of time between one event and a subsequent event in the same patient’s record, then that ‘clean window’ period is ‘immortal time’, in that you know by definition you can’t observe a new event in this interval following a preceding event.
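
To see why the clean window matters for the denominator, here is a minimal sketch that counts time-at-risk in days with and without removing the clean window. It assumes one simple convention (the window covers the N days after each event, and the event day itself counts as at risk); the function name and dates are hypothetical, and this is not the CohortIncidence implementation.

```python
from datetime import date, timedelta

def at_risk_days(start, end, event_dates, clean_window_days):
    """Count days in [start, end] that do not fall inside a clean window
    following any event."""
    blocked = set()
    for event in event_dates:
        for offset in range(1, clean_window_days + 1):
            blocked.add(event + timedelta(days=offset))
    total = 0
    day = start
    while day <= end:
        if day not in blocked:
            total += 1
        day += timedelta(days=1)
    return total

# One person observed for all of 2020, with one event on 1 March.
naive = at_risk_days(date(2020, 1, 1), date(2020, 12, 31), [], 0)
adjusted = at_risk_days(
    date(2020, 1, 1), date(2020, 12, 31), [date(2020, 3, 1)], 30
)
```

Counting the full year in the denominator would include 30 days during which a new event could not, by definition, have been observed, which biases the rate downward.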

The CohortIncidence package that @Chris_Knoll developed very elegantly handles this immortal time bias in the context of cohorts that allow for recurrent events. The CohortMethod package, which @schuemie maintains, provides additional analysis parameters (in that case, priorOutcomeLookback) to allow the user to avoid the threat of immortal time bias (as long as priorOutcomeLookback is set to some value greater than the outcome phenotype clean window). But it is quite easy to miss these subtleties if you try to compute an analysis de novo without using the OHDSI standard tools.


Using PheValuator, we can test the performance characteristics of the T1DM algorithms. Using this method, I found:

This seems to agree with several of the findings from @Patrick_Ryan’s analysis. These algorithms (the 2nd and 3rd lines in the sets) had fairly low PPV. The addition of more parameters to the second algorithm actually reduced the PPVs. Sensitivity, on the other hand, was high for the first algorithm and low for the second algorithm, which matches the low counts @Patrick_Ryan found for the more complex algorithm relative to the simpler algorithm. I included a prevalent algorithm (top line in each set) to again show that the PPV for those subjects with possibly established T1DM is, on average, higher than for those with newly diagnosed T1DM.

@jswerdel this is AWESOME! because I think you’ve directly exposed an error in my phenotype that I had not seen, even after reviewing the logic and cohort characteristics.

You are showing a giant drop in sensitivity for ‘Persons with new Type 1 diabetes and no prior T2DM or secondary diabetes’ . At first, I said to myself, “That can’t be right, what did PheValuator do wrong?”. But I should have been looking straight in the mirror and saying, “What did I do wrong?”

And I think at least a little bit of the answer lies in the construction of our T2DM conceptset. Remember, in that post, I described that we had to make choices about how to classify ‘Diabetes mellitus’ codes, which don’t specify either T1DM or T2DM. Turns out many of those ‘non-specific’ Diabetes concepts are quite prevalent. And, since T2DM is ~80% of all diabetes, in the context of phenotyping T2DM it makes sense to assume those codes ‘count’ as T2DM (accepting the increase in sensitivity at a relatively small expense of specificity). But that same logic can’t be applied when considering a T1DM phenotype…because if a person has a bunch of T1DM codes and a non-specific Diabetes code, we shouldn’t assume that Diabetes code is T2DM; it is much more reasonable to assume it’s T1DM. If we want to impose a T2DM exclusion, we need to restrict only to the T2DM-specific concepts, removing the non-specific diabetes concepts from the criteria.

The problem of misclassification of T2DM is mainly a mess in ICD9CM, because many of the codes were like 250.02, ‘Diabetes mellitus without mention of complication, type II or unspecified type, uncontrolled’; by combining T2DM and unspecified type in a single code, it creates ambiguity.

To check this out, I created a new T1DM definition, but revised the T2DM conceptset as follows:

In CCAE, here’s the attrition table for the original definition:

…and here it is when using the T2DM-specific conceptset for inclusion criteria #2:

So, we can see we increase our sample by 33%, simply by no longer treating the non-specific diabetes codes as T2DM in the exclusion criteria.

That said, in both cases, we see that excluding T2DM codes is the big driver of patient loss. And clearly, that will have some impact on sensitivity of the T1DM definition (though should help increase specificity). Whether that sensitivity/specificity tradeoff is desirable is something that would be good to deliberate on.

I wanted to try to see why we are losing sensitivity without gains in PPV in our algorithms. I tested 2 new algorithms for T1DM, one with a second dx code for T1DM 0-365 days after the index plus a drug exposure to insulin without requiring a prior 365 day observation period and one that does require a 365 day prior observation period. The results were:

This seems to indicate that we can achieve relatively high PPV and sensitivity for T1DM using these simpler algorithms without excluding T2DM/secondary diabetes codes. In CCAE, a US dataset of those generally under 65 years old, we see a PPV of around 80% while maintaining a sensitivity of about 73%. The overall performance can also be assessed using the F1 score, which is the harmonic mean of the sensitivity and PPV. The simpler definitions had higher F1 scores than the definition using exclusions for T2DM/secondary diabetes codes. Between the two new algorithms, imposing a 365-day prior observation period does reduce the sensitivity with only a small change in PPV.
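
As a quick worked example of the F1 score, here is a minimal sketch that applies the harmonic mean to the approximate CCAE figures quoted above (PPV around 0.80, sensitivity about 0.73). The function name is hypothetical and the inputs are rounded, so the result is only indicative.

```python
def f1_score(ppv, sensitivity):
    """Harmonic mean of PPV and sensitivity; 0.0 if both are zero."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)

# Approximate CCAE values quoted in the post.
f1_ccae = f1_score(0.80, 0.73)
```

With these inputs the F1 score comes out at roughly 0.76. Because it is a harmonic mean, it is pulled toward the lower of the two values, so a definition cannot score well by maximizing PPV while sacrificing sensitivity (or the reverse).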

This is awesome, great stuff Joel. I think this really makes an important point about using empirical results to guide the phenotype evaluation process. Now that I see these PheValuator findings, I certainly favor your revised definition…and I crave to see how that revised definition works across other databases in the OHDSI network.

Do others in the community want to try out the alternative T1DM definitions?