
Phenotype definition - T2DM - Help needed

Hello Everyone,

I came across the T2DM phenotype algorithm on the PheKB website and have a few questions.

I am a graduate trying to learn what phenotypes are and how they work. I am not from a healthcare background, so I would really appreciate it if domain jargon is kept to a minimum.

Can you help me with the below questions?

Let’s break down the flowchart (pasted below) into individual paths:

path 1) EMR --> T1DM DX --no--> T2DM DX --no--> RX T2DM MED --yes--> Abnormal Lab --yes--> CASE

Q1) How can the patient be prescribed T2DM meds when he has not been diagnosed with T2DM?

path 2) EMR --> T1DM DX --no--> T2DM DX --yes--> RX T1DM MED --no--> RX T2DM MED --no--> Abnormal Lab --yes--> CASE

Q1) Based on paths 1 and 2, what I infer is that a patient with an abnormal lab result is a T2DM patient irrespective of whether he was prescribed T2DM meds. Is that right?

path 3) EMR --> T1DM DX --no--> T2DM DX --yes--> RX T1DM MED --yes--> RX T2DM MED --yes--> T2DM RX precedes T1DM RX --yes--> CASE

Q1) Again, how can a patient be prescribed T1DM meds when he has not been diagnosed with T1DM?

Q2) It makes sense to see T2DM meds, as the patient was diagnosed with T2DM, but why is the condition “T2DM RX precedes T1DM RX”? Shouldn’t it be “T1DM RX precedes T2DM RX”? When I did a Google search, I found that T1DM usually occurs before one enters adulthood. So am I misinterpreting this?

path 4) EMR --> T1DM DX --no--> T2DM DX --yes--> RX T1DM MED --yes--> RX T2DM MED --no--> T2DM DX by Physician >=2 --yes--> CASE

Q1) In this path, a patient is diagnosed with T2DM but has been prescribed T1DM meds. Is this possible, and when can it happen?

Q2) Similarly, although the patient is diagnosed with T2DM, he isn’t prescribed T2DM meds. Is it that patients are prescribed meds only after multiple visits, or something like that?

Can you help me with the above questions, please? It would be really helpful.



Hi, @Akshay,
Good questions. I can’t comment on the clinical aspects of your questions, such as in what clinical context someone would be diagnosed with T2DM but left untreated (your Q2 under path 4).

However, what I can comment on is that observational data is not perfect. You could have a person who was diagnosed young with T1DM, switches healthcare providers, and continues being prescribed T1DM meds without the diagnosis being recorded. Perhaps the phenotype definition is designed to tolerate missing data: isn’t it enough to see a T1DM medication prescription without the actual T1DM diagnosis, if something further along the logic path restricts the population based on the presence of T1DM meds?

You also picked up on a very good approach for translating a flowchart into boolean logic for a cohort definition. Once you lay out each of those paths, you can roll up some of the logical constraints at a higher level to simplify the branches into a smaller group of ORs, i.e. A AND (B OR C OR D), where A in this case is ‘no T1DM diagnosis’.
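To make that roll-up concrete, here is a minimal Python sketch. It encodes only the four paths listed in the original post (the published flowchart may contain branches not quoted in this thread), and the `PatientFlags` fields are hypothetical names chosen for illustration, not identifiers from PheKB or Atlas.

```python
from dataclasses import dataclass


@dataclass
class PatientFlags:
    """Per-patient yes/no answers to the flowchart boxes (hypothetical names)."""
    t1dm_dx: bool
    t2dm_dx: bool
    rx_t1dm_med: bool
    rx_t2dm_med: bool
    abnormal_lab: bool
    t2dm_rx_precedes_t1dm_rx: bool
    t2dm_dx_by_physician_count: int


def is_case(p: PatientFlags) -> bool:
    # A: shared by every path, so it is factored out once.
    no_t1dm_dx = not p.t1dm_dx

    # Paths 1 to 4 from the original post, with the shared
    # "no T1DM diagnosis" condition removed from each.
    path1 = (not p.t2dm_dx) and p.rx_t2dm_med and p.abnormal_lab
    path2 = (p.t2dm_dx and not p.rx_t1dm_med
             and not p.rx_t2dm_med and p.abnormal_lab)
    path3 = (p.t2dm_dx and p.rx_t1dm_med
             and p.rx_t2dm_med and p.t2dm_rx_precedes_t1dm_rx)
    path4 = (p.t2dm_dx and p.rx_t1dm_med and not p.rx_t2dm_med
             and p.t2dm_dx_by_physician_count >= 2)

    # A AND (path1 OR path2 OR path3 OR path4)
    return no_t1dm_dx and (path1 or path2 or path3 or path4)
```

Writing each path as its own named expression keeps the code traceable back to the flowchart, which helps when reviewing the logic with clinicians.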

I hope others in the community can answer the more clinical focused elements of your questions. I am sure there is a lot of expertise that can give you an informed answer.


Thanks for the response Chris. Much appreciated. Will wait for someone to help me on those clinical aspects. I am trying to do this in Atlas, but wish to get this clear first.

P.S. I did see that there are already multiple cohorts based on the above definition in Atlas.


The clinical aspects: Essentially you are asking why T2DM patients can be treated with T1DM drugs and the other way around, or not treated at all.

The reason is simple: the medications overlap. In T1DM there is practically no insulin production right from the beginning, while T2DM is characterized by a slow deterioration of insulin production or effect over time. So, at the beginning of T2DM, when it is still mild, you can give all those oral antidiabetic drugs that boost the remaining system, or even get by without drug treatment at all if the patient loses weight and drastically changes lifestyle. But in advanced stages T2DM is treated identically to T1DM: with insulin.

So, the logic of the T2DM phenotype algorithm tries to do three things: (i) establish that diabetes is present, (ii) distinguish it from T1DM, and (iii) show flexibility with some mild misclassification. It is based on three pieces of evidence: the diagnostic code, the medication and the lab test.

(iii) is the nasty one here: it assumes that the doc might have it wrong during initial presentation, but once the diagnosis has been asserted more than once you’ll believe it. Pretty wild, isn’t it? Why not three times? Or four? This is called “tacit knowledge of the data”, and it is what makes these deterministic phenotypes an art rather than a science. We need to overcome that and create data-driven, robust phenotype definitions that take into account the error in each of the various pieces of evidence. Lots of work to do.


Hi @Christian_Reich - While I was trying to implement this, I encountered a scenario where the diabetes type is undetermined. A few patients’ condition records have entries such as “eye disorder due to diabetes mellitus”, “kidney disorder due to diabetes mellitus”, “Diabetic complication” or “multiple complications due to diabetes mellitus”. Because of this, the algorithm doesn’t identify these patients as either T1 or T2, and we lose performance.

In this case, before we apply the phenotype algorithm, do we need to address this issue by looking at lab test values (HbA1c, FG, RG) to decide whether these patients are T1DM or T2DM? For example: HbA1c > 6.5% is T2DM, else T1DM. But is this the right way to determine diabetes type? I am quite confused.

Yes. Those diseases of the eye, the kidney or the other complications are separate, distinct diseases. They are caused by diabetes, no matter which one: the effect is the same whether it is type I or type II. For you that is an indicator that there is diabetes going on, otherwise they wouldn’t have the complication. But you still have to identify the nature of the primary condition. The complication does not tell you which one of the two diabetes types was the cause.

If you have no information about the actual diabetes diagnosis in the data, I would suggest dropping that patient. Inferring the distinction between type I and II from the structured data will be hard: both diseases have very similar phenotypes. The lab tests are similar. The only difference is that T1 patients have auto-antibodies, and T2 patients tend to have obesity (high BMI), lack of physical activity and poor diet. The doctor can make all these determinations very easily; you probably cannot.
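As a rough illustration of that advice, the sketch below flags patients whose only diabetes evidence is a type-unspecific complication and drops them before the phenotype is applied. It is an assumption-laden toy: conditions are matched by plain name strings taken from the post above, whereas on real OMOP CDM data you would match concept IDs in `condition_occurrence`, and the function names and the `"conflicting"` label are made up for the example.

```python
# Placeholder name lists for illustration only; a real implementation
# would use concept sets rather than free-text names.
TYPE_SPECIFIC = {
    "type 1 diabetes mellitus": "T1DM",
    "type 2 diabetes mellitus": "T2DM",
}
UNSPECIFIED_DIABETES = {
    "eye disorder due to diabetes mellitus",
    "kidney disorder due to diabetes mellitus",
    "diabetic complication",
    "multiple complications due to diabetes mellitus",
}


def diabetes_status(condition_names):
    """Summarize what the condition records say about diabetes type."""
    names = {n.strip().lower() for n in condition_names}
    types = {TYPE_SPECIFIC[n] for n in names if n in TYPE_SPECIFIC}
    if len(types) == 1:
        return next(iter(types))
    if len(types) > 1:
        return "conflicting"   # both types coded; left to the phenotype logic
    if names & UNSPECIFIED_DIABETES:
        return "undetermined"  # diabetes is present, but the type is unknown
    return "none"


def keep_for_phenotyping(patients):
    """Drop patients whose diabetes type cannot be determined from codes."""
    return {
        pid: conds
        for pid, conds in patients.items()
        if diabetes_status(conds) != "undetermined"
    }
```

Note that the complication records are still useful as evidence that diabetes is present; they are only insufficient on their own to pick a type.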


We used to differentiate between type 1 and type 2 diabetes based on the age at first diagnosis, since type 1 diabetes happens rather early and type 2 diabetes rather later in life. But that is just a proxy, saying for example that anyone who is diagnosed above age 30 or 40 likely has type 2 diabetes. With obesity increasing at a young age, that proxy may not be a good one any more :wink:
Anyway, this also implies that you need a long patient history to make sure that what you see really was the first diagnosis, which is not always the case when using claims or when only seeing the hospital stays of a patient.
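A minimal sketch of that age proxy, including the history requirement, might look like the following. The cutoff of 30 years is one of the two example values mentioned above; the 365-day minimum lookback and all function and parameter names are assumptions made for illustration, not validated thresholds.

```python
from datetime import date
from typing import Optional


def guess_type_by_age(
    birth_date: date,
    first_dx_date: date,
    observation_start: date,
    age_cutoff_years: int = 30,    # example value; 40 is the other one mentioned
    min_lookback_days: int = 365,  # assumed threshold, not from the thread
) -> Optional[str]:
    """Guess the diabetes type from age at first recorded diagnosis.

    Returns None when there is too little prior history to trust that
    the first recorded diagnosis is really the first diagnosis.
    """
    lookback_days = (first_dx_date - observation_start).days
    if lookback_days < min_lookback_days:
        return None

    # Age in completed years on the diagnosis date.
    had_birthday = (first_dx_date.month, first_dx_date.day) >= (
        birth_date.month,
        birth_date.day,
    )
    age = first_dx_date.year - birth_date.year - (0 if had_birthday else 1)

    return "T2DM" if age >= age_cutoff_years else "T1DM"
```

Returning `None` rather than a guess for short histories makes the missing-history problem explicit, so those patients can be counted and handled separately instead of being silently misclassified.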


Hi @Akshay and all! Did you finally implement this algorithm using OMOP CDM data? Did it work well? I am currently looking for a phenotype that distinguishes T1D and T2D in the setting of OMOP. All the algorithms that I know of are based on codes, treatments and age at diagnosis, and there are a few out there in the literature, but I can’t find any in the OHDSI GitHub or forum that has been used in OMOP CDM (most probably because I am not looking well enough).

Hi @david_vizcaya,

Actually, I also tried running this algorithm on our CDM data and it didn’t work well. It misclassified a lot of subjects.

When clinicians reviewed the rules, they also felt that the algorithm is too rigid and cannot really work.


Thanks @SELVA_MUTHU_KUMARAN! To your knowledge, is there any alternative phenotype that works well in CDM?

Hi @david_vizcaya,

Actually, for T1DM, I couldn’t find any algorithm that could identify patients accurately.

But for T2DM, I used the algorithm below in our CDM, and in our experience with our dataset it was better than the PheKB T2DM algorithm (for identifying T2DM patients only). You can see whether it helps.

Hope this helps.



My understanding is that data sources will differ in the degree to which T1 is captured as a distinct entity from T2. So the question of the best phenotype definition might not have a clear answer apart from the specific sites where it is to be implemented.
