
Welcome to Phenotype Phebruary!


It’s finally arrived. That wonderful month when you can put all your troubles aside, cast off those New Year’s resolutions you’ve already failed at, enjoy the freezing cold Northeastern US weather or the Australian heat, and JUST FOCUS ON PHENOTYPING!

28 days, 28 phenotypes. That’s our target for OHDSI in 2022. Are you ready?

For those who missed it, the recording of our OHDSI community call is here; in it, I tried to provide a little background and motivation for this big community push.

But to summarize it here: phenotypes are the foundational element in almost every real-world analysis we do in OHDSI; they are the natural bridge between our standardized data (the OMOP CDM) and our standardized analytics (such as ATLAS and the HADES packages). The reliability of the evidence we generate often lives and dies by the quality of the phenotypes that we use as inputs: the indications, exposures, outcomes, and other features that we put into our analyses. And yet, across the broader research enterprise, the science of phenotype development and evaluation is relatively immature. The world doesn’t yet have consensus best practices to design phenotypes, agreed standardized tools to build them, or consistent, reproducible methods to evaluate them. Our phenotypes are fraught with substantial measurement error; we know we likely have suboptimal sensitivity, specificity, and positive predictive value, yet we don’t consistently estimate the measurement error and even more rarely integrate it into our analyses. No large-scale regression or fancy deep learning model is sufficient to solidify the house of cards that our analyses rest upon if we have suspect phenotypes.
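
To make the three measures named above concrete, here is a minimal Python sketch. It is not part of the original post and is not how PheValuator or any other OHDSI tool computes these values; the class name, function name, and counts are all hypothetical and chosen only for illustration. It assumes we already know, for some reference standard, how many people the phenotype algorithm classified correctly and incorrectly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical counts comparing a phenotype algorithm to a reference standard."""
    true_pos: int   # algorithm says case, reference says case
    false_pos: int  # algorithm says case, reference says non-case
    false_neg: int  # algorithm says non-case, reference says case
    true_neg: int   # algorithm says non-case, reference says non-case


def phenotype_performance(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, and positive predictive value.

    Assumes every denominator is non-zero; a real evaluation would need
    to handle empty cells explicitly.
    """
    return {
        # Share of true cases that the algorithm finds.
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        # Share of true non-cases that the algorithm correctly excludes.
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        # Share of algorithm-identified cases that are true cases.
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
    }


# Made-up numbers, for illustration only.
counts = ConfusionCounts(true_pos=80, false_pos=20, false_neg=40, true_neg=860)
metrics = phenotype_performance(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the algorithm looks precise (PPV of 0.8) while still missing a third of the true cases, which is exactly the kind of error that stays hidden unless it is estimated.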

And yet, within OHDSI, together we have created the world’s largest open-science community for observational health research, with a distributed data network of >300 databases harmonized to a common data model, collectively representing more than 10% of the world’s population. Together, we have conducted methodological research to evaluate and establish scientific best practices for observational analysis. Together, we have developed open-source analytic tools that make it possible to conduct large-scale analyses for clinical characterization, population-level estimation, and patient-level prediction across our network. Together, we have applied our best practices and tools to generate reliable evidence that has been impactful to the lives of patients around the world. And together, I am confident that we can expand our impact even further by developing a system to generate evidence that characterizes disease and treatment utilization, estimates the effects of medical interventions, and predicts patient outcomes across our network of observational healthcare databases. But to reach for this lofty aspiration, we have to start by building a firm foundation. We need to do the hard-but-necessary work of developing phenotypes for all the health outcomes we wish to investigate, and evaluate those phenotype algorithms across our network so we can build confidence that we understand the performance of those algorithms and can interpret the results of analyses using those algorithms appropriately.

Over the last several years, we have seen a lot of impressive work across our community paving a path to make it possible to take on this task. We have a wide array of community-developed open source tools to support aspects of the phenotype development and evaluation process (ATLAS, CapR, PHOEBE, APHRODITE, CohortDiagnostics, PheValuator…). Now is the time to put everything together with a community effort to build a community resource that can support all of our community analysis activities.

But ‘phenotyping all health outcomes’ may sound a bit overwhelming. Where do we start? During today’s OHDSI community call, I asked the community to share their thoughts on ‘What phenotypes would you like to develop and evaluate together?’, and it was great to see so much active participation. Below is the list of phenotype targets that received at least 5 votes:

  • type 2 diabetes mellitus
  • Alzheimer disease
  • heart failure
  • pulmonary embolism
  • type 1 diabetes mellitus
  • suicidal thoughts or behavior
  • deep vein thrombosis
  • Long COVID
  • non-small cell lung cancer (NSCLC)
  • depression
  • hypertension
  • anxiety
  • acute myocardial infarction
  • cervical cancer
  • multiple sclerosis
  • Guillain-Barre syndrome
  • epilepsy
  • kidney stones
  • hemorrhagic stroke
  • systemic lupus erythematosus
  • prostate cancer
  • diabetes mellitus
  • Parkinsonism
  • autistic disorder
  • rheumatoid arthritis
  • migraine
  • hyperthyroidism
  • cardiomyopathy
  • homelessness
  • neutropenia
  • ischemic stroke
  • Crohn’s disease
  • psoriasis
  • anaphylaxis
  • Asthma/COPD
  • pregnancy
  • seizure
  • multiple myeloma
  • atrial fibrillation
  • melanoma
  • acute hepatitis
  • triple-negative breast cancer
  • acute hepatic failure
  • attention deficit hyperactivity disorder
  • ulcerative colitis
  • bipolar disease
  • peripheral artery disease
  • HIV
  • gastrointestinal bleeding
  • hypoglycemia
  • pulmonary arterial hypertension
  • hepatic cirrhosis
  • NASH (non-alcoholic steatohepatitis)
  • acute kidney injury

Phenotype Phebruary. 28 days. 28 phenotypes. Our daily task: given a phenotype target, create a clinical description, review prior work, develop one or more cohort definitions using OHDSI tools (like PHOEBE, ATLAS, APHRODITE), evaluate them using OHDSI tools (like CohortDiagnostics, PheValuator), and write a summary of findings.

How can you get involved in Phenotype Phebruary?

  1. Join the conversation!
     • Discussions will be here on forums.ohdsi.org
     • Each day will be a new thread
       – Ex: Look for: “Phenotype Phebruary Day 1 – Type 2 diabetes mellitus”
     • Explore the definitions and review the results provided
     • Reply with your thoughts, reflections, insights and questions
  2. Evaluate the cohort definitions in your data!
     • Execute cohort definitions and CohortDiagnostics in your CDM
     • Share insights you learn from your data on the forums
     • Share results to compile across the network on data.ohdsi.org
  3. Lead a discussion!
     • I plan to get the ball rolling by leading the discussion for the first 7 days. If you would like to lead a phenotype development and evaluation activity yourself, contact ryan@ohdsi.org or chat with me in OHDSI MSTeams, and tell me your desired phenotype target and the calendar date you want to commit to.

Let’s go!


Thank you for leading this effort, @Patrick_Ryan

As a community we need to

To get access to ATLAS, please sign up here: https://forms.gle/6fxcZFyufhL39pLj7

A lot to accomplish this month!