OHDSI Phenotype Phebruary 2024 and workgroup updates

There is no meeting on 7/26/2024

Cancellation notice: OHDSI Phenotype Workgroup meeting on 7/26/2024 is cancelled.

Hi all,
The CIPHER-OHDSI Integration Pilot meeting is cancelled today 8/8/24. Offline, we are working to upload the first pilot phenotypes to the CIPHER website (CIPHER - VA).
Best,
Jackie

Hi all,

Please see the agenda for our CIPHER-OHDSI Integration Pilot meeting tomorrow at 2pm EST.

  • Review mapping from OHDSI to CIPHER metadata fields and next steps for upload
  • Discuss possible AMIA Informatics Summit abstract
  • Discuss OHDSI symposium and possible presentation on integration pilot

Here is the call in information:

Microsoft Teams

Join the meeting now

Meeting ID: 235 435 061 844

Passcode: SUfTp4

Dial in by phone

+1 872-701-0185,770508500# United States, Chicago

Phone conference ID: 770 508 500#

Hi all, please see our call notes. We had a productive review of the field mappings between the libraries and are getting close to finalizing them before adding the initial pilot OHDSI phenotypes to the CIPHER library.

Best,

Jackie

  • GR wrote script to pull fields from OHDSI library and populate into CIPHER metadata collection sheet for API uploads
  • Review of CIPHER → OHDSI metadata mapping
    • OHDSIcohortid - not represented in CIPHER
      • Used to identify each phenotype in OHDSI library
      • ACTION: CIPHER consider how to incorporate OHDSIcohortid
    • Category - ACTION: GR update in sheet
      • General = non-lab or medication phenotypes
      • Lab = Measurement in OHDSI
      • Medication = Drug in OHDSI
    • Keywords
      • These are used to facilitate phenotype search
      • Currently OHDSI hashtags listed here - these hashtags have accepted meaning within OHDSI, for example “#accepted”, “#level2”
      • ACTION: CIPHER consider how to incorporate hashtags, may leave this section blank for now
    • Disease domain
      • This can be found by browsing the OHDSI hierarchy in ATLAS but there is not an easy mapping from OHDSI to CIPHER categories
    • Contact
      • OHDSI library uses orcid instead of managing emails
      • ACTION: GR create a placeholder email for now
      • ACTION: CIPHER consider integration of orcids
    • Data sources
      • OHDSI may not capture the name of the health system where a partner created the phenotype; list OMOP here
    • Phenotype description
      • ACTION: GR update to use cohortnamelong → transform so it’s not in markdown
    • Population description
      • Consider adding standard language here
    • Data used from/to
      • Will populate from OHDSI if available
    • Algorithm description
      • ACTION: GR update to use cohort entry event → transform so it’s not in markdown
    • ICD codes/algorithm components
      • ACTION: Leave blank; GR to provide spreadsheet with codes used across vocabularies, concept id, concept code, and standard/non standard indicator (OK if ICD lists are long)
      • Note CIPHER is updating algorithm components section on website to add OMOP concept ids
  • OHDSI forum posts
    • Posting is usually the first step in contribution and the post may contain information useful to the following fields:
      • Publication
      • Acknowledgment
      • Population description
    • ACTION: CIPHER consider keeping post URL in one section of metadata and listing standard language about forum post in other fields
  • Version
    • OHDSI phenotype library is version controlled; need to represent version on phenotype page
    • ACTION: CIPHER determine how to represent OHDSI library version
  • Next steps
    • GR to share updated sheet with changes above with CIPHER
    • CIPHER create path for incorporating key fields from OHDSI that do not map to current standard
    • OHDSI/CIPHER agree on final mapping; CIPHER populates pilot phenotypes into library
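
A minimal sketch of how a few of the mapping rules in these notes could be expressed in code. This is not the script GR wrote. The input keys (`cohortId`, `cohortNameLong`, `domain`) and the output column names are assumptions made for illustration, and the markdown stripping is deliberately rough.

```python
import re

# Category rule from the notes: Lab = Measurement, Medication = Drug,
# everything else = General.
DOMAIN_TO_CATEGORY = {"measurement": "Lab", "drug": "Medication"}


def cipher_category(ohdsi_domain: str) -> str:
    """Map an OHDSI domain name to a CIPHER category."""
    return DOMAIN_TO_CATEGORY.get(ohdsi_domain.strip().lower(), "General")


def strip_markdown(text: str) -> str:
    """Rough markdown-to-plain-text conversion (illustrative only)."""
    # [label](url) -> label
    text = re.sub(r"\[([^\]]+)\]\([^)]+\)", r"\1", text)
    # Drop heading and blockquote marks at the start of a line.
    text = re.sub(r"^[ ]{0,3}(#{1,6}|>)\s*", "", text, flags=re.MULTILINE)
    # Drop emphasis and inline-code marks.
    text = re.sub(r"\*\*|__|\*|`", "", text)
    # Collapse whitespace, including newlines.
    return re.sub(r"\s+", " ", text).strip()


def to_cipher_row(record: dict) -> dict:
    """Build one CIPHER metadata row from a hypothetical OHDSI library record."""
    return {
        "OHDSIcohortid": record["cohortId"],
        "Category": cipher_category(record.get("domain", "")),
        "Keywords": "",  # per the notes, may be left blank for now
        "Data sources": "OMOP",
        "Phenotype description": strip_markdown(record.get("cohortNameLong", "")),
    }
```

Fields with no agreed mapping yet (disease domain, contact, version) are left out of the sketch rather than guessed.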

Thank you @jhonerlaw - here is the updated mapping output

https://ohdsiorg.sharepoint.com/:x:/s/Workgroup-PhenotypeDevelopmentandEvaluation-VACipherandOHDSIPhenotypeLibraryintegration/EYRg8Hj0pOpCt08GcVfU0FsBC0icwex70Z-mEFJajPqajQ?e=5Rsazm

https://ohdsiorg.sharepoint.com/:x:/s/Workgroup-PhenotypeDevelopmentandEvaluation-VACipherandOHDSIPhenotypeLibraryintegration/EVZCXuaWMOpMoRNKtCz8Z1UBf30EHcyyHvrJ9tMbSsiabQ?e=e4ZkLE

Here are our notes for today:

  • Pilot phenotype integration
  • AMIA informatics summit 2025 abstract
    • Frame abstract as a test of phenotype library integration
    • ACTION: Jackie share draft with Gowtham and Azza
    • Abstract due 9/17/24
  • OHDSI symposium preparation
    • Jackie and Anne will attend
    • CIPHER plan for 30 min slot to present and describe collaboration

OHDSI Phenotype Development and Evaluation Workgroup Meeting - September 13th, 2024

Agenda:

  1. Planning for OHDSI Symposium 2024 Workgroup Meeting

    • Discuss projected attendee numbers and finalize the agenda details.
  2. Scientific Discussion: Phenotype Stability

    • Diagnostics and strategic planning for the upcoming symposium.
  3. Update on the Dermatomyositis Network Study.

  4. Measurement error

  5. VA Cipher and OHDSI Phenotype Library collaboration

Summarized using gpt

Meeting Overview

Date: September 13, 2024
Time: 9:00 AM - 10:00 AM EST
Attendees: Gowtham Rao, Azza Shoaibi, Jacqueline Honerlaw, Jamie Weaver, Joel Swerdel, Christopher Mecoli, Andrew Williams, Hayden Spence, Monika, and others.

Agenda

  1. Planning for OHDSI Symposium 2024 Workgroup Meeting
  2. Discuss Projected Attendee Numbers and Finalize Agenda Details
  3. Scientific Discussion: Phenotype Stability
  4. Diagnostics and Strategic Planning for the Upcoming Symposium
  5. Update on the Dermatomyositis Network Study
  6. Measurement Error
  7. VA Cipher and OHDSI Phenotype Library Collaboration

Key Discussions and Updates

1. Planning for OHDSI Symposium 2024 Workgroup Meeting

  • Symposium Workshop Proposed Agenda (October 24th, 8:00 AM - 1:00 PM):
    • Welcome and Review of OKRs (10 min): Gowtham Rao
    • Method Update 1: Integrating Measurement Error into Study Estimates (20 min): James Weaver
    • Method Update 2: Probabilistic Phenotyping (20 min): Joel Swerdel
    • Expanding and Promoting the Library: Integrating OHDSI Library into CIPHER (20 min): Jacqueline Honerlaw
    • Clinical and Network Studies Update: Dermatomyositis Phenotype Development and Evaluation Network Study (20 min): Dr. Mecoli
    • Method Update 3: Objective Diagnostics (60 min): Gowtham and Azza will run an experiment
    • Next Year Planning: Group Exercise (60 min)

2. Discuss Projected Attendee Numbers and Finalize Agenda Details

  • Joel: Confirms on-site attendance, 20 minutes is good.
  • Chris: Won’t attend in person, but Will, Kelly, and Ben will be on-site. Focus on lessons learned from phenotype network studies vs. clinical insights.
  • Jackie: Confirms on-site attendance with Anne. Plans a quick demo of the site, followed by a review and a fun exercise comparing five phenotypes (OHDSI vs. non-OHDSI - treasure hunt).
  • Jamie: Confirms on-site attendance, 20 minutes is good (asked for 30 minutes).

3. Scientific Discussion: Phenotype Stability

  • Objective Diagnostics for Phenotypes:
    • Focus on two diagnostics: stability over time within a data source and consistency of incidence rates across data sources.
    • Use of statistical tests and visualizations to determine stability and identify significant deviations.
    • Discussion on the need for confidence intervals and the impact of small sample sizes on the results.

4. Diagnostics and Strategic Planning for the Upcoming Symposium

  • Objective Diagnostics Experiment:
    • Plan to run an experiment during the symposium to validate the stability of phenotypes using statistical methods.
    • Participants will review results and validate or disagree with the algorithm’s findings.

5. Update on the Dermatomyositis Network Study

  • Dr. Mecoli: Will provide updates on the study, focusing on the results and insights gained from the network.

6. Measurement Error

  • James Weaver: Will discuss integrating measurement error into study estimates, particularly incidence rates.

7. VA Cipher and OHDSI Phenotype Library Collaboration

  • Jackie Honerlaw: Provided an update on the CIPHER integration and the finalization of an abstract for the AMIA Informatics Summit.
  • Plans to review and integrate OHDSI phenotypes into CIPHER, with a test involving around 25 phenotypes.
  • Future Announcements: Once the 25 phenotypes are tested and integrated, an announcement will be made in the OHDSI community call.

Future Planning and Discussions

  • LLM:
    • As a Focus of Work: Workgroup should consider focusing on LLMs (Large Language Models) for phenotyping.
    • Deep Phenotyping Using Multi-Modal Data: Discussion on using genomics, imaging, waveform, and text data.
    • Azza’s Work Outside the Workgroup: LLM (KEEPER, literature search) - seeking volunteers to educate the group.
    • Models Trained on EHR Data: Hayden brought up FEMR models and their potential for phenotyping.
    • Journal Club: Jackie suggested that the latest issue of JAMIA, which focuses on LLMs, could be a journal club topic. Monika offered to lead once expectations are clear.

Conclusion

  • Call for Volunteers: Volunteers are needed to help create a system for collaborative evaluation of phenotype stability.
  • Jamie Weaver’s Upcoming Work: A teaser for Jamie’s work on incidence rate correction was presented, with more details to be shared in future meetings.

Summary of the Presentation on Phenotype Stability and Objective Diagnostics by Azza Shoaibi and Gowtham Rao

Overview

Azza Shoaibi and Gowtham Rao provided an in-depth discussion on the development and application of objective diagnostics for evaluating phenotype stability. The presentation focused on two main aspects: the stability of phenotypes over time within a single data source and the consistency of incidence rates across multiple data sources.

Key Points

  1. Objective Diagnostics for Phenotypes
  • The goal is to develop diagnostics that can objectively assess the stability and reliability of phenotypes.
  • These diagnostics use statistical methods to evaluate the performance of phenotypes.
  2. Phenotype Stability Over Time
  • Purpose: To determine if a phenotype remains stable over time within a single data source.
  • Method: Utilizes incidence rate diagnostics, plotting incidence rates over calendar years.
  • Visualization:
    • Black Line: Represents the observed incidence rate over time.
    • Dashed Line: Represents the expected trend, modeled using a Poisson spline model with three knots.
  • Statistical Test: A likelihood function compares the area under the observed and expected incidence rate curves. A deviation greater than 25% (ratio > 1.25) indicates instability.
  • Example: The phenotype for pure red cell aplasia showed significant instability in certain data sources, indicating it should not be used across all time periods without adjustments.
  3. Consistency Across Data Sources
  • Purpose: To assess if a phenotype produces consistent incidence rates across different data sources.
  • Method: Compares incidence rates across multiple data sources to identify significant deviations.
  • Future Work: Formalizing a statistical test to evaluate consistency across data sources.
  4. Challenges and Considerations
  • Small Sample Sizes: Variability in incidence rates can be influenced by small sample sizes, necessitating the inclusion of confidence intervals in visualizations.
  • Natural Changes: Phenotypes may naturally change over time due to new guidelines, medications, or coding practices. The diagnostics aim to identify significant deviations that could impact study results.
  5. Future Directions
  • Experiment at Symposium: Plan to run an experiment during the OHDSI Symposium to validate the stability diagnostics using human review.
  • Tool Development: Call for volunteers to help create a system for collaborative evaluation of phenotype stability.
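
One possible reading of the area-ratio check described under Phenotype Stability Over Time, as a minimal sketch. It assumes the expected curve has already been produced by a separately fitted model (the Poisson spline fit is not shown), that areas are computed with the trapezoidal rule, and that the ratio is taken symmetrically so that a shortfall is flagged the same way as an excess. None of these choices is confirmed by the presentation; the function names are hypothetical.

```python
from typing import Sequence


def trapezoid_area(years: Sequence[float], rates: Sequence[float]) -> float:
    """Area under a rate curve by the trapezoidal rule."""
    if len(years) != len(rates) or len(years) < 2:
        raise ValueError("need at least two matching (year, rate) points")
    return sum(
        (years[i + 1] - years[i]) * (rates[i] + rates[i + 1]) / 2.0
        for i in range(len(years) - 1)
    )


def area_ratio(
    years: Sequence[float],
    observed: Sequence[float],
    expected: Sequence[float],
) -> float:
    """Ratio of the larger area to the smaller one (always >= 1)."""
    obs_area = trapezoid_area(years, observed)
    exp_area = trapezoid_area(years, expected)
    if obs_area <= 0 or exp_area <= 0:
        raise ValueError("areas must be positive to form a ratio")
    return max(obs_area / exp_area, exp_area / obs_area)


def is_unstable(
    years: Sequence[float],
    observed: Sequence[float],
    expected: Sequence[float],
    threshold: float = 1.25,  # the 25% deviation mentioned in the notes
) -> bool:
    """Flag a phenotype whose observed area deviates beyond the threshold."""
    return area_ratio(years, observed, expected) > threshold
```

For example, `is_unstable([2018, 2019, 2020, 2021], [10, 10, 20, 20], [10, 10, 10, 10])` compares an observed area of 45 with an expected area of 30 and flags the phenotype. A plain area ratio can miss curves that cross and cancel out, which is one reason the formal test may differ from this sketch.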

Conclusion

The presentation highlighted the importance of developing robust diagnostics to ensure the reliability of phenotypes used in research. By identifying and addressing instability and inconsistency, researchers can improve the accuracy and validity of their studies. The upcoming experiment at the OHDSI Symposium will further refine these methods and engage the community in collaborative evaluation.

Jacqueline Honerlaw

  • Provided an update on the integration of OHDSI phenotypes into the CIPHER system.
  • Mentioned the team recently met to finalize their abstract for the AMIA Informatics Summit, due next week.
  • Currently reviewing the initial integration of OHDSI phenotypes into CIPHER and plan to test around 25 phenotypes on their test site.
  • Once the review is complete, they will push the phenotypes to production and share the links with the group.
  • Highlighted that after the 25 phenotypes are tested and integrated, an announcement will be made in the OHDSI community call.
  • Expressed willingness to attend the community call to make the announcement once the integration is complete.
  • Suggested using the latest issue of the Journal of the American Medical Informatics Association (JAMIA), which focuses on Large Language Models (LLMs), for a journal club session.
  • Proposed discussing a few articles from the issue if no one has the bandwidth to lead a session.
  • Emphasized the collaborative efforts and the importance of community engagement in these initiatives.

Joel Swerdel

  • Provided insights into probabilistic phenotyping and its progress.
  • Agreed that 20 minutes would be sufficient for his presentation but emphasized the need for discussion time.
  • Suggested that the discussion on phenotype stability might require more than an hour due to its complexity.
  • Inquired about the use of three knots in the Poisson spline model for expected trends.
  • Highlighted the importance of understanding the natural changes in incidence rates over time.
  • Emphasized that while the true incidence rate of a condition might change due to external factors like new guidelines or medications, the phenotype algorithm’s incidence rate should remain stable unless there is a significant deviation.
  • Suggested that the diagnostics should be able to split data into periods to identify when a phenotype is usable.
  • Underscored the need for thorough evaluation and discussion of phenotype stability to ensure reliable research outcomes.

Andrew Williams

  • Emphasized the importance of focusing on computable definitions for all study components, not just health states.
  • Pointed out that much of the current work in the community, as well as in other communities using EHR-based definitions, tends to focus more on characterizing health states rather than the components of care.
  • Argued that a meticulous articulation of all care components, including preconditions, follow-up, and management of identified conditions, is crucial for valid research.
  • Supported Hayden’s suggestion about the need for detailed care pathways and typical care elements.
  • Highlighted the potential of using Large Language Models (LLMs) for phenotyping and deep phenotyping using multi-modal data.
  • Suggested that these areas should be included in future planning.
  • Stressed the importance of understanding the implications of shifts in phenotype stability and the need for consensus on how to handle these shifts.

Kevin Haynes

  • Emphasized the necessity of including confidence intervals in visualizations to account for small sample sizes.
  • Pointed out that small sample sizes can significantly impact the variability of incidence rates.
  • Suggested that without confidence intervals, it is challenging to interpret the observed data accurately.
  • Highlighted that small sample sizes might lead to misleading conclusions about phenotype stability.
  • Recommended displaying confidence intervals to provide more context and help in understanding the true variability.
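
To illustrate why small counts matter here, the following is a minimal sketch of an approximate 95% confidence interval for an incidence rate. It uses a normal approximation on the log scale (standard error of the log rate taken as 1/sqrt(events)), which is an assumption chosen for simplicity and not a method named in the meeting. It is undefined for zero events; an exact Poisson interval would be preferable for very small counts.

```python
import math
from typing import Tuple


def incidence_rate_ci(
    events: int, person_years: float, z: float = 1.96
) -> Tuple[float, float, float]:
    """Return (rate, lower, upper) using a log-scale normal approximation."""
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = events / person_years
    # exp(z * SE) with SE(log rate) approximated by 1 / sqrt(events)
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor
```

With the same underlying rate, 4 events over 1,000 person-years gives a much wider interval than 400 events over 100,000 person-years, so a year-to-year jump in a small data source may sit well within its own uncertainty.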

Jamie Weaver

  • Measurement Error: Focused on integrating measurement error into study estimates, particularly incidence rates.
  • Time Allocation: Suggested extending the discussion from 20 minutes to 30 minutes for a more in-depth exploration.
  • Meta-Analytic Incidence Rates: Proposed creating a plot where each incidence rate on the fluctuating line is a meta-analytic incidence rate from multiple databases, fitting a smoothed curve across all data.
  • Phenotype Stability: Highlighted that this method could help identify real incidence rate variations and improve the understanding of phenotype stability.
  • Ongoing Work: Mentioned his ongoing work on incidence rate correction, which he plans to apply to phenotypes developed during Phenotype Phebruary.
  • Community Engagement: Offered to provide a teaser of this work to interested participants after the meeting, indicating his commitment to advancing the methodology and engaging with the community for feedback and collaboration.
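
The meta-analytic plot idea could be prototyped along these lines. This is a sketch under stated assumptions, not Jamie's method: it pools per-database rates for one calendar year with fixed-effect inverse-variance weights on the log scale, approximating the variance of each log rate as 1/events, and it skips databases with no events. The smoothing step across years is not shown, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pooled_rate(events: Sequence[int], person_years: Sequence[float]) -> float:
    """Fixed-effect inverse-variance pooled incidence rate on the log scale."""
    if len(events) != len(person_years):
        raise ValueError("events and person_years must have the same length")
    weighted_sum = 0.0
    total_weight = 0.0
    for e, py in zip(events, person_years):
        if e <= 0 or py <= 0:
            continue  # log rate undefined; database skipped in this sketch
        weight = float(e)  # 1 / var(log rate), with var approximated by 1 / e
        weighted_sum += weight * math.log(e / py)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no database contributed events")
    return math.exp(weighted_sum / total_weight)


def pooled_rates_by_year(
    data: Dict[int, List[Tuple[int, float]]]
) -> Dict[int, float]:
    """Pool (events, person_years) pairs across databases for each year."""
    return {
        year: pooled_rate([e for e, _ in rows], [py for _, py in rows])
        for year, rows in sorted(data.items())
    }
```

A random-effects model would be the natural alternative if between-database heterogeneity is expected, which the consistency discussion in this meeting suggests it often is.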

Hayden Spence

  • EHR Data Models: Discussed the potential of models trained on Electronic Health Records (EHR) data for phenotyping, specifically mentioning the FEMR models from Stanford’s Shah Lab.
  • Predictive Models: Highlighted that these models, although not necessarily Large Language Models (LLMs), are trained to interpret healthcare records as if EHR is the language, using OMOP as a framework.
  • Patterns in Patient Care: Emphasized the importance of understanding how patients move through the healthcare system, suggesting that certain events and patterns in patient care can be indicative of specific conditions.
  • Adaptation for Phenotyping: Pointed out that while these models are currently predictive, they could be adapted for phenotyping by identifying expected patterns of care for conditions like pneumonia or heart disease.
  • Stability Diagnostics: Suggested that the stability diagnostics could be applied not only at the population level but also within individual patients over time.
  • Consistency Within Patients: This approach could help identify whether a phenotype remains consistent within a person, which is crucial for conditions that may have fluctuating diagnoses.
  • Advanced Models and Care Pathways: His contributions underscored the potential of advanced models and detailed care pathways in improving phenotype stability and reliability.

Christopher Mecoli

  • Phenotype Instability: Acknowledged that there would likely be instability in phenotypes over time, which is not necessarily a negative outcome.
  • Expected Changes: Explained that such changes could be expected due to new guidelines, medications approved by the FDA, or other factors.
  • Data Interpretation: Emphasized that these changes are important to be aware of as they inform how data should be interpreted.
  • Understanding Instability: Highlighted that understanding the reasons behind phenotype instability is crucial for accurate data analysis and interpretation.
  • Research Validity: His contributions underscored the need to account for these variations when conducting studies to ensure the reliability and validity of the research findings.

Monika

  • LLM Expertise: Shared her expertise and experience with Large Language Models (LLMs) and artificial intelligence (AI).
  • Learning and Exploration: Mentioned that she has been learning about LLMs, AI, and related technologies, including creating embeddings and implementing pipelines.
  • Current Projects: Currently exploring how to create embeddings, implement retrieval-augmented generation (RAG) pipelines, and address questions using web scraping and specific file inputs to produce targeted outputs.
  • Moderation Offer: Expressed her willingness to help moderate discussions on LLMs and their application to phenotyping.
  • Scope and Expectations: Wanted to understand the scope and expectations before committing to moderating discussions.