Phenotype Phebruary 2025 Office Hours – February 5, 2025
Topics discussed:
1. File Management & Team Coordination
- Centralized Repository in Phenotype Workgroup Folder
- Future Storage in GitHub
- Use of Teams Channels & Folder Organization
- Practical Tips for Document Sharing & Searching
2. Searching & Reusing Existing Phenotypes
- OHDSI Phenotype Library & CSV Lookup
- Shiny App for Cohort Diagnostics
- VA’s Cipher Library
- Other Resources: Darwin, Sentinel, Forum Posts
3. Phenotype Development & Validation Approaches
- Pass/Fail Rubrics & Decision Making
- Tools: Cohort Diagnostics, PheValuator, Keeper
- Balancing Qualitative & Quantitative Measures
- Importance of Iterative Reviews & Expert Input
4. Discussion on Pediatric Vision Screening Use Case
- Non-Disease Phenotype Strategies
- Role of Comparator Cohorts (Well-Child Visits, Vaccinations)
- Procedure-Based Logic vs. Condition-Based Logic
- Documenting Known Coding Gaps & Uncertainties
5. Discussion on Ulcerative Colitis
- Kevin’s IBD Phenotypes
- Overlaps & Synergy with UC/CD Definitions
- “Enhanced” Treatment Pathways
- Transition from J&J Library to OHDSI Library
Topic 1: File Management & Team Coordination
The group agreed to use the Teams Phenotype Workgroup folder as the centralized location for all Phenotype Phebruary materials, ensuring easy discovery and consistent organization. Ultimately, finalized study cohorts and code will move to GitHub for open collaboration. Each study has a dedicated subfolder, and leads should confirm access rights for all contributors. This setup aims to minimize confusion, promote transparency, and create a smooth hand-off to GitHub once the cohorts are finalized.
Centralized Repository in Phenotype Workgroup Folder
- Several participants [Inquiry] asked where to store new or updated study documents.
- Anna [Advocacy] explained that all phenotype-related content for “Phenotype Phebruary” should be placed in the Phenotype Workgroup Files (Teams → Phenotype Workgroup → Documents → General → Phenotype Phebruary 2025 subfolder).
- Rationale: Keeping everything in one place ensures the workgroup can easily review and coordinate, especially during the month-long phenotype push.
Future Storage in GitHub
- Anna [Advocacy] noted that while Teams will function as the short-term workspace, GitHub will be the eventual long-term repository for finalized cohorts and study packages.
- The group [Agreement] accepted that each study’s ultimate destination is a dedicated GitHub repository to ensure transparent version control and public access.
Use of Teams Channels & Folder Organization
- Chris [Inquiry] highlighted confusion about which Teams channel or folder to use for their specific study materials.
- Anna [Advocacy] clarified that all Phenotype Phebruary content should live in the Phenotype Workgroup’s general channel and be placed under the appropriate subfolder named by week (e.g., Week 1, Week 2).
- Action Item: Each study lead will direct collaborators to the correct Teams subfolder for uploading relevant files.
Practical Tips for Document Sharing & Searching
- Christopher M. [Inquiry] raised questions about how to efficiently find existing files and how to share them with the right collaborators.
- Anna [Advocacy] suggested using the “Files” tab in the channel and employing consistent naming conventions for clinical descriptions and spreadsheet logs.
- Information Gap: Some participants lack admin rights to add new members. Resolution: Anna [Advocacy] volunteered to add missing members upon request.
Implicit Assumptions & Information Gaps
- Assumption: Everyone in the workgroup can access Teams’ general channel and subfolders; some participants might still need channel permissions.
- Gap: No automated process to ensure new members are promptly added; relies on manual requests to the channel admins.
Topic 2: Searching & Reusing Existing Phenotypes
The group discussed various resources for locating existing phenotypes, highlighting the OHDSI Phenotype Library (with a supplemental CSV file), a Shiny diagnostics app, VA’s Cipher platform, and external sources like Darwin and Sentinel. While each provides a starting point for defining cohorts, users must verify that available definitions align with their precise research questions. Conversion to Atlas-compatible logic and the need for context-specific adjustments remain crucial steps.
OHDSI Phenotype Library & CSV Lookup
- Chris [Inquiry] expressed difficulty searching through the OHDSI Phenotype Library—particularly having to use the CSV file for definitions.
- Anna and Gowtham [Advocacy] acknowledged the suboptimal nature of the current CSV-based lookup.
- Action Item: Use the CSV along with references in the library’s GitHub repository to locate existing cohort definitions.
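The CSV lookup described above can be sketched as a simple keyword filter. This is a hedged illustration only: the column names (`cohortId`, `cohortName`), the sample rows, and the inline CSV stand-in are assumptions, not the library's actual schema; check the file in the Phenotype Library's GitHub repository before relying on specific field names.

```python
# Hypothetical sketch: keyword search over the Phenotype Library's CSV export.
# Column names ("cohortId", "cohortName") and the sample rows are assumptions;
# verify them against the real file in the library's GitHub repository.
import csv
import io

# Stand-in for the downloaded CSV; in practice, open the real file instead.
SAMPLE_CSV = """cohortId,cohortName
23,Ulcerative colitis
344,Crohns disease
56,Type 2 diabetes mellitus
"""

def find_cohorts(csv_text: str, keyword: str) -> list[dict]:
    """Return rows whose cohort name contains the keyword (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if keyword.lower() in row["cohortName"].lower()]

matches = find_cohorts(SAMPLE_CSV, "colitis")
print(matches)  # one row: the ulcerative colitis entry
```

Until a richer search front-end exists, a small script like this can at least narrow the CSV down to candidate definitions before opening them in the GitHub repository.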
Shiny App for Cohort Diagnostics
- Anna [Advocacy] shared that a Shiny application at data.ohdsi.org can display patient characteristics and basic diagnostics for some phenotypes in the OHDSI Library.
- Gap: The Shiny app is not entirely up to date—some newer or modified cohorts may be missing or deprecated.
- Gowtham [Inquiry/Advocacy] explained that the Shiny app gives partial coverage but still helps in evaluating initial patient counts and characteristics.
VA’s Cipher Library
- Gowtham [Advocacy] recommended VA’s Cipher (Centralized Interactive Phenomics Resource) as a more user-friendly front-end to find validated or established OMOP-conformant definitions.
- Implicit Assumption: Cipher includes machine-learning-based phenotypes as well as “rubric-based” definitions (i.e., clear rule-based logic). Study leads must verify they are using rule-based cohorts for community-wide network studies.
Other Resources: Darwin, Sentinel, Forum Posts
- Anna [Advocacy] noted that beyond the official OHDSI Library, other large-scale initiatives (e.g., Darwin EU, Sentinel) publish or store phenotype definitions.
- Implicit Assumption: Accessing these can be challenging if they are not readily packaged for OMOP. Sometimes only code lists (no Atlas JSON) are available.
- Recommended Approach: Identify relevant code sets, then manually adapt them into Atlas cohorts or JSON specification for usage in OHDSI studies.
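The recommended approach of adapting a bare code list into Atlas-usable form can be sketched as wrapping concept IDs in a concept set expression. The JSON shape below mirrors what Atlas exports, but field names can vary across versions and the concept IDs shown are placeholders, so treat this as an assumption to verify against a real Atlas export rather than a definitive format.

```python
# Hedged sketch: wrapping a plain concept-ID list in an Atlas-style concept
# set expression. The field names mirror typical Atlas exports but should be
# checked against a real export; the example IDs are placeholders, not a
# validated code set.
import json

def to_concept_set_expression(concept_ids, include_descendants=True):
    """Build a minimal concept set expression from standard concept IDs."""
    return {
        "items": [
            {
                "concept": {"CONCEPT_ID": cid},
                "isExcluded": False,
                "includeDescendants": include_descendants,
                "includeMapped": False,
            }
            for cid in concept_ids
        ]
    }

# Placeholder concept IDs for illustration only.
expression = to_concept_set_expression([81893, 201606])
print(json.dumps(expression, indent=2))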
Implicit Assumptions & Information Gaps
- Assumption: Contributors know how to adapt external code lists into Atlas format for use in the OHDSI ecosystem.
- Gap: Many validated definitions exist in scattered resources, requiring manual curation or reformatting.
Topic 3: Phenotype Development & Validation Approaches
The workgroup discussed the importance of combining qualitative (expert judgment) and quantitative (cohort diagnostics) approaches to phenotype validation. While no single pass/fail threshold exists, guidelines and tools—Cohort Diagnostics, PheValuator, Keeper—can help users systematically refine definitions. Ultimately, iterative reviews that incorporate stakeholder input are critical to build robust phenotypes suited for multi-site network analyses.
Pass/Fail Rubrics & Decision Making
- Andrew [Inquiry] asked about a systematic way to judge “pass/fail” phenotypes or data quality.
- Gowtham [Advocacy] clarified that while there’s no single objective threshold, qualitative and quantitative criteria from cohort diagnostics guide “fitness for use.”
- Anna [Advocacy] and Gowtham [Advocacy] referenced prior research on creating a workflow rather than a strict numeric threshold, recommending iterative review and stakeholder consensus.
Tools: Cohort Diagnostics, PheValuator, Keeper
- Anna & Gowtham [Advocacy] described multiple validation tools already developed in the OHDSI ecosystem:
- Cohort Diagnostics – Provides descriptive statistics and potentially flags data quality issues.
- PheValuator – Evaluates cohorts using chart review logic or external references for “ground truth.”
- Keeper – Offers (in some contexts) patient-level review with potential GenAI enhancements.
- Information Gap: Not all tools are equally mature; the group must tailor usage to each study’s data availability and needs.
Balancing Qualitative & Quantitative Measures
- Andrew [Advocacy] emphasized a structured, qualitative approach for consistent decision-making.
- Gowtham [Advocacy] explained that human judgment (clinical expertise) combines with data-driven insights (prevalence, incidence, code frequencies) for final determination.
- Action Item: Follow published frameworks (e.g., the “phenotype evaluation” paper) as a reference to unify qualitative impressions with objective metrics.
Importance of Iterative Reviews & Expert Input
- Anna [Advocacy] underscored that peer feedback—including clinicians and analysts—often improves definitions.
- Assumption: Each study lead will incorporate repeated reviews of cohort diagnostics output, revising inclusion/exclusion logic until confident in the final design.
Implicit Assumptions & Information Gaps
- Assumption: Most participants have at least some familiarity with tools like Cohort Diagnostics, or can attend training sessions/office hours if needed.
- Gap: Access to all advanced validation approaches (e.g., chart review data) varies by site.
Topic 4: Discussion on Pediatric Vision Screening Use Case
The group examined how to characterize pediatric vision screening within OMOP. Rather than a disease-centric approach, leaders proposed defining procedures, typical age criteria, and relevant codes. Comparators such as well-child visits or vaccination records can help gauge screening uptake. However, data capture inconsistencies—where only a fraction of screening procedures appear in OMOP—must be documented and addressed during analysis.
Non-Disease Phenotype Strategies
- Michelle [Inquiry] posed questions on how to handle a phenotype that is not strictly a disease—specifically, pediatric vision screening.
- Anna [Advocacy] acknowledged this use case differs from typical condition-based phenotypes, urging Michelle to think about procedures and clinical events rather than diagnoses.
- Implicit Assumption: The process for describing a “non-disease” phenomenon still requires stating clear, clinically relevant rules (e.g., age brackets, procedure codes) in a clinical description.
Role of Comparator Cohorts (Well-Child Visits, Vaccinations)
- Michelle [Inquiry] raised the idea of comparator cohorts—e.g., children who had at least one well-child visit or vaccination—to assess screening rates by comparison.
- Anna [Advocacy] supported including these comparator definitions, explaining they might serve as proxies for overall healthcare engagement.
- Action Item: Identify suitable codes for well-child visits or vaccinations and define inclusion/exclusion logic for a comparator cohort.
Procedure-Based Logic vs. Condition-Based Logic
- The discussion clarified that pediatric vision screening often appears in billing data as a procedure code rather than a diagnosis code.
- Anna [Advocacy] suggested systematically outlining the hallmarks of these screenings (e.g., device usage, frequency, typical patient age).
- Gap: Not all screening methods have standardized or reliably used procedure codes, so some local data may only be partially mapped to OMOP.
Documenting Known Coding Gaps & Uncertainties
- Michelle [Inquiry] noted a mismatch between local data (where screening codes appear inconsistently) and the OMOP-mapped procedures, which capture only ~10% of expected screening events.
- Anna & Gowtham [Advocacy] recommended explicitly documenting these data limitations in the clinical description, clarifying possible underestimation in large-scale network analyses.
- Assumption: Iterative refinement and testing via cohort diagnostics will help gauge the magnitude of missing codes.
Implicit Assumptions & Information Gaps
- Assumption: There are existing CPT4 or other billing codes for vision screening in pediatric populations; however, usage may vary by site.
- Gap: Unclear how best to reconcile partial capture of procedure codes in the data, especially when local EHR data differ from standard claims databases.
Topic 5: Discussion on Ulcerative Colitis
Kevin’s Ulcerative Colitis and Crohn’s Disease phenotypes are nearly ready for the OHDSI Library, having originated in J&J’s internal library. The group underscored the importance of harmonizing existing definitions, ensuring they align with network standards. Kevin is refining an “enhanced treatment pathway” approach that captures drug switching and durations, moving beyond traditional summary visualizations.
Kevin’s IBD Phenotypes
- Kevin [Advocacy] reported that most of his Inflammatory Bowel Disease (IBD) phenotypes have been developed and are ready to be transferred from the J&J Library to the OHDSI Library.
- Action Item: Ensure a clean handoff of final cohort definitions, referencing IDs in the J&J Atlas environment.
Overlaps & Synergy with UC/CD Definitions
- Kevin [Advocacy] noted he has expanded from simply Ulcerative Colitis (UC) to include Crohn’s Disease (CD) definitions.
- The group [Agreement] acknowledged potential overlaps in treatments and code sets for UC and CD, emphasizing consistent definitions to support cross-comparisons.
- Information Gap: Some IBD phenotypes already exist in the OHDSI Library; Kevin plans to reconcile any redundant or conflicting definitions.
“Enhanced” Treatment Pathways
- Kevin [Advocacy] distinguished his approach from standard “pathway” methods. He is interested in capturing switching behaviors, durations, and sequence of therapies, beyond the simple donut plots in typical Atlas treatment pathways.
- Implicit Assumption: Atlas’s built-in treatment pathways may require customization or post-processing to fully capture medication timelines.
Transition from J&J Library to OHDSI Library
- Kevin [Advocacy] and Anna [Inquiry/Advocacy] discussed that while Kevin’s cohorts are mostly finalized in the J&J internal library, they still need official OHDSI Library IDs.
- Action Item: Kevin will list the relevant J&J definitions and submit them for inclusion in the OHDSI Library, or confirm they are already present.
Implicit Assumptions & Information Gaps
- Assumption: Treatment duration and switch metrics can be accurately extracted from observational data. Actual usage patterns may vary by site.
- Gap: Additional clarity is needed on how to handle combination therapies, dose changes, and overlapping treatments in the enhanced pathways.