
OHDSI Phenotype Phebruary 2024 and workgroup updates

OHDSI Phenotype Development and Evaluation Workgroup Meeting Summary

Notes synthesis by GPT

Date: February 23, 2024

The OHDSI (Observational Health Data Sciences and Informatics) Phenotype Development and Evaluation Workgroup convened its second meeting of the month on February 23, 2024. This session was part of the ongoing efforts in the OHDSI community for the Phenotype Phebruary initiative of 2024, led by @Azza_Shoaibi , @aostropolets , and @jweave17 , with support from various members of the OHDSI community.

Purpose of the Meeting:

The primary objective of this meeting was to review the progress made in Phenotype Phebruary, focusing on the development and evaluation of phenotypes across different conditions. The team aimed to catch up on the current status of the project and discuss the necessary actions to fulfill the responsibilities taken up by the participants.

Focus of the Phenotype Phebruary 2024:

  • Conditions Studied: The workgroup has been working on four different medical conditions: Alzheimer’s Disease, Lung Cancer (both non-small and small cell types), Major Depressive Disorder (MDD), and Pulmonary Hypertension (PH).
  • Current Phase: The current week’s focus was on Major Depressive Disorder (MDD). The group had previously finalized work on Alzheimer’s Disease and Lung Cancer, gaining significant insights in both areas.
  • Approach and Methodology:
    • Literature Review: Initially, the work involved reviewing literature and abstracting information from studies conducted in the last four years.
    • Replication of Cohorts: Following the literature review, the team identified cohort definitions presented in the studies for replication.
    • Analysis of Differences: The goal was to understand the differences in incidence rates and patient characteristics among these cohort definitions and determine the extent of variance after adjusting for measurement errors.

Current Status of Major Depressive Disorder (MDD) Study:

  • Replication Phase: The team is in the midst of replicating 20 different cohorts for MDD, as shared by @aostropolets in the literature abstraction form. Individuals have signed up for the replication tasks and for quality control (QC) purposes.
  • Progress Update: Most of the cohorts for replication have been assigned, and participants have been actively working on completing these tasks.

The meeting was set to provide a high-level summary of the progress and discuss the next steps in the Phenotype Phebruary initiative.


Summary of the Case Brought by @SEPTI_MELISA_TMU

During the OHDSI Phenotype Development and Evaluation Workgroup meeting, Septi Melisa presented a case that highlighted a challenge in phenotype replication. The key aspects of the case were:

  1. Source Code Issue: Septi Melisa discovered a source code that was not part of the ICD-10-CM (International Classification of Diseases, Tenth Revision, Clinical Modification). She sought guidance on whether to include this code in their analysis, especially considering its potential impact on the record count.

  2. Analysis of the Code: The discussion revealed that the code in question was ICD-10 (not ICD-10-CM) and was identified as F20.4. This discrepancy suggested that the original paper, which the replication was based on, might have used non-U.S. data sources, as ICD-10-CM does not include this particular code.

  3. Decision on Inclusion: Anna Ostropolets advised that, given the nature of the study and the international versions of ICD-10 (such as those used in China, Korea, Germany, and France), it was reasonable to assume that the researchers intended to include F20.4, which maps to schizophrenic depression in SNOMED (Systematized Nomenclature of Medicine) and also exists in other ICD-10 versions. It was therefore concluded that this code should be included in the concept set for the study.

This case underscores the complexities involved in phenotype replication, especially when dealing with international coding systems and variations in their usage across different countries.
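
The kind of check described in this case can be sketched in a few lines of Python. This is only an illustration: the two code sets below are hand-typed excerpts of the F20 block, and the names `ICD10_WHO_F20`, `ICD10_CM_F20`, `classify_codes`, and `paper_codes` are hypothetical. A real check would look the codes up in the OMOP vocabulary (for example the CONCEPT table) rather than in hard-coded sets.

```python
# Hand-typed excerpts of the F20 block, for illustration only.
# Verify against the actual vocabularies before relying on them.
ICD10_WHO_F20 = {"F20.0", "F20.1", "F20.2", "F20.3", "F20.4",
                 "F20.5", "F20.6", "F20.8", "F20.9"}
ICD10_CM_F20 = {"F20.0", "F20.1", "F20.2", "F20.3", "F20.5",
                "F20.81", "F20.89", "F20.9"}


def classify_codes(codes):
    """Label each code by the vocabulary excerpt(s) it appears in."""
    labels = {}
    for code in codes:
        in_who = code in ICD10_WHO_F20
        in_cm = code in ICD10_CM_F20
        if in_who and in_cm:
            labels[code] = "both"
        elif in_who:
            labels[code] = "ICD-10 only"
        elif in_cm:
            labels[code] = "ICD-10-CM only"
        else:
            labels[code] = "unknown"
    return labels


# Hypothetical code list abstracted from a paper.
paper_codes = ["F20.4", "F20.9", "F20.81"]
result = classify_codes(paper_codes)
for code, label in result.items():
    print(code, "->", label)
```

A code labeled "ICD-10 only" is a hint that the original study may have used non-U.S. data, which is the situation discussed above.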


Summary of Key Ideas Around Concept Set Expressions, Codes, and Clinical Description in OHDSI Phenotype Development

  1. Defining Clinical Ideas Upfront:

    • The meeting emphasized the importance of defining clinical ideas upfront in research papers. Often, papers fail to provide a clear clinical description of the phenotype they are studying. Instead, they use general terms like “depression” without specifying the exact nature of the condition they intend to study.
    • This lack of clarity can lead to challenges in phenotype replication and interpretation. For example, the term “depression” can encompass a broad range of conditions, from major depressive disorder to depressive symptoms in other conditions. Papers should ideally include a sentence or two clearly stating the clinical idea they are investigating and its context within the study.
  2. Context-Dependent Interpretation of Phenotypes:

    • The interpretation and replication of phenotypes are highly context-dependent. It’s crucial to understand the specific condition being focused on and not to generalize or “cherry-pick” codes. The selection of codes should be aligned with the intended clinical idea of the study.
    • For instance, in studies of depression, including a wide range of related codes without a clear rationale can muddy the waters. This includes codes for conditions that might only marginally relate to the primary clinical idea being investigated. A more targeted approach, guided by a clear clinical description, can help in developing more accurate and specific phenotype definitions.
  3. The Need for Intentional Definitions:

    • The discussions highlighted the need for intentional and well-articulated definitions of phenotypes in research. These definitions should explain the purpose and context of the phenotype, making it easier for others to understand and replicate the study. The idea is to provide a clinical description in sufficient detail, outlining the attributes of the phenotype being modeled.

The key takeaway is the emphasis on the clarity and specificity of clinical descriptions in phenotype research. This clarity aids in accurate phenotype replication and interpretation, ensuring that the research is focused, relevant, and easily understandable by others in the field.


Cross posting, original post from @Azza_Shoaibi on OHDSI MS Teams here

PAH Cohort replication
LR abstraction for replication.xlsx

As we are done with data extraction (thank you to everybody who contributed!!), we are moving to replicating the cohort/phenotype definitions in Atlas. We will use the public instance of Atlas (atlas-demo.ohdsi.org).

If you are familiar with Atlas and would like to contribute, please put your name next to as many or as few phenotypes as you want on the sheet attached above (Files → Phenotype Phebruary 2024 → PAH (Week 4) → 2.Cohort replication). You can sign up to replicate a cohort or to QC somebody else’s cohort.

If you are not familiar with Atlas but would like to build cohorts together, reply to this message and we will set up a call for a group to build cohorts together.


Hi @Gowtham_Rao , please include me in the cohort build process!

Notes from Phenotype Phebruary 2024: Pulmonary Hypertension Cohort Diagnostics Review

Synthesized using GPT

@Azza_Shoaibi provided a recap of the Phenotype Phebruary status regarding Pulmonary Arterial Hypertension (PAH) and other topics. The key points were:

  • Phenotype Phebruary Purpose: Highlighted the primary objective of Phenotype Phebruary, focusing on evaluating and improving the consistency and accuracy of phenotype definitions across various health conditions, including Major Depressive Disorder (MDD) and PAH.

  • Findings on Major Depressive Disorder (MDD):

    • Discussed inconsistencies found in phenotype definitions for MDD.
    • Mentioned the replication of 28 cohort definitions and the use of cohort diagnostics tools to analyze these definitions.
    • Emphasized the importance of checking cohort counts and incidence rates across different data sources to identify potential issues in the cohort definitions.
  • Focus on Pulmonary Arterial Hypertension (PAH):

    • Conducted a literature review from 2008 to 2018 to identify phenotype algorithms for PAH in claims data, resulting in the identification of 18 algorithms.
    • Highlighted the variations in codes and medication use in PAH definitions, underscoring the impact of definition choices on research outcomes.
    • Stressed the goal of building upon existing work to fill gaps and improve PAH phenotype algorithms, considering different data sources like registries, EHRs, and claims data.
  • Impact of Phenotyping on Research Estimates:

    • Illustrated how different phenotype definitions could significantly affect incidence rates and study outcomes, using Major Depressive Disorder (MDD) as an example.
    • Emphasized the potential for definition choice to impact estimates by significant factors, highlighting the critical role of precise phenotyping in health outcomes research.
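
The cohort-count check mentioned above (looking for definitions that return zero subjects in every data source) could be sketched as follows. The cohort names, database names, and counts are invented for illustration, and `always_zero` is a hypothetical helper, not part of any OHDSI package.

```python
def always_zero(counts):
    """Return the names of cohorts whose count is zero in every data source."""
    return sorted(
        name
        for name, by_source in counts.items()
        if by_source and all(n == 0 for n in by_source.values())
    )


# Hypothetical counts: cohort name -> {data source -> number of subjects}.
cohort_counts = {
    "mdd_def_01": {"db_a": 12040, "db_b": 880},
    "mdd_def_02": {"db_a": 0, "db_b": 0},
    "mdd_def_03": {"db_a": 0, "db_b": 151},
}

flagged = always_zero(cohort_counts)
print(flagged)
```

A definition that is empty in one source only (like `mdd_def_03` here) is not flagged, since that can reflect a genuine difference between data sources rather than an error in the definition.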

Regarding Major Depressive Disorder (MDD):

  • Inconsistencies in Phenotype Definitions: Highlighted that the previous week’s focus on Major Depressive Disorder (MDD) revealed inconsistencies in phenotype definitions, underscoring the need for thorough evaluation and standardization of these definitions to ensure consistency across studies.

  • Cohort Definitions and Diagnostics: Mentioned the replication of 28 cohort definitions for MDD and the utilization of cohort diagnostics tools to evaluate these definitions. This process involved examining cohort counts across different data sources to identify any definitions that consistently resulted in zero counts, which could indicate issues with the cohort definition that need correction.

  • Incidence Rate Variability: Discussed the variability in incidence rates for MDD across different data sources and cohort definitions. This included a significant range in estimated incidence rates, from as low as 1.9 per 1000 person-years based on one study to as high as 40.21 per 1000 person-years in another study. This variability highlighted the substantial impact that phenotype definition choices can have on research outcomes.

  • The Importance of Definition Choice: Emphasized that the choice of phenotype definition could dramatically affect study estimates, potentially by a factor of 40x in the context of MDD. This underscores the critical role of careful phenotype definition in health outcomes research, comparative safety studies, and quality assessments.
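
As a quick arithmetic check on the two incidence rates quoted above, the fold difference between them can be computed directly. Only those two figures are used; the 40x figure in the notes presumably reflects other comparisons that are not recorded here.

```python
low_rate = 1.9     # per 1000 person-years, lowest estimate quoted in the notes
high_rate = 40.21  # per 1000 person-years, highest estimate quoted in the notes

# Ratio of the highest to the lowest quoted incidence rate.
fold_difference = high_rate / low_rate
print(f"{fold_difference:.1f}x")
```

The two quoted rates differ by roughly 21-fold.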


Regarding Pulmonary Arterial Hypertension (PAH), @Azza_Shoaibi discussed the following points:

  • Literature Review on PAH Phenotype Algorithms: Mentioned a comprehensive literature review conducted to identify phenotype algorithms for PAH from 2008 to 2018. This review focused on algorithms used in claims data, resulting in the identification of 18 different algorithms for PAH.

  • Summary of Identified Algorithms: Described how the identified algorithms were summarized, including the codes used and any other criteria involved in their definitions. This effort aimed to consolidate existing knowledge on PAH phenotyping approaches to inform future work.

  • Consistency with Alzheimer’s Findings: She noted that the variations observed in the PAH algorithms, particularly regarding codes and medication use in the definitions, were consistent with findings from Alzheimer’s disease research. This underscored the broader issue of variability in phenotype definitions across different conditions.

  • Goal to Build Upon and Fill Gaps: Azza Shoaibi stated the intention to build on the existing body of work by filling gaps identified in the literature review. The goal was to refine PAH phenotype algorithms by considering the nuances of different data sources, such as registries, Electronic Health Records (EHRs), and claims data, to improve the selection and identification of PAH patients.

  • Impact of Definitions on Research Estimates: Emphasized that, similar to findings in Major Depressive Disorder (MDD), the choice of phenotype definition for PAH could significantly affect research outcomes. The inconsistency in definitions leads to variability in incidence rates and study findings, highlighting the importance of precise and standardized phenotype definitions in research.


Several individuals participated in the discussion, providing key insights and comments on the topics of Major Depressive Disorder (MDD) and Pulmonary Arterial Hypertension (PAH). Here are the participants and summaries of their contributions:

Roham Zamanian

  • Questions on Algorithms: Asked about the evaluation of multiple algorithms from the literature in the context of PAH and the potential impact of natural language processing on the identification of pulmonary hypertension.
  • Comment on Incidence Rates and Drug Use: Noted that the choice of algorithm for PAH depends on the research question and that the specificity or sensitivity desired might vary based on the study’s requirements. Also mentioned the role of PAH-specific therapeutics and the challenge of distinguishing between doses for pulmonary hypertension versus other conditions, like erectile dysfunction.

Jason

  • Clarification on Incidence Proportions: Inquired about the incidence proportions being 100% across the board in one of the data visualizations and provided insights on idiopathic PAH, suggesting that overall PAH might range from 10 to 20 per million per year.

@Evan_Minty

  • Discussion on Data Interpretation: Shared observations on how specific definitions, such as requiring two PAH therapies, show a gradual increase in incidence over time, which could reflect trends seen in registry studies. Emphasized the importance of considering the epidemiology of right heart failure in relation to pulmonary hypertension and the potential to explore further in OHDSI’s network.

@jweave17

  • Heuristic for Understanding Phenotyping Errors: Proposed a heuristic approach to understanding sensitivity and specificity errors in phenotyping, emphasizing the importance of considering patterns of care and changes over time when developing phenotype definitions.

@Kevin_Haynes

  • Caution on Overestimating Data Sources: Warned against overestimating the completeness of claims data, particularly for medications like epoprostenol, which are not well captured in claims databases due to their administration settings. Highlighted the need to temper expectations regarding the completeness of the available data.
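
The sensitivity and specificity errors raised in this discussion can be made concrete with a small sketch. The confusion-matrix counts below are invented: "strict" stands for a definition that requires more evidence (for example, a therapy requirement), and "broad" for a diagnosis-code-only definition. The `Confusion` class and `metrics` function are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # true cases captured by the definition
    fp: int  # non-cases captured by the definition
    fn: int  # true cases missed by the definition
    tn: int  # non-cases correctly excluded


def metrics(c):
    """Compute sensitivity, specificity, and PPV from confusion counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
    }


# Hypothetical counts in a population of 10,000 with 100 true cases.
strict = Confusion(tp=60, fp=5, fn=40, tn=9895)
broad = Confusion(tp=90, fp=60, fn=10, tn=9840)

for name, c in [("strict", strict), ("broad", broad)]:
    m = metrics(c)
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this made-up example the strict definition trades sensitivity for PPV, which is the kind of trade-off that depends on the research question. Incomplete capture of a therapy in claims data would show up as additional false negatives for any definition that requires that therapy.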

OHDSI Phenotype Development and Evaluation Workgroup meeting March 8th 2024:


Phenotype Phebruary 2024 - Next Steps and Reflections

  • Main Tasks Identified by Anna and Azza:

    1. Publication of Main Findings: Aim to publish the main findings from each condition examined during Phenotype Phebruary 2024. This involves summarizing the outcomes in forum posts to make the findings accessible to the public.
    2. AMIA Manuscript Preparation: A manuscript draft targeted for AMIA submission is underway, necessitating completion within the next week. The current draft includes introductory and methodological sections. However, a comprehensive summary of results, particularly focusing on incidence rate variations and their implications, is pending.
  • Insights on Phenotype Definitions and Measurement Error:

    • Weaver emphasized the necessity of using PheValuator-generated point estimates judiciously, acknowledging the inherent uncertainty in phenotype definition precision. A substantial amount of data generated from Phenotype Phebruary will aid in refining these estimates.
    • Azza highlighted the distinct focus of the AMIA submission, which will cover variation in incidence rates, patient overlap, and distribution based on PheValuator estimates. They also discussed infrastructure considerations, notably the importance of maintaining separate outputs for different diseases to avoid confusion.
  • Call for Community Involvement:

    • Azza sought volunteers from the community to lead papers focusing on specific clinical areas, leading to Ben Hamlin volunteering to spearhead a paper on Major Depressive Disorder (MDD). This initiative aims to delve deeper into the clinical insights and methodologies developed during Phenotype Phebruary.

Overall Reflections on Phenotype Phebruary 2024

  • Joel commended the strategic pivot from merely creating phenotypes to a more analytical and reflective approach to phenotype definition and evaluation.
  • Azza and Thamir shared observations on the variability in code selection across different vocabularies (e.g., ICD-9 vs. ICD-10) and the potential implications for study accuracy and consistency. They noted an apparent disconnect between clinical practice and informatics, underscoring the need for better collaboration and knowledge exchange.
  • Discussion highlighted the impact of vocabulary drift and external factors (e.g., quality initiatives) on data documentation and the importance of incorporating these variables into phenotype development and evaluation.

Objectives and Key Results (OKR) for 2024

  • The meeting included a discussion and ratification of the OKR for 2024, emphasizing continued progress and collaboration within the OHDSI community.

Initiative by Chris Mecoli

  • Chris introduced an initiative focusing on autoimmune diseases, starting with myositis. Leveraging the OHDSI tools and network, the goal is to conduct research that bridges the gap between biomedical informatics and clinical practice. The initiative aims to publish findings in clinical journals to raise awareness of OHDSI’s capabilities in the autoimmune disease space.

Concluding Remarks

The meeting underscored the importance of comprehensive evaluation and publication of findings, the necessity of collaboration across disciplines, and the potential of OHDSI tools to contribute significantly to clinical research. The Phenotype Phebruary initiative, alongside the ongoing projects discussed, represents a concerted effort to advance both methodological rigor and clinical relevance in health research.


Checklist of tasks and objectives identified during the discussion for Phenotype Phebruary 2024 and related initiatives:

For Phenotype Phebruary 2024

  • [ ] Publish Main Findings:

    • [ ] Summarize the main findings from each condition studied.
    • [ ] Create and post forum entries to disseminate the findings publicly.
  • [ ] Complete AMIA Manuscript Draft:

    • [ ] Finalize the summary of results, focusing on incidence rate variations and other key outcomes.
    • [ ] Ensure the draft includes a comprehensive introduction and methods section.
    • [ ] Address the missing paragraph in the introduction that reviews prior papers on the topic.
  • [ ] Community Engagement and Collaboration:

    • [ ] Make an official forum post inviting collaboration on the AMIA manuscript and other related projects.
    • [ ] Encourage community members to contribute to the draft located in the publication folder.
  • [ ] Infrastructure and Data Analysis:

    • [ ] Decide on maintaining separate outputs for different diseases to ensure clarity and accuracy in analysis.
  • [ ] Volunteer Coordination:

    • [ ] Ben Hamlin to lead the paper focusing on Major Depressive Disorder (MDD).
    • [ ] Determine and communicate a timeline for the MDD-focused paper.

For OKR 2024

  • [ ] Ratify and Implement OKR for 2024:
    • [ ] Confirm the objectives and key results outlined for the year.
    • [ ] Begin executing tasks to achieve these objectives.

Chris Mecoli’s Initiative on Autoimmune Diseases

  • [ ] Research on Myositis:

    • [ ] Complete phenotype development and evaluation for myositis.
    • [ ] Conduct a network study by running developed phenotypes across different CDMs in the OHDSI network.
  • [ ] Utilize OHDSI Tools:

    • [ ] Apply OHDSI tools like PheValuator and CohortDiagnostics to analyze myositis phenotypes.
  • [ ] Publish Research Findings:

    • [ ] Draft and submit findings for publication in clinical journals, highlighting the collaboration between biomedical informatics and clinical practice.
    • [ ] Raise awareness of OHDSI and its tools within the autoimmune disease research community.

General Tasks

  • [ ] Enhance Collaboration Between Clinics and Informatics:

    • [ ] Identify opportunities for increased interaction and knowledge exchange between clinicians and the biomedical informatics community.
  • [ ] Reflect on Phenotype Phebruary 2024:

    • [ ] Gather feedback and reflections on the initiative, focusing on literature review insights, code selection, and the impact of vocabulary drift.
  • [ ] Consider Expanding Research Focus:

    • [ ] Explore the possibility of extending research to other autoimmune diseases beyond myositis, applying the lessons learned and methodologies developed.

This checklist is designed to guide the teams and individuals involved in the Phenotype Phebruary 2024 initiative and related projects towards completing their identified tasks and objectives.

Friends:

As a part of the Phenotype Phebruary we are making a paper submission to the AMIA Annual Symposium! We’ve had a lot of great work from our collaborators. If you contributed to the work and would like to be a co-author, please review the paper here (also available by navigating through Teams as below), fill the authorship form at the top with your info and contribution and provide your comments by Monday March 18th noon EST so that we can submit it to AMIA by Mon EOD.

And, of course, thank you for your contribution - it has been a wonderful collaboration! :slight_smile:


March 22nd meeting

After an incredibly busy February, this workgroup is taking a break for one week. Happy spring break, everyone. Looking forward to collaborating.

Meeting Minutes - OHDSI Work Group Call

autogenerated using GPT

Date: April 12, 2024
Time: Starting at 00:00
Location: Virtual (MS Teams)
Attendees: Gowtham Rao, Jamie Weaver, Azza Shoaibi, Ben Hamlin, Anna Ostropolets, and others


Updates and Discussions

1. Objective: Enhance the Science of Phenotyping and Best Practices

  • Publication of OHDSI Phenotype Library Paper (Q2 2024)
    • Lead: Gowtham Rao, Juan Banda
    • Status: Draft in advanced stage, targeting completion in Q2 2024. Currently referenced by other studies and previously presented at AMIA 2023.

2. Network Study on Measurement Error Incorporation

  • Lead: Jamie Weaver
  • Objective: Incorporate measurement error in the interpretation/correction of background incidence rate estimates by Q4 2024.
  • Comment: Discussion on age and colorectal cancer cohort stratification highlighted. Potential collaboration between Ben and Jamie discussed.

3. Colorectal Cancer Screening Phenotype

  • Lead: Ben Hamlin
  • Updates on age-stratified, race-stratified, and ethnicity-stratified screening research.
  • Mentioned as first quality measure in the phenotype library, under review by NCQA.

4. Discussion on Source Errors

  • Presented by Hayden Spence and Jamie Weaver.
  • Analysis of errors attributable to different data sources, with particular attention to population composition differences and coding intents.

5. Phenotype Phebruary 2024

  • Lead: Azza Shoaibi
  • Discussion on finalizing the activities and publications resulting from the event.
  • Mention of a draft manuscript that is open for feedback within the OHDSI community.

Action Items

  • Paper Drafts and Reviews:

    • Encourage collaboration and feedback on various drafts, particularly the OHDSI Phenotype Library paper.
    • Set up discussion between Ben and Jamie to further delve into age stratification in the colorectal cancer cohort.
  • Phenotype Phebruary 2024 Outcomes:

    • Publish main findings and prepare a manuscript for submission.
    • Forum post planned to invite wider community collaboration.
  • Next Meeting:

    • Scheduled for April, fourth week. Main agenda item to focus on library paper completion.

Next Steps:

  • Continue to solicit and incorporate feedback on the OHDSI Phenotype Library paper.
  • Advance discussion on incorporating measurement error in study designs.
  • Prepare for upcoming presentations and manuscript submissions.

Closing Remarks:

  • Gowtham Rao thanked participants for their contributions and outlined the preparation for the next meeting, emphasizing ongoing collaboration and community engagement.

End of Call: 00:47:55

During the meeting, Ben Hamlin discussed his involvement in the development of a colorectal cancer screening phenotype aimed for submission to the OHDSI Phenotype Library. He mentioned the phenotype is age-stratified, race-stratified, and ethnicity-stratified, with an initial intent to do screening disparities research.

Ben Hamlin highlighted that this phenotype is the first quality measure to be reviewed by the National Committee for Quality Assurance (NCQA). He expressed interest in potential collaborations to brainstorm and discuss further about age stratification and other dimensions such as sex and payer stratification, which he has previously utilized in a study related to clinical bias and disparities. This approach uses the Atlas tool for HEDIS (Healthcare Effectiveness Data and Information Set), indicating a comprehensive method to address clinical variations and potential errors in healthcare data.

Hayden Spence:

  • Raised a query about how much error is attributable to different data sources.
  • Emphasized the importance of understanding the variations in errors across different population compositions, data source content, and coding intents.

Jamie Weaver:

  • Acknowledged that there are multiple sources of errors which need to be addressed in studies, especially emphasizing the diversity in population composition and differences in data source coding practices.
  • Highlighted the importance of incorporating measurement errors into the interpretation and correction processes to enhance the accuracy of study results.
  • Discussed the methodologies of incorporating these errors, mentioning the potential impact of these errors on the results and the importance of standardizing data corrections, such as by age and sex, to reduce heterogeneity.
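
One simple way to picture "incorporating measurement error" into an incidence rate is sketched below. This is not necessarily the method the workgroup uses: it assumes that the true number of cases can be approximated as observed cases × PPV / sensitivity, leaves person-time unchanged, and ignores the uncertainty in the error estimates. All numbers are invented, and `corrected_rate_per_1000` is a hypothetical helper.

```python
def corrected_rate_per_1000(observed_cases, person_years, ppv, sensitivity):
    """Adjust an observed incidence rate for phenotype measurement error.

    Assumes true cases ~= observed cases * PPV / sensitivity and that
    person-time is unaffected (a simplification).
    """
    if not (0 < ppv <= 1 and 0 < sensitivity <= 1):
        raise ValueError("ppv and sensitivity must be in (0, 1]")
    estimated_true_cases = observed_cases * ppv / sensitivity
    return estimated_true_cases / person_years * 1000


# Hypothetical inputs: 500 observed cases over 100,000 person-years,
# with error estimates of the kind a tool such as PheValuator reports.
observed = 500 / 100_000 * 1000
corrected = corrected_rate_per_1000(500, 100_000, ppv=0.8, sensitivity=0.5)
print(f"observed:  {observed:.1f} per 1000 person-years")
print(f"corrected: {corrected:.1f} per 1000 person-years")
```

With these made-up inputs the corrected rate is higher than the observed one because the assumed loss from low sensitivity outweighs the false positives removed by the PPV adjustment. The same correction could be applied within age and sex strata before standardizing.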

Action items

  • Finalize and publish the OHDSI Phenotype Library paper:

    • Collaborate with Juan Banda to finalize the draft.
    • Gather and incorporate feedback from collaborators.
  • Complete a network study on measurement error:

    • Lead: Jamie Weaver.
    • Discuss with Ben about age stratification.
    • Incorporate measurement error into background incidence rate estimates by Q4 2024.
  • Phenotype Phebruary 2024 wrap-up:

    • Publish main findings from the event.
    • Draft and submit the manuscript based on findings.
    • Post on forums to invite broader community collaboration.
  • Review and feedback on paper drafts:

    • Solicit feedback on various drafts, particularly focusing on the Phenotype Library paper.
    • Review and finalize the draft for submission by the end of Q2 2024.
  • Prepare for upcoming meeting in April (fourth week):

    • Agenda focused on the completion and review of the library paper.
    • Ensure all stakeholders are prepared for discussion and feedback.

Meeting Minutes Summary

Date: 4th Friday (April 26th 2024)

1. Enhance the Science of Phenotyping and Best Practices

  • Paper on OHDSI Phenotype Library: Targeting a pre-print by November 2024 to aid dissemination at AMIA 2024. Responsible: Gowtham Rao and Juan Banda.
  • Network Study on Measurement Error: Discussion on incorporating measurement error in studies, led by Jamie Weaver. A collaboration with Ben Hamlin on colorectal cancer screening and heterogeneity considerations is planned. No update this meeting.

2. Community Engagement and Educational Outreach

  • Phenotype Phebruary 2024: achieved.
  • Atlas Demo Training: achieved.
  • New Submission Flow: no new update

3. Maintenance and Development of Tools

  • Phenotype Library Updates: no activity.
  • Submission Tool for Community Contributions: no activity.
  • Objective Failure Criteria for Cohort Definitions: no activity.

4. Collaboration and Contribution

  • OHDSI Symposium: Need to discuss contributions and workgroup meeting planning for the upcoming OHDSI 2024 Global Symposium.
  • VA Collaboration: Plans to make the OHDSI Phenotype Library accessible for VA researchers. Updates expected in upcoming May meeting.

5. Action Items

  • Phenotype Library Paper: Focus time required from Gowtham Rao and Juan Banda to make progress.
  • Community Contribution Mechanism: Need to establish efficient and regular contributions to the OHDSI Phenotype Library, particularly from ongoing projects like those mentioned by Azza Shoaibi.

Discussions Related to Darwin

Updates on Darwin’s phenotype Work and Contributions

  • Sharing and Capturing Work: Darwin has been actively working on phenotyping. There was prior discussion on capturing and integrating Darwin’s phenotype work into the broader OHDSI community’s Phenotype Library.
  • Collection from Study Repositories: There’s an idea to collect and integrate phenotype work from Darwin’s study repositories.
  • OHDSI 2024 Global Symposium Tutorial: A tutorial by Darwin is scheduled for the OHDSI 2024 Global Symposium. This event is anticipated to focus on Darwin’s team development, particularly in terms of R packages and tools, providing an educational opportunity for community members to learn directly from Darwin’s experiences and methodologies.
  • Focus on Phenotype Curation: The tutorial will likely address aspects of phenotype curation and Darwin’s internal phenotype library.
  • Reference in Paper: Darwin’s methodologies and contributions have been referenced in a recent paper, which discusses the process of curation.
  • Strategy for Library Contributions: There is a dialogue about how best to ensure that Darwin’s contributions are consistently integrated into the OHDSI Phenotype Library. The complexity of their governance may pose challenges, but there is a clear intention to streamline this process to avoid missing out on valuable contributions.
  • Tutorial as a Touchpoint: The upcoming symposium tutorial is seen as a critical touchpoint for further discussions on integrating Darwin’s contributions with broader community efforts. It represents a strategic opportunity to align Darwin’s work with community standards and needs.

These discussions highlight the ongoing integration of Darwin’s work into the OHDSI community, emphasizing the importance of capturing and leveraging their contributions effectively. The tutorial at the symposium will likely serve as a pivotal event to deepen understanding of Darwin’s methodologies and enhance collaborative efforts.


Discussions Related to the Department of Veterans Affairs (VA)

Collaboration on Phenotype Library Access

  • Objective: Enable VA researchers to access and utilize the OHDSI Phenotype Library.
  • Process Development: Plans to create a streamlined process for integrating OHDSI phenotype definitions into VA research workflows.

Future Engagements

  • Upcoming Presentation: The VA is scheduled to present updates on this collaboration at an upcoming meeting in May, focusing on integration progress and experiences.

These discussions highlight the efforts to enhance VA’s research capabilities through access to OHDSI’s resources, aiming for a productive collaboration.


Discussions Related to N3C (National COVID Cohort Collaborative)

Contribution to OHDSI Phenotype Library

  • Previous Discussions: There have been ongoing discussions about N3C contributing their code lists and concept set lists to the OHDSI Phenotype Library. These talks aimed to integrate N3C’s extensive COVID-19 related work into OHDSI’s broader phenotyping efforts.

Action Items and Progress

  • Action Item Follow-Up: N3C agreed to provide a spreadsheet containing items suitable for contribution to the OHDSI library. This step was identified as crucial for moving forward with the integration of N3C’s resources into OHDSI’s framework.

Current Status and Future Actions

  • Need for Follow-Up: There seems to be a need for a follow-up with N3C to confirm the delivery and integration of the spreadsheet into OHDSI’s systems. It’s unclear if N3C has yet submitted this information or if further encouragement is required to expedite the process.

Meeting Agenda 6/14/2024 - 9am EST

  1. Update from @Christopher_Mecoli (https://github.com/ohdsi-studies/MyositisNetworkStudy/tree/master/documents and the forum thread “Network Study: Seeking Data Partners in Rheumatology”) regarding IRB approval and the R package (if tested at a local site). We will then spend time reviewing the package.
  2. Discuss other topics as time permits.

Generated using GPT


Update on Dermatomyositis Project Led by Dr. Christopher Mecoli

Today, we focused on updates regarding Dr. Christopher Mecoli’s Dermatomyositis project. Find the latest protocol version here and see the IRB approval status and local testing updates in this forum discussion.

Executive Summary:

  • Study Leadership: Dr. Christopher Mecoli leads the project, which aims to evaluate and validate various Dermatomyositis (DM) phenotypes across multiple OMOP databases.
  • IRB Approval and Local Testing: Awaiting IRB approval, plans are in place for local testing of the R package at participating sites. This will include a review phase to refine the package based on feedback.
  • Data Sources: The study leverages a multinational cohort from the Johns Hopkins Myositis Center and other databases, involving about 1,500 patients.
  • Methodology: The methodology combines manual chart reviews at Johns Hopkins to measure algorithm performance and uses the PheValuator to estimate performance across databases.
  • Technical Discussions: Will Kelly addressed challenges in data integration and emphasized maintaining robust scripts across different database configurations. Cohort definitions are managed using the Web API to ensure data security.
  • Feedback and Iterative Improvements: The study welcomes ongoing feedback to enhance execution, with technical adjustments planned based on collaborative input.
  • Next Steps:
    • Further running of the study script in varied environments to verify adequate sample sizes.
    • Future community work group calls are planned to discuss technical issues and progress, including a presentation by the Department of Veterans Affairs.
    • Continued collaboration and communication among study partners to maintain high standards of data quality and research integrity.

In-Depth Discussion by Dr. Christopher Mecoli:

Background and Objectives:

  • Disease Focus: The study targets Dermatomyositis, a rare autoimmune condition.
  • Research Challenges: DM’s rarity makes robust studies difficult, particularly at single centers.
  • Leveraging the OHDSI Network: Utilizing the OHDSI tools, the study evaluates DM phenotypes to improve research reliability across real-world data sources.

Methods:

  • Data Utilization: Uses data mapped to the OMOP Common Data Model from various sources.
  • Algorithm Evaluation:
    • Johns Hopkins Cohort: Gold-standard manual chart reviews confirm DM diagnoses.
    • Broader Evaluation: PheValuator assesses phenotype performance across multiple databases without direct patient chart access.

Study Execution:

  • Federated Analysis: Sites run analyses locally, sharing only aggregate results, ensuring data confidentiality and governance compliance.

Additional Points from the Technical Discussion:

Will Kelly discussed the technical execution of the study script and the importance of data privacy and appropriate configurations across databases. He also addressed the integration challenges and the use of tools like R and Web APIs in managing and analyzing data, underlining the necessity of adaptability and feedback incorporation into the study’s methodologies.


Dr. Christopher Mecoli

Dr. Christopher Mecoli discussed a study on dermatomyositis (DM) in the meeting, emphasizing its rationale, objectives, and methods. Here’s a summary based on his discussion and additional content from the provided document:

Background and Rationale:

  • Disease Focus: The study focuses on dermatomyositis, a rare chronic autoimmune disease affecting muscles and skin, leading to significant morbidity and mortality.
  • Research Challenges: Due to its rarity, studies on DM often lack sufficient power for causal inference, especially those conducted within single centers.
  • Current Limitations: Traditionally, DM algorithms for identifying patients have been limited to single data sources, affecting the generalizability and reproducibility of results.
  • Leveraging OHDSI Network: By using the Observational Health Data Sciences and Informatics (OHDSI) network and its tools, the study aims to evaluate DM phenotypes across multiple real-world data sources to enhance the reliability and applicability of research findings.

Objectives:

  • Primary Objective: To evaluate and validate various DM phenotypes across different OMOP databases.
  • Secondary Objectives: Raise awareness of OHDSI and the OMOP model within the clinical rheumatology community and demonstrate the potential for conducting large-scale network studies on rare diseases like DM.

Methods:

  • Data Sources and Study Design: The study will utilize multinational cohort data mapped to the OMOP Common Data Model, sourced from electronic health records, insurance claims, and registries.
  • Algorithm Evaluation:
    • Johns Hopkins Cohort: Perform gold-standard manual chart reviews on the Johns Hopkins Myositis Center cohort to assess algorithm performance, including sensitivity, specificity, and predictive values.
    • Broader Evaluation: Use PheValuator, a probabilistic tool, to estimate performance across multiple databases, allowing assessment of phenotypes without access to patient charts directly.

Phenotypes of Interest:

  • A variety of DM phenotypes have been developed, utilizing OHDSI’s ATLAS tool. These phenotypes are carefully constructed based on existing literature and refined to ensure accurate representation and diagnosis of DM.

Study Execution and Data Handling:

  • Federated Analysis: Sites will run the study analysis package locally, with only aggregate results shared, ensuring patient data confidentiality and compliance with local governance standards.
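
The federated pattern described above can be sketched roughly as follows. This is an illustration in Python, not the study package (which is written in R). The site names, patient identifiers, and the small-cell suppression threshold are all hypothetical assumptions; the meeting notes do not say whether or how small counts are suppressed.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold: small-cell suppression is an assumption of this sketch.
MIN_CELL_COUNT = 5


@dataclass
class SiteResult:
    """Aggregate-only result a site would share; it holds no patient rows."""
    site: str
    phenotype: str
    cohort_count: Optional[int]  # None means the count was suppressed


def summarize_locally(site: str, phenotype: str, patient_ids: List[int]) -> SiteResult:
    """Runs at the site: reduce patient-level data to a single distinct count."""
    count = len(set(patient_ids))
    # Withhold small counts so a rare-disease cell cannot point to individuals.
    shared = count if count >= MIN_CELL_COUNT else None
    return SiteResult(site, phenotype, shared)


def pool(results: List[SiteResult]) -> int:
    """Runs at the coordinating center: sum the counts that were shared."""
    return sum(r.cohort_count for r in results if r.cohort_count is not None)


# Made-up identifiers for two hypothetical sites.
site_a = summarize_locally("site_a", "dm_phenotype_1", [101, 102, 103, 104, 105, 106])
site_b = summarize_locally("site_b", "dm_phenotype_1", [7, 8, 8])
pooled = pool([site_a, site_b])
```

Only `SiteResult` objects would leave a site in this sketch; the patient identifier lists never do.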

Strengths and Limitations:

  • Strengths: Utilizing a network such as OHDSI and the standardized OMOP CDM improves the study’s reach and applicability, making findings more robust and generalizable.
  • Limitations: There are inherent challenges in using EHR data for DM research, including potential inaccuracies in disease onset dates and variability in how data is converted to the OMOP model.

This study is positioned as a significant step forward in using real-world data for researching rare diseases, potentially setting a precedent for future research in rheumatology and other fields.


Dr. Christopher Mecoli described using a variety of data sources for the study on dermatomyositis, specifically focusing on large-scale, real-world data. Here are the specifics mentioned about the data sources:

  1. Johns Hopkins Myositis Center Cohort:
    • Registry Data: This cohort includes data from the Johns Hopkins Myositis Center, which has been systematically collected and managed within a registry.
    • Number of Patients: Approximately 1500 patients are included in this cohort.
    • Data Details: The registry data from 2016 onward has been converted to the Observational Medical Outcomes Partnership (OMOP) common data model. Systematic and detailed chart reviews have been conducted to confirm DM diagnoses, with both symptom onset and diagnosis dates recorded.

In the discussion, Dr. Christopher Mecoli and other participants elaborated on the process of conducting manual chart reviews, particularly focusing on distinguishing true positive and false positive diagnoses of dermatomyositis within their study cohort. Here are the key points from that discussion:

  1. Gold Standard Chart Review:

    • Purpose: The chart review at Johns Hopkins Myositis Center is used as a gold standard to validate the accuracy of dermatomyositis diagnosis and to evaluate the performance of various DM algorithms developed for the study.
    • Details: The chart review encompasses systematic evaluations of clinical records to confirm DM diagnoses based on validated classification criteria (e.g., ACR/EULAR 2017).
  2. True Positives:

    • Definition: Patients who are confirmed to have dermatomyositis based on the chart review and meet the diagnostic criteria accurately.
    • Data Utility: These cases help ascertain the sensitivity and positive predictive value of the diagnostic algorithms being tested.
  3. False Positives:

    • Definition: Individuals who were initially suspected to have dermatomyositis or incorrectly diagnosed but were determined not to have the condition upon further review.
    • Concerns Discussed: Counting false positives is critical for estimating the specificity and positive predictive value of the algorithms, and for refining them to reduce misclassification errors.
    • Registry Data: Dr. Mecoli mentioned that they keep track of individuals who were suspected of having dermatomyositis but were confirmed not to have it through further diagnostic processes. This data is essential for assessing the rate of false positives.
  4. Challenges in Reviewing Non-Cases:

    • Difficulty in Manual Adjudication: One of the challenges highlighted was the lack of systematic chart reviews on non-cases, which are needed to evaluate specificity comprehensively and to identify false negatives.
    • Resource Intensity: Conducting manual chart reviews on a large scale, especially involving non-cases, is resource-intensive and often not feasible in many studies.
  5. Implications for Future Research:

    • Enhancing Algorithm Accuracy: By identifying true and false positives, the research team can refine the diagnostic algorithms to improve their accuracy, making them more reliable for broader application across various healthcare datasets.
    • Understanding Diagnostic Challenges: Discussions around true and false positives help illuminate the complexities and nuances in diagnosing dermatomyositis, which can guide future clinical and research strategies.

The discussion underscores the importance of meticulous chart reviews in validating disease algorithms and highlights the challenges and considerations in accurately identifying and documenting disease presence or absence in research studies.
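
As a rough illustration of how the chart-review results feed the performance metrics named above, the sketch below computes sensitivity, specificity, PPV, and NPV from confusion-matrix counts. The counts are invented for the example and are not study results. When non-cases have not been reviewed, the false negative and true negative counts are unknown, so the sketch returns `None` for every metric that depends on them.

```python
def _ratio(numerator, *others):
    """Return numerator / (numerator + sum(others)), or None if a count is unknown."""
    if numerator is None or any(o is None for o in others):
        return None
    denominator = numerator + sum(others)
    return numerator / denominator if denominator else None


def performance(tp, fp, fn=None, tn=None):
    """Standard 2x2 metrics; fn and tn may be None when non-cases were not reviewed."""
    return {
        "sensitivity": _ratio(tp, fn),  # TP / (TP + FN)
        "specificity": _ratio(tn, fp),  # TN / (TN + FP)
        "ppv": _ratio(tp, fp),          # TP / (TP + FP)
        "npv": _ratio(tn, fn),          # TN / (TN + FN)
    }


# Hypothetical counts with both cases and non-cases adjudicated.
full = performance(tp=80, fp=20, fn=10, tn=90)

# Hypothetical counts when only algorithm-positive patients were reviewed.
cases_only = performance(tp=80, fp=20)
```

With only algorithm-positive charts reviewed, PPV is the one metric that can be computed directly, which is why the lack of systematic review of non-cases matters.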


During the meeting, Will Kelly contributed several technical insights and clarifications regarding the study’s implementation and data management. Here are the key points discussed by Will Kelly:

  1. Technical Execution:

    • Script Execution: Kelly discussed the execution of the study script, emphasizing the importance of ensuring that the setup is configured correctly by the sites before running the analysis.
    • Data Handling: He explained how the data should be handled, particularly emphasizing the need for sites to run the analysis locally while ensuring data privacy and governance compliance.
  2. Database Configuration and Issues:

    • Database Diversity: Kelly noted the variety of database configurations across different sites and how they might impact the study’s execution, stressing the need for flexibility in handling diverse data structures.
    • SQL Server Configurations: He mentioned the specific configurations used at Johns Hopkins and how they might differ from other sites, potentially requiring adjustments.
  3. Challenges in Data Integration:

    • Integration of Data: Kelly discussed challenges related to integrating data from different sources, particularly when mapping them to the OMOP common data model. He highlighted the importance of consistency in data handling to ensure the reliability of study results.
  4. Software and Tools:

    • Use of R and Web API: Kelly discussed the use of R programming and web APIs for data analysis and mentioned some of the tools used for managing and executing the study protocol.
    • OHDSI Tools: He elaborated on the use of OHDSI tools, like ATLAS, for creating and managing phenotypes, which are crucial for the study.
  5. Future Considerations and Improvements:

    • Feedback on Script Improvements: Kelly was open to feedback regarding the scripts and methods used in the study, indicating a willingness to make necessary adjustments based on the collaborators’ input to enhance the study’s effectiveness.
  6. Addressing Technical Queries:

    • Clarifications Provided: Throughout the discussion, Kelly responded to technical queries from other participants, providing clarifications on how certain aspects of the study’s technical setup were handled.

During the meeting, there was a detailed and technical discussion among Will Kelly, James Weaver, Joel Swerdel, and Gowtham Rao concerning various aspects of the study’s execution, the tools used, and data management practices. Here’s a breakdown of their conversation:

  1. Discussion on Cohort Definitions and Web API:

    • Weaver raised concerns about cohort definitions and their management using Web API, emphasizing the need to ensure consistency across sites to avoid discrepancies in study results.
    • Kelly responded by discussing the technical handling of cohort definitions, including how they were managed and accessed via the Web API. He acknowledged the need for clarity and control in managing these definitions to prevent accidental modifications that could affect study results.
  2. Use of RDA Files and JSON Objects:

    • Weaver and Joel Swerdel inquired about the use of RDA files and the possibility of using JSON objects instead for more transparent and manageable data handling.
    • Kelly discussed the pros and cons of using serialized R objects versus JSON, explaining the technical reasons for their choices and how they could consider transitioning to more transparent data formats if it proved necessary for the study’s integrity.
  3. Technical Challenges and Solutions:

    • Kelly detailed some of the technical challenges they faced, particularly relating to the script’s execution across different data environments, and how they intended to address these challenges through script modifications and rigorous testing.
    • Gowtham Rao emphasized the need to ensure the study script was robust and reliable across various data settings, urging Kelly to incorporate any necessary changes to improve script performance and reliability.
  4. Security and Data Sharing Concerns:

    • Joel Swerdel discussed concerns related to data security, particularly the mechanisms through which data were shared and the precautions needed to ensure data privacy and compliance with regulations.
    • Kelly reassured the group about the steps taken to ensure data security, such as using secure methods for data transmission and ensuring that all data sharing complied with relevant guidelines and regulations.
  5. Feedback and Future Improvements:

    • Gowtham Rao suggested that the team remain open to feedback on the study’s technical processes and consider any modifications proposed by study collaborators to enhance the study’s effectiveness and efficiency.
    • Kelly was receptive to feedback and expressed a commitment to iterative improvements of the study’s technical infrastructure based on collaborative input and the evolving needs of the study.

Overall, the discussion highlighted the collaborative effort to address technical and operational challenges in the study, ensuring that the data handling and analysis procedures were not only efficient but also secure and compliant with best practices and regulations.
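
One way to picture the JSON option raised in the discussion is the sketch below, written in Python for illustration (the study package is in R and its actual storage format is not reproduced here). It stores a cohort definition as canonical JSON and records a SHA-256 fingerprint so an accidental edit can be detected before a site runs the analysis. The definition fields and concept IDs are hypothetical placeholders, far simpler than a real ATLAS export.

```python
import hashlib
import json


def canonical_json(definition: dict) -> str:
    """Serialize with sorted keys and fixed separators so equal content gives equal text."""
    return json.dumps(definition, sort_keys=True, separators=(",", ":"))


def fingerprint(definition: dict) -> str:
    """SHA-256 hex digest of the canonical JSON text."""
    return hashlib.sha256(canonical_json(definition).encode("utf-8")).hexdigest()


# Hypothetical, heavily simplified stand-in for a cohort definition.
cohort = {
    "name": "DM phenotype (illustrative)",
    "conceptIds": [111, 222],  # placeholder IDs, not real OMOP concepts
    "washoutDays": 365,
}
expected = fingerprint(cohort)

# A round trip through JSON text leaves the fingerprint unchanged.
reloaded = json.loads(canonical_json(cohort))
unchanged = fingerprint(reloaded) == expected

# Any edit to the definition changes the fingerprint.
edited = dict(reloaded, washoutDays=0)
modification_detected = fingerprint(edited) != expected
```

Unlike a serialized R object, the JSON text can be read and diffed directly, which is the transparency benefit raised in the discussion.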


The discussion around next steps in the meeting focused on several action items to ensure the study’s progress and address the technical challenges highlighted during the session. Here are the key next steps that were outlined:

  1. Running the Study Script in Different Environments:

    • Evan Minty mentioned the intention to run the study script at Stanford and other participating sites once more data was gathered on the number of available cases, to ensure sufficient data for meaningful analysis.
    • Gowtham Rao suggested that both Johnson & Johnson (J&J) and Stanford should asynchronously run the script using the public ATLAS to verify cohort counts. This would help confirm if there is an adequate sample size to proceed with the planned analyses.
  2. Feedback on Script and Package Adjustments:

    • Will Kelly was tasked with incorporating feedback and making necessary adjustments to the study package to address the technical issues discussed, such as managing cohort definitions securely and ensuring the script’s robustness across different data environments.
    • James Weaver proposed running the script in J&J’s environment to identify any potential issues, with the goal of sharing findings and recommendations for refining the script.
  3. Review and Update of Technical Processes:

    • Kelly and other technical team members were to review and possibly revise the data handling and analysis processes, especially the integration and management of cohort definitions using Web API and other tools.
  4. Addressing IRB and Ethical Considerations:

    • Christopher Mecoli mentioned that IRB approvals were pending and necessary for proceeding with certain aspects of the study. Updates on the IRB approval process were to be shared in subsequent meetings.
  5. Future Meetings and Discussions:

    • Gowtham Rao suggested scheduling future community work group calls to continue discussions on technical issues and study progress. He mentioned setting aside time in the next meeting for a presentation by the Department of Veterans Affairs as well as continuing to focus on the ongoing study.
  6. Collaboration and Communication:

    • Kelly and Gowtham Rao emphasized the importance of ongoing collaboration and open communication among all study partners to address any emerging challenges and to ensure the study adheres to high standards of data quality and research integrity.

These steps reflect a comprehensive plan to move forward with the study while addressing the complexities of working with real-world data across multiple sites and ensuring the technical infrastructure supports the study’s objectives effectively.


Summary and Agenda for the OHDSI Phenotype Development and Evaluation Workgroup Meeting on 6/28/2024

Introduction

  • Today’s meeting will focus on the VA CIPHER, the Centralized Interactive Phenomics Resource.
  • To learn more about VA CIPHER, please refer to our Cyberseminar: Centralized Interactive Phenomics Resource (CIPHER): Overview and Demonstration of the VA Phenomics Library.

About CIPHER

  • CIPHER improves upon existing phenomics library models to advance innovation in clinical research.
  • The CIPHER phenotype collection standard is an adaptable metadata collection method that enables reproducibility of EHR-based phenotypes.
  • This standard provides both detailed information on the phenotype algorithm and a high-level picture of the development and validation process.
  • Its framework includes standard vocabularies, enabling the interoperability of the phenotype knowledgebase across various healthcare systems.

Agenda

  1. Overview of CIPHER phenotype library (5 min)
  2. Review proposed pilot for integration of OHDSI Phenotype Library definitions in CIPHER (5 min)
  3. Discuss proposed integration (5 min)
  4. Open discussion on the idea (15 min)
  5. Further discussion and closing remarks (15 min)

Note

  • The proposal to copy the content of the OHDSI resource into the VA knowledgebase is straightforward as all the content in the OHDSI Phenotype library is already public and freely accessible. No permission is needed as the resource has a permissive Apache License 2.0.

Generated using GPT

Here’s a detailed summary of the OHDSI Phenotype Development and Evaluation workgroup meeting, correlating the discussion points from the transcript with the corresponding PowerPoint slides from the CIPHER-OHDSI PDF:

Overview and Objectives

  • Slide 2: Objectives
    • Jackie Honerlaw introduced the VA CIPHER program’s intent to integrate the OHDSI Phenotype Library, making it accessible to OMOP researchers at the VA.

Integration and Collaboration

  • Slide 4: CIPHER Overview
    • CIPHER (Centralized Interactive Phenomics Resource) aims to accelerate health data research through a publicly accessible phenotype library.
    • Jackie Honerlaw described the evolution of CIPHER, which includes over 6000 definitions across various disease domains and a growing user base beyond the VA.
  • Discussion/Questions:
    • Azza Shoaibi inquired about the process of accumulating 6000 phenotypes and the validation concerns.

Metadata and Tools

  • Slides 8-12: CIPHER Tools and Metadata
    • CIPHER has developed tools to explore phenotype metadata, such as a searchable database of phenotype articles, data visualization tools, and standardized collection methods for phenotype metadata.
    • Jackie detailed the review and validation process for phenotype submission to ensure clarity and completeness without judging the scientific accuracy.

Pilot and Integration Strategy

  • Slides 15-16: OHDSI-CIPHER Library Integration
    • The proposed approach includes identifying five pilot phenotypes to start integrating OHDSI Phenotype Library definitions into CIPHER, aiming to promote OMOP use and facilitate the search and reuse of OHDSI phenotypes.

Discussion on Integration Details

  • Slide 17: Q&A and Next Steps
    • Extended discussion on the technical aspects of phenotype definition integration, including metadata standards and tools for comparison and visualization of phenotypes.
    • Plans were made to create a subgroup to work on integrating OMOP concepts into CIPHER, with a timeline and specific goals for pilot phenotype integration.

Agreements and Future Directions

  • Slide 18: Future Directions
    • Concluding the meeting, there was consensus on the mutual benefits of integrating the OHDSI Phenotype Library into CIPHER.
    • Discussions highlighted the potential for a formal collaboration and further detailed planning in future sessions.

The focus on Jackie’s comments provided insights into CIPHER’s capabilities, the strategic integration of OHDSI’s resources, and the envisioned collaborative efforts to enhance the utility and accessibility of phenotyping tools across research communities.
2024.06.28_CIPHER-OHDSI.pdf (3.1 MB)

If you would like to join the VA CIPHER-OHDSI integration pilot, please see the discussion on the OHDSI MS Teams here
Honerlaw, Jacqueline (Guest): CIPHER-OHDSI Integration Pilot

posted in Workgroup - Phenotype Development and Evaluation / General at Wednesday, July 3, 2024 10:21:47 AM

Hi Phenotype team - this week at the phenotype evaluation workgroup I’d like to go through some of the stuff we’ve been working on for probabilistic phenotyping. Prior efforts have demonstrated the potential for probabilistic phenotyping to more accurately estimate the incidence of drug adverse effects (see Juan Banda et al). The work we’re doing builds on this prior research by creating a standardized process for probabilistic phenotyping using the OHDSI PatientLevelPrediction package. Hope you can join in.

Hi all, For our first CIPHER-OHDSI WG meeting today we will have a demo of CIPHER, walk through CIPHER’s phenotype entry form fields, and discuss the proposed project timeline for entry of pilot OHDSI phenotypes into the CIPHER library. All are invited to join; please see call-in info on Teams. Best, Jackie

Adding screenshot of @jhonerlaw’s post from yesterday. Thank you @jhonerlaw for leading the OHDSI-CIPHER integration.

Generated using GPT

Executive Summary of the Phenotype Development and Evaluation Workgroup Meeting

Date: July 12, 2024

Agenda Highlights:

  • Introduction of new ideas on probabilistic phenotyping by Joel Swerdel.
  • Discussion on current projects including Dermatomyositis, VA CIPHER, and the OHDSI Phenotype Library paper.
  • Planning for global symposium activities and soliciting volunteers.

Presentation Overview:
Joel Swerdel presented on the topic of probabilistic phenotyping, comparing the performance of probabilistic and rule-based phenotype algorithms. His presentation, based on a study showcased at ICPE 2024, demonstrated the improved accuracy of probabilistic models over traditional rule-based methods in estimating drug adverse effects.

Member Engagement:
New and returning members introduced themselves, highlighting the international and interdisciplinary makeup of the workgroup, including participants from Brazil and Portugal. This underscores the global impact and collaborative nature of the workgroup.

Conclusion:
The meeting effectively aligned the workgroup’s activities with its strategic objectives, setting a clear path forward for phenotype development and community engagement. The introduction of new analytical methodologies and the active participation of international members indicate robust progress and promising future initiatives.

Summary of Joel Swerdel’s Presentation on Probabilistic Phenotyping

Presentation Overview:
Joel Swerdel presented his new ideas on probabilistic phenotyping during the Phenotype Development and Evaluation Workgroup meeting.

Key Points from Presentation and Discussion:

  1. Introduction to Probabilistic Phenotyping:

    • Joel introduced the concept of probabilistic phenotyping, contrasting it with traditional rule-based phenotype algorithms, which are the standard but often have low sensitivity and positive predictive value for many conditions.
    • Probabilistic phenotyping, based on logistic regression models, offers an alternative that can potentially yield more accurate and reliable outcomes.
  2. Methodological Approach:

    • The methodology involves developing a LASSO regularized regression model for conditions of interest using noisy labels. This approach was detailed in his slides where he discussed building both rule-based and probabilistic phenotypes for angioedema.
    • Joel illustrated the steps involved in creating and evaluating these phenotypes, including the selection of cohorts, application of the model at various time points, and determination of outcomes using designated probability cut-points.
  3. Results and Comparative Analysis:

    • Joel presented the results from his study, which compared the incidence estimates derived from probabilistic phenotyping to those from rule-based methods against randomized clinical trial outcomes.
    • The probabilistic method showed significantly better alignment with the trial results, demonstrating higher sensitivity and positive predictive value. Specifically, probabilistic methods had 77% of their results within the 95% confidence intervals from clinical trials, compared to only 23% for rule-based methods.
  4. Discussion and Application:

    • During the meeting, Joel emphasized how these findings could influence future research and the development of more effective phenotyping algorithms.
    • He mentioned the potential for this methodology to improve the estimation of drug adverse effects, an area of considerable importance in pharmacoepidemiology.

Conclusion:
Joel Swerdel’s presentation effectively introduced and detailed probabilistic phenotyping, showcasing its advantages over traditional methods through rigorous analysis and practical examples. His insights are poised to significantly impact the approach to phenotyping within the OHDSI community, promoting more accurate and reliable research outcomes.
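
Two of the steps summarized above can be sketched in a few lines: applying a probability cut-point to model output to obtain an incidence estimate, and checking how often such estimates fall inside a trial’s 95% confidence interval. This is a minimal Python illustration, not the presented method. Model fitting (the LASSO regression on noisy labels) is out of scope here, and every probability, estimate, and interval below is a made-up value, unrelated to the 77% and 23% figures reported in the presentation.

```python
def incidence_proportion(probabilities, cut_point):
    """Share of the cohort classified as having the outcome at this cut-point."""
    cases = sum(1 for p in probabilities if p >= cut_point)
    return cases / len(probabilities)


def share_within_ci(estimates, intervals):
    """Fraction of estimates that fall inside their matching (lower, upper) interval."""
    hits = sum(1 for est, (lo, hi) in zip(estimates, intervals) if lo <= est <= hi)
    return hits / len(estimates)


# Hypothetical predicted probabilities for ten patients.
probs = [0.02, 0.10, 0.55, 0.91, 0.03, 0.07, 0.64, 0.01, 0.20, 0.05]
estimate = incidence_proportion(probs, cut_point=0.5)

# Hypothetical incidence estimates paired with hypothetical trial 95% CIs.
estimates = [0.30, 0.12, 0.05, 0.40]
trial_cis = [(0.25, 0.35), (0.15, 0.25), (0.01, 0.06), (0.10, 0.20)]
coverage = share_within_ci(estimates, trial_cis)
```

Raising the cut-point classifies fewer patients as cases and lowers the incidence estimate, which is why the choice of cut-point came up as a point of discussion.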


  1. Critiques:

    • There was a concern about the generalizability of the probabilistic phenotyping approach to different conditions and datasets. Some participants questioned how well these methods would perform across diverse healthcare settings and data types, which might not always be as controlled as those in clinical trials.
    • A critique was raised about the complexity of the methodology, particularly regarding the application of LASSO regularized regression and the selection of appropriate probability cut-points. Concerns were about whether less technically skilled researchers could implement these methods effectively.
  2. Clarifications:

    • Participants sought clarification on the handling of noisy labels and the validation of the probabilistic models. Joel responded by discussing the use of noisy labeled positive and negative controls in developing the supervised learning model, which is crucial for ensuring the robustness of the probabilistic phenotype.
    • There were questions about the specific metrics used to evaluate the performance of probabilistic versus rule-based phenotyping. Joel elaborated on the use of sensitivity and positive predictive value as primary metrics, explaining how these were calculated and why they were chosen.
  3. Suggestions:

    • Some attendees suggested that further studies should include a broader range of conditions and potentially incorporate real-world data from various healthcare systems to test the scalability and adaptability of probabilistic phenotyping.
    • A suggestion was made to enhance the presentation of results in future work by including more detailed statistical analysis and possibly graphical representations to better communicate the findings to a wider audience, including those who might not have a deep statistical background.

Summary of Discussion by Jackie about VA CIPHER

  • Project Overview: Jackie provided an update on the VA CIPHER project, which is focused on leveraging veterans’ health data for enhanced research outcomes.
  • Integration with OHDSI: She discussed how the project aligns with the OHDSI Phenotype Library, emphasizing efforts to standardize data handling and phenotyping within the veterans’ health system.
  • Submission Process: Jackie highlighted the development of a new submission process that would facilitate integrating VA CIPHER data into the OHDSI Phenotype Library, aiming to improve data accessibility and utility across research projects.
  • Collaborative Opportunities: She encouraged collaboration, noting that the integration efforts could serve as a model for other large-scale data harmonization projects within OHDSI.

Summary of Discussion on Dermatomyositis

  • Project Update: The discussion briefly touched on the ongoing research and developments related to Dermatomyositis within the OHDSI community.
  • Research Focus: The main focus is on improving the phenotyping algorithms for identifying Dermatomyositis in large healthcare datasets, aiming to enhance the accuracy and reliability of research outcomes.
  • Collaboration Encouraged: Participants were encouraged to contribute to this project by sharing insights, data, and methodologies that could help refine the existing phenotyping approaches.
  • Integration with Tools: There was mention of integrating these efforts with other OHDSI tools and libraries to ensure a cohesive approach to studying Dermatomyositis across different platforms and datasets.

Summary of Discussion on the OHDSI Phenotype Library (PL) Paper

  • Progress Update: The discussion highlighted the current progress on a paper dedicated to the OHDSI Phenotype Library, emphasizing the need to document and communicate the comprehensive capabilities and applications of the library to a broader audience.
  • Goals and Objectives: The paper aims to serve as a formal communication that can be referenced by other studies, enhancing the visibility and utility of the OHDSI Phenotype Library in various research settings.
  • Collaboration Opportunities: Attendees were encouraged to contribute to the paper, either by co-authoring or providing case studies and examples where the Phenotype Library has been effectively utilized.
  • Timeline and Deadlines: There was a push to prepare a pre-print by November 2024 to facilitate dissemination at upcoming conferences, with AMIA 2024 as the specific target for showcasing the paper.

Today we will continue to work with VA CIPHER (slides from 6/28/2024 attached)
2024.06.28_CIPHER-OHDSI.pdf (3.1 MB)

  • OHDSI to identify 5 phenotypes for testing
  • OHDSI to enter the first phenotype into CIPHER via the phenotype entry webform
  • OHDSI to use the attached bulk upload form to map OHDSI phenotype metadata fields to CIPHER fields
  • CIPHER to review the OHDSI Phenotype Library manuscript draft

Honerlaw, Jacqueline (Guest): CIPHER-OHDSI Integration Pilot 7/11/24

posted in Workgroup - Phenotype Development and Evaluation / General on Wednesday, July 10, 2024 4:27:27 PM

Summary of the attached slides, generated with GPT

Summary of VA CIPHER and OHDSI Phenotype Library Integration

VA CIPHER Overview

  1. Introduction to CIPHER:

    • The Centralized Interactive Phenomics Resource (CIPHER) is a publicly accessible platform funded by the US Department of Veterans Affairs, Office of Research and Development.
    • CIPHER aims to accelerate health data innovation by providing an integrated and interactive knowledge-sharing platform.
  2. CIPHER Features:

    • Phenotype Knowledgebase: A searchable database containing over 6,000 phenotypes.
    • Phenotype Collection Workflow: Standardized collection and curation of phenotype metadata.
    • Data Visualization Tools: Tools for exploring and developing phenotypes, including an interactive map of phenotype prevalence and relationships among diseases, treatments, and procedures.
  3. CIPHER Userbase:

    • CIPHER’s userbase has grown since 2017, expanding its collaborations and knowledgebase.
  4. Future Directions:

    • Enhancing site features, expanding the knowledgebase, integrating additional tools, and expanding partnerships.

OHDSI Phenotype Library Integration

  1. Objectives:

    • Integrate OMOP concepts into CIPHER.
    • Pilot collection of OMOP-based phenotypes with the All of Us (AoU) program.
    • Connect to phenotypes developed by OHDSI experts.
  2. Proposed Approach:

    • Identify five phenotypes to pilot.
    • Collect required phenotype metadata per the CIPHER standard.
    • Enter phenotypes in the knowledgebase, referencing the OHDSI Phenotype Library for specific fields such as programming code.
  3. Goals:

    • Promote OMOP use and the OHDSI Phenotype Library within the CIPHER community.
    • Facilitate search and reuse of OHDSI phenotypes.
  4. Pilot Phenotypes:

    • Includes phenotypes for dementia, Alzheimer’s disease, cognitive impairment, ischemic stroke, stroke events, and transient ischemic attack events.

Key Participants

  • CIPHER Team:
    • Sumitra Muralidhar (VACO Lead)
    • Kelly Cho (Director)
    • Jackie Honerlaw (Deputy Director)
    • Anne Ho (Director for Data Operations)
    • Additional team members across various roles and specialties.

Integration Benefits

  • Interactive and Comprehensive Platform: Combining the comprehensive phenotype definitions from OHDSI with CIPHER’s interactive platform enhances the utility and accessibility of phenotype data for clinical research and healthcare operations.
  • Standardization and Reuse: Ensuring standardized phenotype metadata and facilitating reuse across different research and clinical projects.

References

For more information on CIPHER, visit CIPHER Online or contact CIPHER@va.gov

@jhonerlaw has posted the agenda for today

Honerlaw, Jacqueline (Guest): CIPHER-OHDSI Integration Pilot 7/25/24

posted in Workgroup - Phenotype Development and Evaluation / General on Thursday, July 25, 2024 8:46 AM

We will attempt to review the proposed mapping

Mapping from OHDSI CDM to VA CIPHER
BulkUploadExcelFile_v7Mapping.xlsx

Example of mapped output
vaCipherMapping.csv
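
For anyone who wants to try the idea before opening the spreadsheet, here is a minimal sketch of what a field-level mapping from OHDSI phenotype metadata to CIPHER bulk-upload columns could look like. All field names on both sides (`cohortId`, `Phenotype Name`, and so on) are hypothetical placeholders, not the actual columns in BulkUploadExcelFile_v7Mapping.xlsx or vaCipherMapping.csv; the real mapping is the one in the attached files.

```python
import csv
import io

# Hypothetical field names on both sides, for illustration only.
# The real mapping lives in the attached bulk upload spreadsheet.
OHDSI_TO_CIPHER = {
    "cohortId": "Source Phenotype ID",
    "cohortName": "Phenotype Name",
    "description": "Description",
    "contributors": "Authors",
}


def map_record(ohdsi_record, field_map=OHDSI_TO_CIPHER):
    """Rename OHDSI keys to CIPHER column names; missing values become empty strings."""
    return {
        cipher_field: str(ohdsi_record.get(ohdsi_field, ""))
        for ohdsi_field, cipher_field in field_map.items()
    }


def to_csv(records, field_map=OHDSI_TO_CIPHER):
    """Write mapped records as CSV text with one column per CIPHER field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(field_map.values()), lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(map_record(record, field_map))
    return buffer.getvalue()


# Illustrative record only, not a real Phenotype Library entry.
example = {
    "cohortId": 1,
    "cohortName": "Example phenotype",
    "description": "Illustrative record only",
}
csv_text = to_csv([example])
```

Keeping the mapping in a single dictionary means a reviewer can check it against the spreadsheet row by row, and fields with no OHDSI counterpart show up as empty columns rather than being silently dropped.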
