OHDSI MEETINGS THIS WEEK
Gold Standard Phenotype Library WG meeting - Tuesday at 9am ET
URL: https://gatech.webex.com/webappng/sites/gatech/meeting/info/128704664173552569?MTID=mdd4af3e9b84212fc7df3eb0150703df5
The Book of OHDSI working group meeting - Tuesday at 11am ET
Zoom URL: https://columbiauniversity.zoom.us/j/258043190
OHDSI Community Call - Tuesday at 12pm ET
URL: https://meetings.webex.com/collabs/#/meetings/detail?uuid=M59X2V1U61WC9ASID2Z5N3UT95-D1JL&rnd=96139.930901412523321531221112212141232121131213113112112121536
Pharmacovigilance Evidence Investigation (PEI) WG call - Wednesday at 9am ET
URL: https://meet.lync.com/jnj-its/evoss3/Q7P48H1D
ATLAS workgroup meeting - Wednesday at 10am ET
URL: https://jjconferencing.webex.com/webappng/sites/jjconferencing/meeting/info/129429150951851372?MTID=mca074ae1ff6eb7d80245d36b59188c80
Population-Level Estimation WG (Western Hemisphere) - Thursday at 12pm ET
URL: https://meetings.webex.com/collabs/#/meetings/detail?uuid=M3T9BZV9RSB6YNDM8WDDZMI19D-D1JL
You can find a full list of upcoming OHDSI meetings here: https://docs.google.com/document/d/1X0oa9R-V8cwpF1WQZDJOqcXZguPKRiCZ6XrQ2zXMiuQ/edit
ANNOUNCEMENTS
Looking for presenters for upcoming OHDSI community calls: We are looking for collaborators to share their work on upcoming OHDSI calls. If you are interested in presenting, please email me at beaton@ohdsi.org
2019 OHDSI Symposium - TUTORIALS: There’s still time to register for tutorials at this year’s OHDSI Symposium. Tutorials are set to take place on September 15th and 17th. More details about the tutorials being offered are available here: https://www.ohdsi.org/tutorialworkshops2019/
Register for tutorials here: https://www.ohdsi.org/tutorialregistration2019/
2019 OHDSI Symposium - Women in Real-World Analytics Leadership Forum: As part of the 2019 OHDSI Symposium, the Women of OHDSI group will host a leadership forum from 6-8pm on Sunday, September 15th at the Bethesda North Marriott in North Bethesda, MD. For more details and to RSVP, check out our event page: https://www.ohdsi.org/2019-women-in-real-world-analytics-leadership-forum/
Always remember that you are absolutely unique. Just like everyone else.
Margaret Mead
COMMUNITY PUBLICATIONS
Standardized Observational Cancer Research Using the OMOP CDM Oncology Module.
R Belenkaya, M Gurley, D Dymshyts, S Araujo, A Williams, R Chen and C Reich,
Studies in health technology and informatics, Aug 21 2019
Observational research in cancer requires substantially more detail than most other therapeutic areas. Cancer conditions are defined through histology, affected anatomical structures, staging and grading, and biomarkers, and are treated with complex therapies. Here, we show a new cancer module as part of the OMOP CDM, allowing manual and automated abstraction and standardized analytics. We tested the model in EHR and registry data against a number of typical use cases.
FAIR Principles for Clinical Practice Guidelines in a Learning Health System.
TI Leung and M Dumontier,
Studies in health technology and informatics, Aug 21 2019
The learning health system depends on a cycle of evidence generation, translation to practice, and continuous practice-based data collection. Clinical practice guidelines (CPGs) represent medical evidence, translated into recommendations on appropriate clinical care. The FAIR guiding principles offer a framework for publishing the extensive knowledge work of CPGs and their resources. In this narrative literature review, we propose that FAIR CPGs would lead to more efficient production and dissemination of CPG knowledge to practice.
Extending Achilles Heel Data Quality Tool with New Rules Informed by Multi-Site Data Quality Comparison.
V Huser, X Li, Z Zhang, S Jung, RW Park, J Banda, H Razzaghi, A Londhe and K Natarajan,
Studies in health technology and informatics, Aug 21 2019
Large healthcare datasets of Electronic Health Record data have become indispensable in clinical research. Data quality in such datasets has recently become a focus of many distributed research networks. Although data quality is specific to a given research question, many existing data quality platforms show that general data quality assessment at the dataset level (given a spectrum of research questions) is possible and highly requested by researchers. We present a comparison of 12 datasets and an extension of the Achilles Heel data quality software tool with new rules and data characterization measures.
Crowdsourcing Public Opinion for Sharing Medical Records for the Advancement of Science.
C Weng, T Hao, C Friedman and J Hurdle,
Studies in health technology and informatics, Aug 21 2019
This study used Amazon Mechanical Turk to crowdsource public opinions about sharing medical records for clinical research. The 1,508 valid respondents comprised 58.7% males, 54% without college degrees, 41.5% students or unemployed, and 84.3% under 40 years old. More than 74% were somewhat willing to share de-identified records. Education level, employment status, and gender were identified as significant predictors of willingness to share one's own or one's family's medical records (partially identifiable, completely identifiable, or de-identified). Thematic analysis applied to respondent comments uncovered barriers to sharing, including the inability to track uses and users of their information, potential harm (such as identity theft or healthcare denial), lack of trust, and worries about information misuse. Our study suggests that implementing reliable medical record de-identification and emphasizing trust development are essential to addressing such concerns. Amazon Mechanical Turk proved cost-effective for collecting public opinions with short surveys.
SNOMEDtxt: Natural Language Generation from SNOMED Ontology.
O Lyudovyk and C Weng,
Studies in health technology and informatics, Aug 21 2019
SNOMED Clinical Terms (SNOMED CT) defines over 70,000 diseases, including many rare ones. Meanwhile, descriptions of rare conditions are missing from online educational resources. SNOMEDtxt converts ontological concept definitions and relations contained in SNOMED CT into narrative disease descriptions using Natural Language Generation techniques. Generated text is evaluated using both computational methods and clinician and lay user feedback. User evaluations indicate that lay people prefer generated text to the original SNOMED content, find it more informative, and understand it significantly better. This method promises to improve access to clinical knowledge for patients and the medical community and to assist in ontology auditing through natural language descriptions.
Network Analysis of Citation in Hypertension Clinical Guidelines.
Y Park, HW Kim, SC You, G Hripcsak, HE Cho, JH Han, SJ Shin and RW Park,
Studies in health technology and informatics, Aug 21 2019
Recently, the two most influential clinical guidelines for diagnosing and treating hypertension were published in the US and Europe: the 2017 American College of Cardiology/American Heart Association (ACC/AHA) guideline and the 2018 European Society of Cardiology/European Society of Hypertension (ESC/ESH) guideline. Though the two have much in common, differences in their details have confused many clinicians around the world. Because guidelines are evidence-based literature, these similarities and differences can be explained by analyzing the articles the guidelines cite. Bibliometric analysis is a method of quantifying the contents of literature in order to analyze it. Using bibliometric analysis, including co-citation network analysis, we analyzed the articles cited in the guidelines. We found that bibliometrics can analyze the influence of countries, authors, and studies on the guidelines, which might affect the similarities and differences between the two guidelines.
Integration of FHIR to Facilitate Electronic Case Reporting: Results from a Pilot Study.
BE Dixon, DE Taylor, M Choi, M Riley, T Schneider and J Duke,
Studies in health technology and informatics, Aug 21 2019
Current approaches to gathering sexually transmitted infection (STI) case information for surveillance efforts are inefficient and lead to underreporting of disease burden. Electronic health information systems offer an opportunity to improve how STI case information can be gathered and reported to public health authorities. To test the feasibility of a standards-based application designed to automate STI case information collection and reporting, we conducted a pilot study in which electronic laboratory messages triggered a FHIR-based application to query a patient's electronic health record for details needed for an electronic case report (eCR). Out of 214 cases observed during a one-week period, 181 (84.6%) could be successfully confirmed automatically using the FHIR-based application. Data quality and information representation challenges were identified that will require collaborative efforts to improve the structure of electronic clinical messages as well as the robustness of the FHIR application.
Construction of Disease Similarity Networks Using Concept Embedding and Ontology.
DH Wei, T Kang, HA Pincus and C Weng,
Studies in health technology and informatics, Aug 21 2019
Discovering disease similarities is beneficial for the diagnosis and treatment of mental diseases. In this research, we proposed a data-driven method that integrates a variety of publicly available data resources, including the Unified Medical Language System (UMLS) Metathesaurus, Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT), and cui2vec concept embeddings, to construct a mental disease similarity network. The resulting network offered a new view for navigating and investigating disease relations; it also revealed popular mental diseases in the literature in terms of the number of connections and similarities with other diseases. It shows that depressive disorder is directly connected with nine other popular diseases and connects 52 other diseases in the network. The top three popular mental diseases are depressive disorder, dysthymia (now known as persistent depressive disorder), and neurosis. Future research will focus on studying the clusters generated from the similarity network.
Detecting Systemic Data Quality Issues in Electronic Health Records.
CN Ta and C Weng,
Studies in health technology and informatics, Aug 21 2019
Secondary analysis of electronic health records for clinical research faces significant challenges due to known data quality issues in health data observationally collected for clinical care and the data biases caused by standard healthcare processes. In this manuscript, we contribute methodology for data quality assessment by plotting domain-level (conditions (diagnoses), drugs, and procedures) aggregate statistics and concept-level temporal frequencies (i.e., annual prevalence rates of clinical concepts). We detect common temporal patterns in concept frequencies by normalizing and clustering annual concept frequencies using K-means clustering. We apply these methods to the Columbia University Irving Medical Center Observational Medical Outcomes Partnership database. The resulting domain-aggregate and cluster plots show a variety of patterns. We review the patterns found in the condition domain and investigate the processes that shape them. We find that these patterns suggest data quality issues influenced by system-wide factors that affect individual concept frequencies.
A Privacy-Preserving Infrastructure for Analyzing Personal Health Data in a Vertically Partitioned Scenario.
C Sun, L Ippel, J van Soest, B Wouters, A Malic, O Adekunle, B van den Berg, O Mussmann, A Koster, C van der Kallen, C van Oppen, D Townend, A Dekker and M Dumontier,
Studies in health technology and informatics, Aug 21 2019
It is widely anticipated that the use and analysis of health-related big data will enable further understanding and improvements in human health and wellbeing. Here, we propose an innovative infrastructure, which supports secure and privacy-preserving analysis of personal health data from multiple providers with different governance policies. Our objective is to use this infrastructure to explore the relation between Type 2 Diabetes Mellitus status and healthcare costs. Our approach involves the use of distributed machine learning to analyze vertically partitioned data from the Maastricht Study, a prospective population-based cohort study, and data from the official statistics agency of the Netherlands, Statistics Netherlands (Centraal Bureau voor de Statistiek; CBS). This project seeks an optimal solution accounting for scientific, technical, and ethical/legal challenges. We describe these challenges, our progress towards addressing them in a practical use case, and a simulation experiment.
A plea to stop using the case-control design in retrospective database studies.
MJ Schuemie, PB Ryan, KKC Man, ICK Wong, MA Suchard and G Hripcsak,
Statistics in medicine, Aug 22 2019
The case-control design is widely used in retrospective database studies, often leading to spectacular findings. However, results of these studies often cannot be replicated, and the advantage of this design over others is questionable. To demonstrate the shortcomings of applications of this design, we replicate two published case-control studies. The first investigates isotretinoin and ulcerative colitis using a simple case-control design. The second focuses on dipeptidyl peptidase-4 inhibitors and acute pancreatitis, using a nested case-control design. We include large sets of negative control exposures (where the true odds ratio is believed to be 1) in both studies. Both replication studies produce effect size estimates consistent with the original studies, but also generate estimates for the negative control exposures showing substantial residual bias. In contrast, applying a self-controlled design to answer the same questions using the same data reveals far less bias. Although the case-control design in general is not at fault, its application in retrospective database studies, where all exposure and covariate data for the entire cohort are available, is unnecessary, as other alternatives such as cohort and self-controlled designs are available. Moreover, by focusing on cases and controls it opens the door to inappropriate comparisons between exposure groups, leading to confounding that the design has few options to adjust for. We argue that this design should no longer be used in these types of data. At the very least, negative control exposures should be used to prove that the concerns raised here do not apply.