OHDSI MEETINGS THIS WEEK
OHDSI Community Call - Tuesday at 12pm ET
US TOLL: +1-415-655-0001
Meeting Number: 199 982 907
CDM and Vocabulary Working Group - Tuesday at 1pm ET
US toll-free: +1 (877) 565-9999
Attendee access code: 8810735 36
OHDSI Atlas/WebAPI Working Group - Wednesday at 10am ET
Population-Level Estimation (Western Hemisphere) Workgroup - Thursday at 12pm ET
GIS Working Group Meeting - Next Monday (December 10th) at 10am ET
Meeting Number: 735 317 239
2019 European Symposium - Registration is open for the second annual European Symposium, set to take place March 29th, 2019 in Rotterdam, Netherlands. For more details, including where to register, check out Peter’s post: European OHDSI Symposium Registration now open!
2019 OHDSI F2F - SAVE THE DATE! The 2019 OHDSI F2F will take place on June 3-4, 2019 at Case Western Reserve University in Cleveland, OH. For more details about the meeting, please join tomorrow’s OHDSI call.
2018 OHDSI Symposium Materials - All slides, handouts and abstracts from this year’s symposium have been uploaded here: https://www.ohdsi.org/past-events/2018-ohdsi-symposium-materials/
2018 OHDSI Symposium Recording - Video recordings from the main symposium are available here: https://www.ohdsi.org/2018-ohdsi-symposium-videos/
2018 Tutorial Recordings - Intro tutorial videos are online! Advanced tutorials will be posted shortly.
CDM Tutorial: https://www.ohdsi.org/past-events/2018-tutorials-omop-common-data-model-and-standardized-vocabularies/
OHDSI Ecosystem: https://www.ohdsi.org/past-events/2018-tutorials-overview-of-the-ohdsi-analysis-ecosystem/
You have to put up with the risk of being misunderstood if you are going to try to communicate.
Edie Sedgwick
COMMUNITY PUBLICATIONS
Overview and experience of the YODA Project with clinical trial data sharing after 5 years
Columbia Open Health Data, clinical concept prevalence and co-occurrence from electronic health records
CN Ta, M Dumontier, G Hripcsak, NP Tatonetti and C Weng,
Scientific data, Nov 27 2018
Columbia Open Health Data (COHD) is a publicly accessible database of electronic health record (EHR) prevalence and co-occurrence frequencies between conditions, drugs, procedures, and demographics. COHD was derived from Columbia University Irving Medical Center's Observational Health Data Sciences and Informatics (OHDSI) database. The lifetime dataset, derived from all records, contains 36,578 single concepts (11,952 conditions, 12,334 drugs, and 10,816 procedures) and 32,788,901 concept pairs from 5,364,781 patients. The 5-year dataset, derived from records from 2013-2017, contains 29,964 single concepts (10,159 conditions, 10,264 drugs, and 8,270 procedures) and 15,927,195 concept pairs from 1,790,431 patients. Exclusion of rare concepts (count ≤ 10) and Poisson randomization enable data sharing by eliminating risks to patient privacy. EHR prevalences are informative of healthcare consumption rates. Analysis of co-occurrence frequencies via relative frequency analysis and observed-expected frequency ratio are informative of associations between clinical concepts, useful for biomedical research tasks such as drug repurposing and pharmacovigilance. COHD is publicly accessible through a web application-programming interface (API) and downloadable from the Figshare repository. The code is available on GitHub.
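To make the two association measures named in the abstract concrete, here is a minimal sketch in Python. It is not COHD's code or API. The function names are hypothetical, and the formulas are assumptions: relative frequency is taken as the pair count divided by one concept's count, and the observed-expected ratio is taken as the natural log of the observed pair count over the count expected if the two concepts were independent. The rare-concept filter follows the stated rule (count ≤ 10 is excluded); Poisson randomization is not shown.

```python
import math


def drop_rare(counts, threshold=10):
    """Keep only concepts whose patient count is above the threshold.

    Mirrors the stated exclusion rule (count <= 10 is dropped).
    """
    return {concept: n for concept, n in counts.items() if n > threshold}


def relative_frequency(pair_count, base_count):
    """Share of patients with the base concept who also have the other concept.

    Assumed definition: pair_count / base_count.
    """
    return pair_count / base_count


def log_observed_expected(pair_count, count_a, count_b, n_patients):
    """Natural log of observed over expected co-occurrence.

    Assumed definition: expected = count_a * count_b / n_patients,
    i.e. the pair count if the two concepts were independent.
    """
    expected = count_a * count_b / n_patients
    return math.log(pair_count / expected)


if __name__ == "__main__":
    # Made-up counts for illustration only, not taken from COHD.
    print(relative_frequency(20, 100))
    print(log_observed_expected(20, 100, 50, 1000))
```

Under these assumptions, a log ratio above zero means the pair co-occurs more often than independence would predict, and a value below zero means less often.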
Trends in anesthesiology research: a machine learning approach to theme discovery and summarization.
A Rusanov, R Miotto and C Weng,
JAMIA open, Oct 2018
Traditionally, summarization of research themes and trends within a given discipline was accomplished by manual review of scientific works in the field. However, with the ushering in of the age of "big data," new methods for discovery of such information become necessary as traditional techniques become increasingly difficult to apply due to the exponential growth of document repositories. Our objectives are to develop a pipeline for unsupervised theme extraction and summarization of thematic trends in document repositories, and to test it by applying it to a specific domain. To that end, we detail a pipeline, which utilizes machine learning and natural language processing for unsupervised theme extraction, and a novel method for summarization of thematic trends, and network mapping for visualization of thematic relations. We then apply this pipeline to a collection of anesthesiology abstracts. We demonstrate how this pipeline enables discovery of major themes and temporal trends in anesthesiology research and facilitates document classification and corpus exploration. The relation of prevalent topics and extracted trends to recent events in both anesthesiology, and healthcare in general, demonstrates the pipeline's utility. Furthermore, the agreement between the unsupervised thematic grouping and human-assigned classification validates the pipeline's accuracy and demonstrates another potential use. The described pipeline enables summarization and exploration of large document repositories, facilitates classification, aids in trend identification. A more robust and user-friendly interface will facilitate the expansion of this methodology to other domains. This will be the focus of future work for our group.