
OHDSI virtual study-a-thon to support COVID-19 response, to take place 26-29Mar2020...Collaborators wanted!

(Patrick Ryan) #1


In light of the current uncertainty around COVID-19, we have decided to cancel the in-person OHDSI EU Symposium, which was scheduled to take place 27-29Mar2020 in Oxford, UK. An announcement about this is available here. In lieu of this large meeting event, we have decided to coordinate a virtual OHDSI study-a-thon to take place over the same period that will be focused on generating real-world evidence that can inform the current COVID-19 pandemic response.

We expect there are many ways the OHDSI community can contribute to the current situation:

  • Characterisation of symptoms and complications of viral diseases
  • Prediction of adverse outcomes amongst patients with virus-related hospitalization
  • Comparative safety of treatments being considered or used for COVID-19

Some of these questions can be immediately addressed with historical data from administrative claims, electronic health records and other sources already available to our community. For others, we can be prepared to validate any findings in databases as COVID-19 cases become available.

Over the past several days, several of us have been reaching out to colleagues at CDC, FDA, EMA, NHS, as well as academic experts across the US, Europe, and Korea to understand the current evidence gaps and how retrospective analysis of existing observational data could meaningfully inform public health decisions. While the vast majority of the OHDSI community does not have real-time data collection to support COVID-19 case detection, we have identified several specific questions that OHDSI network analyses of existing data can meaningfully inform. To highlight a few examples:

  • In parts of Korea, some hospitals have reached capacity, in overall beds or within their ICUs, which has resulted in symptomatic patients being turned away without medical care, some of whom subsequently died at home. Therefore, hospitals are looking for more effective ways to triage patients to determine how to prioritize their limited resources. Drs. Seng Chan You and Rae Park (Ajou University; @SCYou @rwpark ), in collaboration with these hospitals, have proposed an OHDSI network prediction study to use influenza as a model viral infection, and to create a risk score for flu-related complications (pneumonia, ICU use, oxygen and ECMO therapy) and mortality that could be used to determine which patients are at highest risk (reaching a greater level of specificity than the current US CDC guidance that ‘Older adults and people who have severe chronic medical conditions like heart, lung or kidney disease seem to be at higher risk for more serious COVID-19 illness’). While we will initially train this prediction model using flu as the model disease, Dr. You is working with Korean government officials to potentially access national claims data in less than one month’s time, which will allow us to apply and validate the prediction model on COVID-19 cases, and consider how this model can be used to supplement current hospital admitting guidelines across the OHDSI Korea network of hospitals.
  • Dr. Marc Suchard (UCLA; @msuchard ) is a leader in the evolutionary biology community examining COVID-19, and has said that folks within his community are particularly concerned about long-term sequelae of aggressive treatments, including ECMO and extended RICU stays, as well as the safety of antivirals that are being considered as candidates for widespread prophylaxis. While OHDSI will not have data to examine the effectiveness of treatments on COVID-19, we can look to characterize the real-world experience of patients who have received these treatments for other viral diseases.
  • Dr. Dani Prieto-Alhambra (Oxford; @Daniel_Prieto ) said colleagues he contacted in Spain are experimenting with use of hydroxychloroquine as a potential treatment, with little evidence to support its application beyond a biologically-plausible hypothesis. Given that we examined the comparative effectiveness of hydroxychloroquine vs. other DMARDs in our RA study-a-thon, we may be able to re-purpose our prior study to examine effects on the incidence of other viral diseases. Dani was told such a study would directly impact ongoing research into these treatments.
  • Multiple groups have expressed that they do not have sufficient confidence in the use of claims and EHRs to study the common symptoms of COVID-19 (including fever, cough, dyspnea, malaise/fatigue) or complications (including pneumonia or acute respiratory distress syndrome). Developing and evaluating phenotypes for the constellation of diagnostic patterns observed during prior viral outbreaks, including the past several flu seasons, could provide important context for understanding the background rate of these symptoms, which can be helpful to guide policy and risk communication. FDA and CDC have both said they will come back to me with more specific analysis needs in anticipation of the OHDSI event.

Many of you in the OHDSI community are well-connected with other public health officials and may have other insights into important questions that we can answer using the OHDSI data network. We need to use the breadth of the OHDSI community to reach out to those who can benefit from reliable real-world evidence to determine their specific needs and how that evidence could be used to impact their decision-making. We also need the depth of expertise across the OHDSI community to take these important public health questions, and translate them into scientific best practice study designs and analysis source code, and we need the power of the OHDSI international data network to execute these analyses and share aggregate summary results so that we can disseminate reliable evidence to those who need it as quickly as possible. I personally feel very compelled to do whatever is in my capacity to help the current COVID-19 epidemic; these are exactly the situations when we have to bring together the best talent and science to do whatever we can for the patients we serve.

How will the virtual OHDSI study-a-thon work?

The study-a-thon will take place from 26Mar2020-29Mar2020. We will form the virtual core team, which will include myself, Dani Prieto-Alhambra (Oxford), Peter Rijnbeek (Erasmus MC @Rijnbeek ), George Hripcsak (Columbia @hripcsa ), Christian Reich (Iqvia @Christian_Reich ), Martijn Schuemie (Janssen @schuemie ), Seng Chan You (Ajou), Rae Woong Park (Ajou), and Marc Suchard (UCLA), who will lead the OHDSI community effort. We will be meeting via TC/web conference with daily sessions planned to work for every timezone to accommodate participation around the world, with particular emphasis on supporting the needs of our colleagues in Korea. We will then have remote sites across North America, Europe and Asia-Pacific regions that will participate through web conferencing and through public exchange of study documentation, analysis code and results. We encourage everyone to be responsible and not incur any unnecessary travel risks, and expect that some individuals may opt to participate from their home, while others may congregate in small groups within their respective institutions.

What can you do right now?

  1. If you are interested in participating in the virtual study-a-thon, block your calendar from 26Mar-29Mar2020. Please post your willingness to participate as a citizen scientist by filling out the Google form here, providing your contact information and sharing what you think you can contribute during the study-a-thon.

  2. If you have access to patient-level data that is formatted in the OMOP CDM, and would be willing to execute OHDSI network analyses against your data and share back aggregate summary results to the global effort, please post your willingness to participate as a data partner in this Google form here. You’ll be asked for contact information and a high-level overview of your database. For those who agree to participate, we’ll send out a small R script that characterizes your data, in terms of population size, years of data capture, longitudinality of follow-up, data domains covered, and vocab version.

  3. If you have research questions to support the COVID-19 response that you think the OHDSI community can potentially answer, then post them on the OHDSI forums. I will start a thread that you can add to for this purpose. For your question, please answer:

  • What is the decision we are trying to inform?

  • Who is the decision-maker?

  • What type of real-world data is needed to generate reliable evidence?

  • How will reliable real-world evidence inform the decision?

  • (we’ll presume we know the answer to ‘When is the evidence needed?’ = as soon as possible!)

    We will be using the responses on this thread to prioritize our collective efforts during the virtual study-a-thon event.

  4. If you know of others who could either potentially benefit or meaningfully contribute to our OHDSI community efforts, please send them this forum thread and encourage them to ‘join the journey’.
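
For data partners wondering what the database characterization mentioned above might involve, here is a minimal sketch. It is not the R script that will be sent out; it is written in Python against an in-memory SQLite database purely for illustration. The table and column names follow the OMOP CDM, but the demo rows, the helper names `build_demo_cdm` and `characterize`, the choice of domain tables, and the assumption that the vocabulary release is stored in the `vocabulary` row whose `vocabulary_id` is 'None' are all illustrative assumptions.

```python
import sqlite3

# Clinical tables checked for "data domains covered" (an illustrative subset of the CDM).
DOMAIN_TABLES = ["condition_occurrence", "drug_exposure", "procedure_occurrence", "measurement"]


def build_demo_cdm():
    """Create a tiny in-memory database with made-up rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY);
        CREATE TABLE observation_period (
            person_id INTEGER,
            observation_period_start_date TEXT,
            observation_period_end_date TEXT);
        CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);
        CREATE TABLE drug_exposure (person_id INTEGER, drug_concept_id INTEGER);
        CREATE TABLE procedure_occurrence (person_id INTEGER, procedure_concept_id INTEGER);
        CREATE TABLE measurement (person_id INTEGER, measurement_concept_id INTEGER);
        CREATE TABLE vocabulary (vocabulary_id TEXT, vocabulary_version TEXT);
        INSERT INTO person VALUES (1), (2);
        INSERT INTO observation_period VALUES
            (1, '2015-01-01', '2015-01-11'),
            (2, '2018-01-01', '2018-01-31');
        INSERT INTO condition_occurrence VALUES (1, 1001), (2, 1001);
        INSERT INTO drug_exposure VALUES (1, 2001);
        INSERT INTO vocabulary VALUES ('None', 'demo-version');
    """)
    return conn


def characterize(conn):
    """Return aggregate, database-level descriptors only (no patient-level rows)."""
    cur = conn.cursor()
    summary = {}
    # Population size.
    summary["persons"] = cur.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    # Years of data capture, taken from the earliest start and latest end of observation.
    first, last = cur.execute(
        "SELECT MIN(observation_period_start_date), MAX(observation_period_end_date) "
        "FROM observation_period"
    ).fetchone()
    summary["first_year"] = int(first[:4])
    summary["last_year"] = int(last[:4])
    # Longitudinality of follow-up: mean length of an observation period in days.
    summary["mean_followup_days"] = cur.execute(
        "SELECT AVG(julianday(observation_period_end_date) "
        "- julianday(observation_period_start_date)) FROM observation_period"
    ).fetchone()[0]
    # Data domains covered: tables that hold at least one record.
    summary["domains_covered"] = [
        t for t in DOMAIN_TABLES
        if cur.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] > 0
    ]
    # Vocabulary version (assumed to live in the row with vocabulary_id = 'None').
    row = cur.execute(
        "SELECT vocabulary_version FROM vocabulary WHERE vocabulary_id = 'None'"
    ).fetchone()
    summary["vocab_version"] = row[0] if row else None
    return summary


if __name__ == "__main__":
    print(characterize(build_demo_cdm()))
```

Everything returned is a database-level aggregate, which is the kind of summary a data partner could share without exposing patient-level data.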

Thank you in advance for coming together for this important effort. Let’s use this challenging time as an opportunity to re-affirm what OHDSI is all about: to improve health by empowering a community to collaboratively generate the evidence that promotes better health decisions and better care.


(Rob Konterman) #3

Dear Patrick and team,

Great that you are pulling together to make this happen. Real world data/evidence will be instrumental in combating this outbreak. I want to let you know that you can use our data capture platform for free for non-commercial studies / registries related to Corona. We enable you to load the WHO eCRFs and get the study running in less than one minute, for more info see:

Or contact me at rob@castoredc.com

Rob Konterman
Castor EDC

(Seng Chan You) #4

We might need a convention to map WHO’s eCRF to OMOP-CDM.

(Julianna Kohler) #5

This event from the Global Digital Health Network may also provide a useful opportunity to understand the response and what questions real world evidence is best suited to answer: https://www.ictworks.org/ict4d-covid-response/#.XmZaTb1Kg2w

(Vojtech Huser) #6

Speaking of CRFs, here is CDC’s form: https://www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf

Also, the script

we’ll send out a small R script that characterizes your data, in terms of population size, years of data capture, longitudinality of follow-up, data domains covered, and vocab version.

Looks like a general need for any study. Similar to prior efforts on MIAD (minimum information about a dataset).

(Keesvanbochove) #7

We can use the clinical trial mapping conventions that we presented/discussed in the Clinical Trials WG. But many of the fields in the eCRF are very specific to tracking this virus outbreak, so we will likely need to add a number of vocabulary terms as well.

(Talita Duarte Salles) #8

Thanks everyone for this great initiative, I’m happy to be on board and help in anything I can. For now, I’m working with @Daniel_Prieto and @edburn to get the approvals needed for obtaining access to the SIDIAP database in Catalonia-Spain for the study-a-thon.

(Alexander Davydov) #10

Thank you. We will try to put everything together identifying the comprehensive list and vocabulary gaps.

Sure, will look into them.

(Ed) #11

Data from Italy which may be of interest: https://github.com/pcm-dpc/COVID-19

(Julianna Kohler) #12

Hi folks–related to the workshop that I referenced earlier, they are starting to collect study questions and plan for the workshop. I am pasting the text from the email I received; given the global representation we have in OHDSI, I think there will be relevant questions across the board to incorporate into the study-a-thon!

As part of Thursday’s COVID Response Workshop from 8:30-Noon Eastern time, we will have Challenges Breakout Sessions where working groups will dive into current Surveillance, Prevention, Diagnosis, and Treatment issues for coronavirus response. 

We will look at each area to develop use cases faced by four different cadres of people: Client, Provider, Manager, and Policy Maker. These use cases will help governments, donors, and health workers understand which digital health application would be the best solution for their context.

Please help us think through the questions each working group should be asking, on this Google Doc:

Thanks in advance,

PS: You RSVP’ed to this event already, right? Over 400 people will be joining us - don’t miss out!

(Darwin) #13

This is a great idea! Is there a current list of volunteers / data partners so we can see who is included? Curious if there are already people from my institution there.

(Aaron Abend) #14

I am very interested in helping with this work – but I do not work at an institution with data. If anyone wants help with this, please let me know.


(Jenny Yeon Hee Kim) #15

As a health data/public health researcher in Seoul, I truly think this is a great opportunity to support the response to the current COVID-19 outbreak. I’d love to join the work and have signed up for the upcoming data-thon! Let me know if anyone needs assistance with any task in the meantime.

(Rimma Belenkaya) #16

“At Mount Sinai School of Medicine, NYC, physicians have identified specific patterns in the lungs as markers of the disease as it develops over the course of a week and a half.” https://www.mountsinai.org/about/newsroom/2020/mount-sinai-physicians-the-first-in-us-analyzing-lung-disease-in-coronavirus-patients-from-china-press-release

They studied about 500 CT scans of Chinese patients along with some additional clinical data.

Mount Sinai has an OMOP instance (which is also a part of the NYC-CDRN); this could be another opportunity to support this effort.

(Seng Chan You) #17

I’ve added the novel Korean vocabulary (KCD-7) codes for COVID-19 here. Standard vocabulary is needed for them.
Sorry most of them are not in English. I’ll translate them.

(Daniel Rubin) #18

@Yeon_Hee_Kim how does one sign up for the data-thon?

(Seng Chan You) #19

@dlrubin Please see below:

(Dmytry Dymshyts) #20

We have just released the March 2020 SNOMED version, which includes novel coronavirus concepts.

To accommodate the immediate need for concepts to represent ideas related to novel COVID-19 diagnostics, treatment and research, IHTSDO released an interim version of SNOMED International on 9 March.

In the OMOP CDM, the SNOMED vocabulary is built from three different sources: the SNOMED International release, the SNOMED US edition and the SNOMED UK edition. The latter two are based on the International release.

We had intended to delay the January release until April to synchronize with the local editions and reduce inconsistencies and standard concept duplication. However, we had to force the update today to include changes related to the novel coronavirus pandemic. Plans to synchronize releases have been delayed until the July release, which will hopefully be finalized this October.

Updates to LOINC and ICD10 are in progress.

Here’s the corresponding github issue.

(Patrick Ryan) #21


Wow, what a difference a week makes! Last week, we announced we’d hold an OHDSI virtual study-a-thon on COVID-19 in lieu of the OHDSI EU Symposium, not knowing exactly how we’d do it or what we’d do and mildly anxious that we only had three weeks to prepare for the event. Ah, the good old days, when the biggest anxiety in life was meeting logistics…

In the last 7 days, I have been tremendously impressed by how the OHDSI community has rallied together to figure out how we can collaborate to collectively generate evidence to promote better health decisions and better care. We are still very much looking forward to the OHDSI virtual study-a-thon on Mar26-29, but it is very clear that the work needed to support public health can’t wait two weeks to start and isn’t waiting, and whatever we do during the study-a-thon isn’t going to be the end of our efforts to inform the COVID-19 pandemic response. Unfortunately, it looks likely that we are going to be on this journey together for quite a while.

The good news is that it’s also become abundantly clear that we can (and will!) make meaningful contributions to the current crisis through responsible analysis across our international data network. There are a lot of open questions that real-world evidence is best equipped to answer, either because observational data is the best or only current source of information, or because retrospective analyses of these data are the most efficient way to generate insights to address the immediate urgency until prospective data collection and research can be completed.

I do want to reinforce the differentiated value that I see we can make as a community. Lots of groups around the world are working hard to do COVID-19 case detection, and there is quite a bit the world will learn from those cases. But much of what we will learn can’t be determined today, rather it will unfold over the next several months as the disease progresses and prospective data collection of the longitudinal experience of these patients is captured. A key strength of our OHDSI community is our ability to design and execute retrospective analysis of existing healthcare data across our international data network. Together, we have access to the historical experience from hundreds of millions of patients, and it is our responsibility to learn whatever we can from those past experiences to inform the current situation…now not later. So, for those of you on the analytics side of the house, I encourage all of you to not idly speculate: ‘what would I want to do if I eventually get access to data with COVID-19 cases?’ but rather take action on the question: ‘how can I learn from the data I currently have access to?’.

And for those of you with a data focus, the efforts you are putting in today to standardize your data, be it conversion of legacy warehouses or real-time data streams, will pay dividends. I am certain that it’ll remain the case that no one data source will have sufficient information to meet our needs, so the only viable path forward is to work together as a network and learn from our collective resources. The prospective data collection underway today will become the retrospective analysis opportunity tomorrow, and we need to be prepared on both the data and analytics side to realize the potential of what we can deliver for the public health good.

A quick update on community activities that I am aware of:

@SCYou is doing heroic work, leading efforts in Korea to use recent national claims data from HIRA, standardizing the HIRA to OMOP CDM and preparing to apply OHDSI analysis packages. Chan, you are a true inspiration to all of us. Whatever we can do to support you, you’ve got it.

@Daniel_Prieto has been in close contact with UK NHS to determine how OHDSI can support their needs. We are hopeful that we will be able to use more recent CPRD and HES data to inform our current activities.

@mvanzandt is leading efforts to get data refreshed across various Iqvia datasets so that the most recent data can be made available for distributed analysis. I have heard that others in the community are also hard at work trying to get whatever is the most recent data possible accessible in their OMOP CDM instance. Thank you for these efforts, they will make a difference.

@Christian_Reich, @Alexdavv , @Dymshyts and the entire vocabulary team are hard at work critically reviewing the source codes and standard concepts, with particular focus on diagnosis of viral disease, symptoms and complications, respiratory procedures, and associated measurements. In addition to adding new standard concepts for COVID-19 diagnosis and tests, they will be revising concept_relationships to enable more accurate analysis. Christian will announce to the community when a new vocab is released, which we will encourage everyone in the OHDSI community to download and then refresh their ETL so that we’re all on the same page.

@jennareps, @Rijnbeek and I have begun designing patient-level prediction studies. Across the world, towns, states, and countries are all aiming to ‘flatten the curve’, by employing a series of public health measures aimed at delaying the communal transmission of COVID-19 such that the number of infected patients is sufficiently spread out over time that the capacity of our healthcare system can accommodate the needs of the ill. Certainly all people should be heeding the advice of their government and public health officials; useful information from US CDC is here: https://www.cdc.gov/coronavirus/2019-ncov/index.html. In situations where demand for health services exceeds supply, prioritization tools based on disease severity become valuable, and education for the public about why they shouldn’t seek unnecessary care can make a difference. We are proposing 3 patient-level prediction studies that can inform this discussion:

  1. Amongst T: persons with an outpatient visit (GP, urgent care, ER) who have flu or flu-like symptoms, who are O: persons who are admitted to hospital with flu or pneumonia, in TAR: 0d-30d from outpatient visit? The goal of this prediction is to identify, based on the medical history prior to the first encounter, which patients are likely going to need hospitalization.

  2. Amongst T: persons with an outpatient visit (GP or urgent care) who have flu or flu-like symptoms but do not have pneumonia and who are not admitted to hospital on the same day or next day, who are O: persons who are admitted to hospital with pneumonia, in TAR: 2d-30d from outpatient visit? The goal with this analysis is to help reassure the public that if they are sent home, the likelihood of something bad happening is low, so they should follow doctor’s advice and CDC recommendations for self-care.

  3. Amongst T: persons with inpatient admission with pneumonia due to viral origin, who are O1: persons requiring intensive care services or O2: persons who die, TAR: during the hospital stay? The goal here is to use medical history information prior to admission to help with triaging those who arrive at hospital to determine which cases are likely to be more severe.
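
To make the T / O / TAR framing concrete, here is a minimal sketch of how a binary outcome label could be derived for each person in a target cohort. It is only an illustration in plain Python, not the OHDSI PatientLevelPrediction package; the function name `label_outcomes`, the dictionary-based inputs, and the example dates are hypothetical.

```python
from datetime import date, timedelta


def label_outcomes(target_index, outcome_dates, tar_start=0, tar_end=30):
    """Label each target-cohort member 1 if an outcome falls in the time-at-risk window.

    target_index:  dict of person_id -> index date (e.g. the outpatient visit), i.e. T
    outcome_dates: dict of person_id -> list of outcome dates (e.g. admissions), i.e. O
    tar_start, tar_end: time-at-risk bounds in days after the index date, inclusive
    """
    labels = {}
    for person_id, index_date in target_index.items():
        window_start = index_date + timedelta(days=tar_start)
        window_end = index_date + timedelta(days=tar_end)
        events = outcome_dates.get(person_id, [])
        labels[person_id] = int(any(window_start <= d <= window_end for d in events))
    return labels


if __name__ == "__main__":
    # Made-up example: three people with an outpatient visit for flu-like symptoms.
    visits = {1: date(2020, 1, 1), 2: date(2020, 1, 1), 3: date(2020, 1, 1)}
    admissions = {1: [date(2020, 1, 2)], 2: [date(2020, 2, 15)]}
    print(label_outcomes(visits, admissions, tar_start=0, tar_end=30))
    print(label_outcomes(visits, admissions, tar_start=2, tar_end=30))
```

In this toy data, person 1 is admitted one day after the visit, so they count as an outcome under a 0d-30d window but not under a 2d-30d window. That is the distinction between the first two proposed studies.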

@schuemie and @Daniel_Prieto have begun designing population-level estimation studies. Based on feedback we’ve gathered from clinicians across the world, it seems several areas are attempting use of various agents as treatment for COVID-19-positive patients, such as hydroxychloroquine, antivirals like protease inhibitors and remdesivir, and immunosuppressants like tocilizumab, despite little immediate COVID-19 efficacy evidence and uncertainty about the long-term safety of these medicines in these off-target populations. Prophylactic use of these medicines for asymptomatic patients who were potentially exposed to COVID-19 is being discussed. These developments further increase the need to more comprehensively understand the real-world effects of these products overall, within subpopulations with existing viral disease, and among subgroups at higher risk for COVID-19 complications. Comparative cohort and self-controlled analyses will be developed to examine causal effects of exposure on incidence of viral disease and associated complications, as well as safety outcomes.

We recognize the need to design characterization studies. FDA has encouraged us to prioritize work to delineate the natural history of the disease where possible, but we also believe that developing and evaluating phenotypes for the constellation of diagnostic patterns observed during prior viral outbreaks, including the past several flu seasons, would be useful as foundational research that can be used not only for the current outbreak but also for research and surveillance of similar viral diseases in general. @Christian_Reich and I are coordinating the development and evaluation of phenotype algorithms to identify persons with viral diseases and associated symptoms and complications, as well as anticipated treatments, which can be used as the cohort inputs into our characterization, estimation, and prediction studies. The quality of our analyses is largely dependent on the quality of our phenotypes, so it is important that we use clinical expertise, data domain knowledge, and empirical tools to produce definitions that we can consistently apply across the community, and not end up with an array of different half-baked definitions that will add noise to our evidence generating process. We will be using the new CohortDiagnostics package to support our iterative development, and PheValuator to estimate measurement error where appropriate. We will use atlas.ohdsi.org as a landing place for all finalized conceptsets and cohort definitions (thanks @lee_evans for your help in administrating that!).

How can you help?

  1. If you are interested in participating in the study-a-thon, but haven’t yet signed up, please do so here.

  2. If you are one of the 103 people who have already signed up for the study-a-thon, thank you! If you marked that your expertise is in literature review and evidence synthesis, then you can expect to receive a private email from me enlisting your support on some preparatory research. So the whole community knows what I’ll be asking, there are already multiple prediction models that have been created in and around the space of community-acquired pneumonia, but they vary substantially by their target population, outcome definition, decision context, what information is needed as predictors, and the extent to which they were developed or evaluated on observational data like we have within our community. We need to summarize this prior knowledge to inform our future prediction designs. Also, past work has been published in defining and validating phenotypes for viral disease, symptoms, and complications, but it has not been compiled together to examine the heterogeneity of approaches and determine a best practice moving forward. We need to summarize what is known about the safety and effectiveness of treatments under consideration for COVID-19 to find the specific gaps that we can fill. If we can do an OHDSI-community-crowd-sourced literature review on these topics, it will lay a solid foundation for our future research together. We need leaders to coordinate the community effort, so that’ll be my ask.

  3. If you have access to patient-level data in the OMOP CDM and haven’t yet done so, please run the OHDSI concept prevalence study, led by @aostropolets . We are using this aggregate summary information to directly inform the phenotype development process to ensure that as we define conceptsets, we are representing all concepts that are actually captured across the community as completely as possible. It has already proven invaluable to figure out the different ways that flu-like symptoms are showing up around the world, and will only increase in value as more and more of you participate.

  4. If you have access to patient-level data formatted in the OMOP CDM and can execute analyses to support the cause, please sign up here. All analyses will be prepared as R packages, which you’ll be able to build and execute in RStudio, and which will produce aggregate summary statistics (no patient-level data) which can be shared with the study coordinator for synthesis across the participating data partners. If you haven’t set up your local environment yet to run an OHDSI network study, useful resources are available in the Book of OHDSI, and Anna’s OHDSI concept prevalence study would be a great first study to cut your teeth on.
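
For those new to network studies, the idea of sharing only aggregate concept counts can be sketched roughly as follows. This is not the code of the concept prevalence study itself; it is a plain Python illustration under stated assumptions: the function name `concept_prevalence`, the `(person_id, concept_id)` input format, the placeholder concept ids, and the minimum cell count of 5 are all hypothetical choices, and each site would apply whatever small-cell rule its own governance requires.

```python
from collections import defaultdict


def concept_prevalence(records, min_cell_count=5):
    """Aggregate (person_id, concept_id) records into per-concept counts.

    Returns concept_id -> {"records": n, "persons": m}. Concepts seen in fewer than
    min_cell_count distinct persons are dropped, so no small cells leave the site.
    """
    record_counts = defaultdict(int)
    persons_seen = defaultdict(set)
    for person_id, concept_id in records:
        record_counts[concept_id] += 1
        persons_seen[concept_id].add(person_id)
    return {
        concept_id: {"records": record_counts[concept_id], "persons": len(persons)}
        for concept_id, persons in persons_seen.items()
        if len(persons) >= min_cell_count
    }


if __name__ == "__main__":
    # Made-up rows; 1001 and 1002 are placeholder ids, not real OMOP concepts.
    rows = [(p, 1001) for p in range(1, 6)] + [(1, 1001), (1, 1002), (2, 1002)]
    print(concept_prevalence(rows))
```

Only the resulting dictionary of counts would be shared; the patient-level rows never leave the local environment.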
