
Big Data search for effective medications


(Robert Clark) #1

OHDSI collected data on nearly 1,000,000 cases of people on hydroxychloroquine. They released a research report on HCQ’s safety. However, I am dismayed they haven’t used this data to answer the key question: is hydroxychloroquine protective against COVID-19?

Back in March when the news first broke that HCQ might be curative or protective against COVID-19, there were only a few hundred COVID-19 deaths in the U.S. Now it’s a hundred times that. The key importance of the big-data approach for determining the effectiveness of a medication is that the question can be answered in a matter of days, whereas setting up and conducting a randomized, controlled drug trial would take months. And in the case of an emerging epidemic undergoing exponential growth, every day is vital.

Note that the importance of this goes beyond just HCQ. For instance, interferon has also been reported in some small trials to be effective against COVID-19. Interferon is used by many people to treat illnesses such as hepatitis and multiple sclerosis, so big data could also be used to determine whether interferon is protective against COVID-19.

And if you think about it, you realize big-data can be used to find effective medications for any disease.

See:

Robert Clark


Robert Clark
Dept. of Mathematics
Widener University
One University Place
Chester, PA 19013 USA



Research questions that the OHDSI community can potentially answer to support the COVID-19 response
(Julianna Kohler) #2

There is an intention to study HCQ as it pertains to COVID, but we have to actually have enough patients who have COVID who also are taking HCQ to be able to study the question.


(Alan Goldhammer) #3

@Robert_Clark There really isn’t a point in OHDSI doing this. There are so many clinical trials of hydroxychloroquine going on that I have lost count. In the last two weeks Duke University and the University of Pennsylvania have launched large ones in healthcare workers. Duke has a $50M grant to set up a registry in addition to the trial: https://today.duke.edu/2020/04/duke-lead-50-million-study-covid-19-prevention-health-care-workers . Our local hospital, which is a Johns Hopkins affiliate and right across the street from the National Institutes of Health, is routinely using hydroxychloroquine + azithromycin, which is not a good idea from a safety perspective. For me the bigger concern is that, with all the hydroxychloroquine that is being used, it’s going to be very difficult to conduct normal clinical trials.

One thing I posted on one of the threads last week was the possibility of looking in New York City case records at the outcomes of patients already taking emtricitabine/tenofovir or lopinavir/ritonavir combinations for HIV treatment or prophylaxis. Toronto is going to start a ring trial with lopinavir/ritonavir soon. I don’t know how easy it would be to do the HIV drug outcome study in New York, but the patient population suggests it is a reasonable thing to do.


(Nigel Hughes) #4

And this reinforces the challenge with HCQ being used widely without sufficient data: Is Hydroxychloroquine Making COVID-19 Clinical Trials Harder? - https://www.medscape.com/viewarticle/928719. Novartis announced today a study of HCQ +/- AZM with a placebo arm.

In terms of the HIV drugs, the in vitro data are not convincing, and indeed Chinese studies (albeit complicated by, e.g., use of intrabronchial IFN or indeed HCQ) have been disappointing with LPV/RTV, for instance: https://www.nejm.org/doi/full/10.1056/NEJMoa2001282

Certainly, being able to observe in real world patients potential differences in infection, disease severity and/or outcomes in people taking certain drugs could help substantiate in vitro data inasmuch as where to look and study deeper.

Ultimately, in my mind it’s all about timing: can we find evidence, in the midst of a significant lack of e.g. prospective data and challenges to clinical equipoise, that informs planning and clinical decision-making today and not in several tomorrows?


(Robert Clark) #5

Thanks for the info. That’s the origin of the suggestion I made that the FDA put forth a recommendation that all doctors treating COVID-19 patients send the patients’ health histories to a central database, with patients’ privacy protected by a randomized number.
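As an illustration of the "randomized number" idea, here is a minimal sketch of how a submitting site might swap patient identifiers for random tokens before sending records to a central database. The function name `pseudonymize`, the `patient_id` field, and the record layout are hypothetical, chosen only for this example; nothing here describes an existing FDA or OHDSI system.

```python
import secrets

def pseudonymize(records, id_field="patient_id"):
    """Return copies of the records with the identifier replaced by a random token.

    The same patient always maps to the same token, so a patient's history
    stays linked, but the token carries no information about the patient.
    """
    mapping = {}  # original id -> random token (would stay with the submitting site)
    used = set()  # tokens already handed out
    pseudonymized = []
    for record in records:
        original = record[id_field]
        if original not in mapping:
            token = secrets.token_hex(8)
            while token in used:  # regenerate on the (very unlikely) collision
                token = secrets.token_hex(8)
            used.add(token)
            mapping[original] = token
        copy = dict(record)  # leave the caller's record untouched
        copy[id_field] = mapping[original]
        pseudonymized.append(copy)
    return pseudonymized, mapping
```

Replacing the identifier is only one part of protecting privacy; other direct identifiers such as names and addresses would also have to be removed before records are shared.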

Robert Clark


(Robert Clark) #6

Thanks for that update on those trials. However, I’m always in favor of more data. Instead of the few hundred or few thousand cases in a drug trial, collecting all the health histories would yield hundreds of thousands to millions of cases, giving an unprecedented level of statistical strength to the conclusions you can draw.
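To put a rough number on "statistical strength", the sketch below computes the half-width of a normal-approximation 95% confidence interval for a proportion at two sample sizes. The 10% event rate is a hypothetical value chosen for illustration, not a figure from any study, and the calculation covers sampling error only; it says nothing about bias or data quality.

```python
import math

def ci_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical event rate of 10%, at a trial-sized and a registry-sized n.
for n in (1_000, 1_000_000):
    print(n, ci_half_width(0.10, n))
```

Because the half-width shrinks with the square root of the sample size, going from a thousand to a million cases narrows the interval by a factor of about 32.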

Also, as I said, time is paramount. I maintain that if this order to send in all COVID-19 patient histories had been made in March, when there were only a few tens of thousands of cases and a few hundred deaths, the question of whether HCQ is curative or protective could have been answered within a matter of days. IF it were curative or protective, tens of thousands of lives could have been saved. But just as importantly, if not more so, the data provided could have been used to find medications that might have been even better.

Keep in mind that a key aspect of the proposal is not just to check medications already suggested to be effective, but to find ones not even suspected of being effective.

Robert Clark


(Dalia) #7

I’d agree with Robert that there definitely is an added value of having this analysis undertaken by OHDSI. It would be helpful too in validating the results of the clinical trials when they come out. If the data is available, then it would be great to use it.

Dalia Dawoud


(Anna) #8

Just heard of a database newly created for COVID-19 patients: https://covid19researchdatabase.org/


(Kristin Kostka, MPH) #9

Hi @Robert_Clark!
OHDSI is moving as quickly as possible to produce reliable evidence from the data available from its partners.

Many OHDSI collaborators are actively participating in covid.cd2h.org. Huge props to Chris Chute and team (@DaveraGabriel) at Hopkins for leading this effort!

Other national efforts would be great. We applaud the communities actively rallying to create these large data sets.

In parallel, @msuchard @conovermitch @schuemie @Daniel_Prieto @Patrick_Ryan @hripcsa and others are extensively evaluating the methodologies we use for large scale observational analysis and their appropriateness for addressing these kinds of questions. Just as important as it is to get data, it’s important we use the right methods for making these inferences. If we’re being honest, it’s going to require some work to do this right.

As my favorite professor in undergrad used to say, “first one to figure it out should tell the rest.” :wink: If you’ve got insight on methods development or are a data owner, feel free to respond to @CraigSachson and others when we post information on calls for participation in various efforts.


(Alan Goldhammer) #10

Talk to Oracle; they said they were creating something that doctors could use. Remember, there has been so much community usage of hydroxychloroquine that lupus and RA patients who need the drug can’t find it. How do you capture the health status of these individuals in the absence of a robust testing regime and a universal EHR system? Are community physicians even inputting data somewhere? One of the reasons OMOP was created back in 2005 was to be able to set up queries across disparate datasets, and one of the problems addressed early on was figuring out vocabularies and definitions (and this is still going on today). I’m very sympathetic to this line of study, but the bigger concern is that promising drugs are not going into trials, nor are they being used in community settings.

Maybe the two antiviral combinations I mentioned don’t work. We have had one published study from China, and I have not seen any of the other ongoing lopinavir/ritonavir trials being shut down. As noted, Toronto just got approval for a prophylaxis study, which was not addressed in the Wuhan paper.

My point still stands. There should be data in New York that could be observationally examined. I’m not proficient enough to say how difficult this would be.

OHDSI already did one study showing that it is unwise to co-administer azithromycin with hydroxychloroquine. It’s doubtful that finding is being followed.


(JD Liddil) #11

So as someone who was in Pharma and has done clinical trials, I tend to follow what Fauci says. We need a real trial, not someone illegally promoting off-label usage and combining it with other drugs, particularly in patients who have various comorbidities. We are going to kill folks for no reason. Remember, drugs (and vaccines) need to be safe and effective. Just because governments have mismanaged this epidemic does not mean health care workers should willy-nilly give drugs. This is not the dark ages. Big Data are great IF you have good data entry and abstraction. Having dealt with data abstraction, I can say a lot of what is in Cerner, EPIC, etc. is crap.


(JD Liddil) #12

My interactions with Oracle and their original EDC system left a bad taste in my mouth.


(Kristin Kostka, MPH) #13

We’re very fortunate to have Columbia University as the OHDSI Coordinating Center. These data are already actively being contributed to COVID-19 specific analyses.

100%, @jliddil1. Biggest piece of what we’re doing right now across the OHDSI network is engaging data partners to understand their ETL processes, coding variations and quality checks. @clairblacketer led a great demo yesterday for the CD2H team to talk about how the DQ Dashboard and similar activities could help shed light on data quality. You know we always welcome your input. :slight_smile:


(JD Liddil) #14

Often both in-house and contract data entry people do a poor job. We get EMR dumps for cancer treatment that are full of errors: data issues, disease coding issues, etc. Since there are no validation checks on the front end, as there are with an EDC system, we constantly go back to centers to correct data. Since they are “busy” they rarely get around to correcting things. We are now using some NLP and machine learning to attempt to deal with these issues. Perhaps moving forward we can have systems to deal with the next pandemic. Right now we are scrambling to deal with a poorly coordinated approach to the disease and testing. The tests themselves have reliability issues, so even confirming someone has the virus is an issue.


(Alan Goldhammer) #15

Fantastic on the antivirals. I’ll look forward to seeing the results. I’m so excited by this whole activity and the way everything has come together. :+1:


(Christian Reich) #16

Are you saying these are due to mistakes by the folks doing data manipulations, or are these data entry problems at the point of care?

We are working on the Data Quality Dashboard, which should give you some semblance of edit checks. But the quality will never reach clinical trial data.


(JD Liddil) #17

On the data entry side. So garbage in, garbage out.


(Robert Clark) #18

Thanks for that. I’ll give it a try.

Robert Clark


(Robert Clark) #19

Thanks @krfeeney for the link to covid.cd2h.org.

Robert Clark


(Robert Clark) #20

In reading over the report, “Safety of hydroxychloroquine, alone and in combination with azithromycin, in light of rapid widespread use for COVID-19”, it appears the patients whose data was collected weren’t tested for COVID-19 at the time. So it would take an additional expense to track these patients down and administer COVID-19 tests. In other words, it would not just be a matter of data collating. And even then you wouldn’t know when the patients contracted it.

There is also a puzzling line in the report in Table 1, p. 16 in the list of patient health conditions:

It’s the line in the selection above showing the percentage of “Acute respiratory disease”. The numbers are quite high across the table. But HCQ is now given in the 6 industrialized nations that the cases were taken from for disorders like lupus and rheumatoid arthritis, not for things like malaria. I don’t believe ARDS has a high incidence in those illnesses. So was this high incidence of ARDS because a large number of these cases were taking HCQ as treatment for COVID-19?

That complicates matters if so. Because we’re trying to find cases that were already taking HCQ long term and then see if they contract COVID-19, not those who started taking HCQ because they had already contracted COVID-19.
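The distinction drawn here, between patients already on HCQ long term and those who started it only after contracting COVID-19, can be sketched as a simple date comparison. The function name, the category labels, and the 90-day threshold for "long term" are all assumptions made for illustration; they do not come from the OHDSI report.

```python
from datetime import date

def classify_hcq_exposure(hcq_start, covid_date, min_prior_days=90):
    """Classify HCQ exposure relative to a patient's COVID-19 diagnosis date.

    hcq_start is the first recorded HCQ date, or None if never recorded.
    min_prior_days is an assumed cutoff for calling prior use "long term".
    """
    if hcq_start is None:
        return "never_exposed"
    if hcq_start >= covid_date:
        # HCQ began on or after diagnosis: likely given as COVID-19 treatment
        return "started_after_diagnosis"
    if (covid_date - hcq_start).days >= min_prior_days:
        return "prior_long_term"
    return "prior_recent"

# Example: a patient on HCQ since early 2019, diagnosed in April 2020.
print(classify_hcq_exposure(date(2019, 1, 1), date(2020, 4, 1)))
```

Only the `prior_long_term` group would bear on the protective question; the `started_after_diagnosis` group bears on the separate question of treatment.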

We might still be able to use this fact, however, if we do have access to these patients’ follow-up health histories. We could see how many recovered from COVID-19 on HCQ and on the various combinations with and without HCQ.

Because of these extra complications I still prefer the idea of the FDA putting out a request to all doctors treating COVID-19 patients to send in their health histories to see how many of them were on HCQ. This would be just a matter of data collating, so much cheaper to do.

Robert Clark

