
New study proposal - prediction of hospitalization for covid19 amongst younger people

(Daniel Prieto-Alhambra) #1

Dear all,
I’ve been discussing a bit with @jreps, @msuchard, @Patrick_Ryan, @SCYou, @krfeeney, @hripcsa and others the possibility of training new models for the prediction of hospital admission in younger people with covid. There are two reasons I think this is important (and calls for a whole different model vs COVER): 1. there is an over-representation of young people amongst those affected in the upcoming/ongoing 2nd wave; and 2. the clinician still beating inside me tells me predictors of outcomes amongst youngsters are going to be very different.
This is therefore the T, TAR and O I would propose:

  • T - people aged <50 years diagnosed with COVID19 or tested+, and not admitted to hospital on or before that same day (ie diagnosed in an outpatient setting)
  • TAR - <=30 days after index date (index date being the earlier of a clinical diagnosis or a test+)
  • O - I would propose we try to train models for O1 hospital admission and O2 death, although I suspect the number of O2 will probably be too small…
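
To make the T / TAR / O proposal concrete, here is a minimal sketch of how the index date, target-cohort membership and the 30-day outcome flags could be derived. It is illustrative only: the `Patient` record and its field names are hypothetical (not OMOP CDM tables), `admission_date` is assumed to be the first COVID-related admission, and "outpatient" is read here as "not admitted on or before the index date".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Patient:
    # Hypothetical flat record; a real study would query the OMOP CDM instead.
    age: int
    diagnosis_date: Optional[date] = None
    positive_test_date: Optional[date] = None
    admission_date: Optional[date] = None  # assumed: first COVID-related admission
    death_date: Optional[date] = None


def index_date(p: Patient) -> Optional[date]:
    """Index date = the earlier of clinical diagnosis or positive test."""
    dates = [d for d in (p.diagnosis_date, p.positive_test_date) if d is not None]
    return min(dates) if dates else None


def in_target(p: Patient, max_age: int = 50) -> bool:
    """T: aged < max_age, with an index date, and not admitted on or before it."""
    idx = index_date(p)
    if idx is None or p.age >= max_age:
        return False
    return p.admission_date is None or p.admission_date > idx


def outcome_in_tar(p: Patient, event_date: Optional[date], tar_days: int = 30) -> bool:
    """O within TAR: event strictly after index and at most tar_days later."""
    idx = index_date(p)
    if idx is None or event_date is None:
        return False
    return idx < event_date <= idx + timedelta(days=tar_days)


example = Patient(
    age=34,
    diagnosis_date=date(2020, 10, 5),
    positive_test_date=date(2020, 10, 3),
    admission_date=date(2020, 10, 12),
)
o1 = in_target(example) and outcome_in_tar(example, example.admission_date)  # O1
o2 = in_target(example) and outcome_in_tar(example, example.death_date)      # O2
```

The same `outcome_in_tar` helper serves both O1 and O2, so trying a longer TAR for one of them only means changing `tar_days`.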

Anyone interested in contributing and joining this venture? I’m “looking” at my fav prediction wizards (@RossW @Rijnbeek plus the above mentioned) but also at anybody else with knowledge and/or data potentially useful for this (@tduarte @Sara_Khalid1 @scottduvall and many others)

Join This Journey!!

(Nigel Hughes) #2

Would it be too difficult to also define and predict longer term outcomes of COVID-19 long-haulers, who suffer extended morbidity, impact on working, etc.?

(Daniel Prieto-Alhambra) #3

i think that would be relevant but probably for everybody (including those >50) as many long-haulers will have symptoms due to organ failure (eg heart failure or ckd)… so probably a different research question? i’d be delighted to be involved in exploring that but probably when we have longer follow-up in the data? beware we’re looking into this with @Juan_Banda (see preprint here)

(Nigel Hughes) #4

Okay - it appears that long-haulers have characteristic aspects of chronic post-viral syndrome, and not exclusively organ failure, with cases also seen in younger people who may not have been hospitalised, but were symptomatic.

In any event a bit of a stretch probably.

(Daniel Prieto-Alhambra) #5

yes i think we still need to understand what long-covid really is before trying to create good phenotypes and predicting who will suffer it. for now it looks like a mixture of post-viral fatigue and some sort of chronic pain + a mixture of cardiac and respiratory symptoms

(Kristin Kostka, MPH) #6

+1 on this topic. We have so much to learn on the under 50 population.

Agreed. We struggle to get death data into the CDM even when we incentivize people to capture it. We’ll have to see what’s possible. I don’t expect death to be well captured. @esholle probably has some ideas.

There’s a whole world between Admission and Death though. Maybe there’s another O in there. That said ECMO, ventilation and lots of procedure data are getting tied up in flow sheets (aka trapped in EHRs). 🤔

(Peter Rijnbeek) #7

Do we already have an idea how big O1 is for this group in the bigger datasets?

(Daniel Prieto-Alhambra) #8

Based on Charybdis i would expect around 5% of total diagnosed. In sidiap this is in the order of 2,500-3,000

(Benjamin Skov Kaas Hansen) #9

I’d be happy to join the work! We (still, sadly) don’t have a useful OMOP CDM with our data, so I’d be able to help out mainly with the methodological aspects of the analysis and writing.

I brought this up before (when discussing the Seek COVER paper), but I think we should consider a time-to-event prediction model instead of a binary one. A simple way would be to use a discrete-time survival analysis, which can be done with a standard GLM, or an accelerated failure time regression model (to keep a continuous outcome variable). Another way would be to train a Hidden Markov Model with, say, a number of outcomes (could be modelled with daily steps): at general ward, at ICU, discharged, dead. At least, it would give some methodological edge to the abundance of COVID19 papers out and around.
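
To illustrate the discrete-time idea, here is a minimal sketch of the person-period expansion it relies on: each subject contributes one row per day at risk, with a binary label that is 1 only on the day of the event. A binary GLM fitted to those rows is then a discrete-time hazard model. To stay dependency-free, the sketch does not fit a GLM; it computes the per-day hazard as events divided by number at risk, which is what a logistic model with only day indicators and no covariates would estimate. The input tuples and function names are hypothetical.

```python
from collections import defaultdict


def person_period(subjects, horizon=30):
    """Expand (subject_id, days, event_observed) into one row per day at risk.

    `days` is the day (>= 1) of the event or of censoring. Follow-up is
    truncated at `horizon`, so events after the horizon count as censored.
    """
    rows = []
    for sid, days, event in subjects:
        last = min(days, horizon)
        for day in range(1, last + 1):
            y = 1 if (event and day == days) else 0
            rows.append((sid, day, y))
    return rows


def discrete_hazards(rows):
    """Per-day hazard = events on that day / subjects still at risk that day."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _sid, day, y in rows:
        at_risk[day] += 1
        events[day] += y
    return {day: events[day] / at_risk[day] for day in sorted(at_risk)}


def survival_curve(hazards):
    """Survival after day t = product over days <= t of (1 - hazard)."""
    surv, curve = 1.0, {}
    for day in sorted(hazards):
        surv *= 1.0 - hazards[day]
        curve[day] = surv
    return curve


# Toy data: (id, day of event or censoring, event observed?)
subjects = [("a", 2, True), ("b", 3, False), ("c", 1, True), ("d", 40, True)]
rows = person_period(subjects, horizon=3)
hazards = discrete_hazards(rows)
curve = survival_curve(hazards)
```

With covariates, the same `rows` table (plus one column per predictor) could be passed to any logistic regression implementation, with `day` entered as a factor or a smooth term.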

(Daniel Prieto-Alhambra) #10

thanks ben. although i agree a time-to-event model would be useful, we are here talking of a short timeframe (up to 30d) so coming to it from a clinical epi perspective I do not see a lot of value in changing modelling strategy

I also discussed this with Prof Gary Collins (prof of prediction modelling here at oxford) and he thought a logit or similar binary model would be fine given the short timeframe

(Gregory Klebanov) #11

Hi Dani. The University of Washington (UWM) and Odysseus would be glad to join the study. We literally just finished the UWM COVID19 data ETL into OMOP, should be a good size data set.

(Daniel Prieto-Alhambra) #12

Awesome! Thanks!!

(Andrew Williams) #13

We (Tufts MC) would like to participate also.

(Daniel Prieto-Alhambra) #14

grand!! will be in touch

(Jose Posada) #15

hi @Daniel_Prieto

We would like to contribute as well.

I am tagging @jenwilson521 from our lab, who is also interested.


(Daniel Prieto-Alhambra) #16

Amazing, look fwd to it!

(Adam Black) #17

I’d like to help as well.

(Sara Khalid) #18

Hi Dani

Thanks - I am happy to join this study. Agreed on logistic regression for imminent outcomes and time-to-event for longer time frames.


(Vojtech Huser) #19

I am interested, as a non-data contributor. Possibly even with N3C data (as data site).

(Fredrik) #20

I’d be interested to contribute too in terms of study design, interpretation etc.
I wonder if TAR for death should be more than 30 days (45? 60?) - my view of the disease is that it can develop and remain serious for longer, with some patients in Sweden according to reports even spending >30 days in intensive care and ventilator care.
I also agree with Kristin there may be some outcome between hospitalization and death to consider.