EHR Dream Challenge: Patient Mortality


(Kristin Kostka, MPH) #21

Hi All - there’s a team page now: https://www.synapse.org/#!Team:3397227

If you want to be affiliated, please sign up. Here are the instructions: https://www.synapse.org/#!Synapse:syn18405991/wiki/595485

(Alexander Sivura) #22

Hi everyone. I have some time this weekend and I’m thinking of setting up the infrastructure for making submissions to the challenge. Are we going to keep all the code open during the challenge, or will we publish it afterwards? Do we have a GitHub repo for our code? If not, I can create one.

(Rohit Vashisht) #23

Hi @alexander
Thank you.
It’s best practice to keep everything open!

You can go ahead and create a Git repo as well. Once it is all finalized, I think we can integrate it with the OHDSI sandbox.

I was going through the docker tutorial as well. What I could gather so far is the following:
a) Create a Docker Compose YAML file.
b) The file should be able to pull the Postgres base image (available on Docker Hub).
c) Mount the local data directory containing the synthetic data.
d) Create an in-memory database in the container.
e) Make the database talk to the PLP package.

The whole Docker setup can then be built into an image that can be submitted.

I’m not sure how best to do step (e). Still reading.


  • Rohit

(Alexander Sivura) #24

I see. My GitHub email is asivura@gmail.com . Can you create the repo and invite me there, please?

(Rohit Vashisht) #25

Here is the GitHub repo for trial and error: https://github.com/rohit43/ohdsiDream

Nothing there yet.
I’ll send the invite to everyone who has entered (or will enter) their email in the Google Doc.


  • Rohit

(Kristin Kostka, MPH) #26

@alexander I saw an email saying you made a submission. Please clear this with the rest of the team before entering a formal submission. I can forward you the failed workflow notification.

(Alexander Sivura) #27

@krfeeney Yeah, thanks for sharing. I got that email as well. I’m trying to figure out their infrastructure and get submission results for a dummy model. I’ll open a GitHub pull request for it today or tomorrow.

(Alexander Sivura) #28

Hi there. I spent some time figuring out the challenge requirements for wrapping code into a Docker container.

Please check my code in this branch: https://github.com/rohit43/ohdsiDream/tree/dummy-docker-container . I created a PR and hope it will be merged into the master branch. This code just creates a predictions.csv file with a score of 0.5 for each patient. The most important thing is to make sure we do everything according to their requirements.
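
For anyone who wants to see the idea without opening the branch, here is a minimal Python sketch of such a dummy model. It is not the code from the PR. The column names `person_id` and `score`, the input file `person.csv`, and the helper name `write_dummy_predictions` are assumptions made for illustration; please check the challenge wiki for the exact output format.

```python
import csv


def write_dummy_predictions(person_csv, out_csv, score=0.5):
    """Write one row per distinct person_id, all with the same constant score.

    Assumptions (not confirmed by the challenge docs quoted in this thread):
    the input CSV has a 'person_id' column, and the output needs the
    columns 'person_id' and 'score'.
    Returns the number of patients written.
    """
    seen = set()
    with open(person_csv, newline="") as src, open(out_csv, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow(["person_id", "score"])
        for row in reader:
            pid = row["person_id"]
            if pid in seen:
                continue  # skip duplicate patients
            seen.add(pid)
            writer.writerow([pid, score])
    return len(seen)
```

Calling `write_dummy_predictions("person.csv", "predictions.csv")` would then produce the baseline file; both paths are placeholders for wherever the container mounts its data and output directories.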

So they will not give us access to the real data. We’ll have to wrap our code for training and evaluating our model into a Docker container and submit it for execution on their server. We can’t even see the log messages that our code generates while it is running on their server.

I don’t think our code will be able to connect to any external Internet resources. If we are going to work with a Postgres database, we’ll have to create it from the csv files every time the container runs.
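
To illustrate the "rebuild the database from csv files on every run" idea, here is a rough Python sketch. It uses SQLite's in-memory mode from the standard library as a stand-in for Postgres, so it only shows the loading pattern, not the setup we would actually submit. The helper name `load_csv_dir` and the choice to store every column as TEXT are assumptions for illustration.

```python
import csv
import sqlite3
from pathlib import Path


def load_csv_dir(data_dir):
    """Load every *.csv in data_dir into a fresh in-memory SQLite database.

    Each file becomes one table named after the file, with all columns
    stored as TEXT. SQLite is a stand-in here; a Postgres version would
    need a running server and typed CDM tables.
    """
    conn = sqlite3.connect(":memory:")
    for path in sorted(Path(data_dir).glob("*.csv")):
        table = path.stem
        with open(path, newline="") as fh:
            reader = csv.reader(fh)
            header = next(reader, None)
            if not header:
                continue  # empty file, nothing to load
            # Table and column names are interpolated into SQL, so only
            # accept plain identifiers.
            for name in [table, *header]:
                if not name.isidentifier():
                    raise ValueError(f"unsafe identifier: {name!r}")
            columns = ", ".join(f'"{c}" TEXT' for c in header)
            placeholders = ", ".join("?" for _ in header)
            conn.execute(f'CREATE TABLE "{table}" ({columns})')
            conn.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})', reader
            )
    conn.commit()
    return conn
```

Because the database lives only in memory, it disappears when the process exits, which matches the constraint that nothing persists between runs on their server.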

For debugging, they recommend using the synthetic data they provide. The real-world data will be in the same format.

Feel free to reproduce my code on your machines and let me know if something goes wrong or if you have any questions. I added some instructions to the README.md.

Kind regards

(Alexander Sivura) #29

I was wrong. They do give you access to your script’s logs after the Docker container has run in their environment.

(Alexander Sivura) #30

I spent some time yesterday reading the documentation of the PLP package. This tool is amazing!

(Chungsoo Kim) #31

Thank you @alexander,
I read your script on GitHub and it’s great. I think it’s a good start. I have successfully installed Docker.

I can also build a PLP model. I think we should also consider how to build a model that is not overfitted to a specific database, especially because mortality is a very unbalanced outcome.
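
A small Python sketch of why the unbalanced outcome matters when we compare models. The numbers are made up (2 deaths among 100 patients), and using AUROC as the yardstick is an assumption on my side, not something confirmed in this thread. A model that gives everyone the same low score looks excellent on accuracy but has no ability to rank patients.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of patients classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Quadratic in the number of patients, which is
    fine for a toy example but not for real data.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up cohort: 2 deaths out of 100 patients, everyone gets the same score.
labels = [1] * 2 + [0] * 98
constant_scores = [0.1] * 100

acc = accuracy(labels, constant_scores)    # 0.98: looks great
auc = auroc(labels, constant_scores)       # 0.5: no better than chance
print(f"accuracy={acc:.2f} auroc={auc:.2f}")
```

So a high accuracy on its own tells us very little here; a ranking-based measure makes the lack of signal visible.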

@krfeeney I think we need a proper distribution of roles for each person. :grinning:

(Alexander Sivura) #32

I realized that PLP has already been wrapped in a Dockerfile by Seng Chan You: https://github.com/ABMI/ohdsi-docker/tree/master/patientlevelprediction_cpu/latest. We can reuse the Dockerfile from his repo.

Thanks @SCYou for that.

Here is my PR https://github.com/rohit43/ohdsiDream/pull/5

(Seng Chan You) #33

You’re welcome @alexander :slight_smile:

(Rohit Vashisht) #34

I could build the image without errors! In addition, I added a pull of the Postgres image to the Dockerfile.
I haven’t tried connecting to the database or building it from the .csv files yet.

The idea would be to create an ‘in-memory’ (i.e. in-container) CDM v5 or above database and then connect PLP to it.


(Alexander Sivura) #35

Hi Rohit.

I think your latest changes broke the build. Could you revert them, please? You can’t inherit a Docker image from two different places at the same time. It’s not going to work that way…

Currently, we use rocker/rstudio:latest as the base Docker image.
The rstudio image is based on rocker/r-ver.
The r-ver image is based on debian:buster.

To add Postgres to the Dockerfile, it would be better to figure out how to install Postgres on Debian and add those commands to our Dockerfile.

I believe that will be much easier than using Postgres as the base image and adding all the dependencies of RStudio and PLP there.

Kind regards,

(Rohit Vashisht) #36

Hi Alex,

Did you build the new image? I didn’t get any errors in the build when I included Postgres. What I could gather was that we can use R as the base and also pull Postgres, or vice versa. I’m not sure what I am missing.
Should I go ahead and use Postgres as the base and put R on top of that?


  • Rohit

(Alexander Sivura) #37

I see. Let me read more about it. I’ve never seen more than one FROM instruction in a Dockerfile before.

(Isaiah Nyabuto) #38

Hi @alexander, I am following this closely, though I am a bit lost. Which resources are you looking at? Where can I find information about PLP, and is there a particular reason why we are setting this up? Thanks

(Alexander Sivura) #39

Hi @INyabuto. Using PLP was not my plan, but I like it. Please check Rohit’s post from September 27th in this topic.

You can find more information about PLP here https://github.com/OHDSI/PatientLevelPrediction

(Alexander Sivura) #40

It seems it is possible to use the FROM instruction more than once in a Dockerfile: https://docs.docker.com/develop/develop-images/multistage-build/ . But I’ve never done it before :slight_smile:

As far as I understand from that page, each FROM starts a new build stage and only the last stage ends up in the final image, so we should double-check that both R and Postgres are actually available in the image we submit.

I tried building the Docker image. It works!

So we now have everything set up to write some code for creating a Postgres database instance in memory, loading the csv files into it and running PLP against it.