I’m getting an error when installing CohortMethod telling me that package ‘CohortMethod’ is not available for R version 3.5.1. Please advise.
As @msuchard suggested here, I think Docker images can facilitate reproducibility of OHDSI research and help OHDSI researchers get OHDSI R packages running on their computers (actually, if they use a Docker image, they don’t need to install the packages themselves).
What do you think, @schuemie?
I built a toy example of a Docker image for CohortMethod version 3.0.2 based on rocker/verse:3.5.1 (which uses R version 3.5.1).
You can see the Dockerfile here (though it’s not tidy, the code is very simple).
I checked that CohortMethod can be loaded in this image, but I still need to test that it can run an actual analysis.
Anyway, you can run the Docker image with the command below:
$ docker run --name cohortmethod -e USER=user -e PASSWORD=password1 -d -p 8787:8787 chandryou/cohortmethod:3.0.2
After starting the image, you can open RStudio in your browser.
Please replace 22.214.171.124 with your IP address.
After using RStudio, you can stop the Docker container:
$ docker stop cohortmethod
Comparative Effectiveness Study of Febuxostat versus Allopurinol in Gout
I’m sorry. I should’ve mentioned that we already have a great Docker image for the OHDSI R Method Library.
Are those docker images versioned?
@schuemie which docker images do you mean?
As far as I know, the Broadsea Method Library Docker images are not versioned.
I plan to make a series of CohortMethod Docker images, one for each version of the package.
And I’ll build Docker images for OHDSI studies on top of the versioned CohortMethod or PLP Docker image.
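To make the "one image tag per package version" idea concrete, here is a rough shell sketch. The helper names image_ref and run_rstudio are made up for illustration, and only the 3.0.2 tag is mentioned in this thread; any other tag is an assumption until such an image is actually published.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and tags other than
# chandryou/cohortmethod:3.0.2 are assumed, not confirmed to exist.

# image_ref <package> <version>
# Prints the image reference, e.g. chandryou/cohortmethod:3.0.2
image_ref() {
    printf 'chandryou/%s:%s\n' "$1" "$2"
}

# run_rstudio <package> <version> [host_port]
# Starts the image in the background and maps RStudio (container port 8787)
# to the given host port (default 8787).
run_rstudio() {
    port="${3:-8787}"
    docker run --name "$1" -e USER=user -e PASSWORD=password1 -d \
        -p "${port}:8787" "$(image_ref "$1" "$2")"
}
```

For example, run_rstudio cohortmethod 3.0.2 should be equivalent to the docker run command posted earlier in this thread, and a study could record the exact tag it was run with.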
We are operating in a completely locked down user environment at my institution and do not have docker or the ability to communicate externally with a code repository like git. Can someone please share just the R code that was used to replicate the Graham Study in the online tutorial?
Hi @BrettRSouth. I’m afraid it is not that simple. The Graham study R script has dependencies on R packages both in CRAN and on Git. If you can’t install those it won’t work.
Also, perhaps a simpler first step would be to reproduce the study in the CohortMethod vignette (which unfortunately still has the same dependencies).
@BrettRSouth In that case, the easiest way to install the OHDSI packages might be:
- Install R on a personal computer with the same OS (Windows, Mac, or Linux) as the server. The R version should be the same as the server’s.
- Install the R packages on the personal computer, and then copy the library folder to the server (on Windows, it is in the My Documents -> R folder).
This is how we do it for a completely off-line server at a remote institution.
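On Linux or Mac, the copy step above could be sketched roughly as follows. This is only an illustration: the function names and all paths are placeholders, and it assumes both machines have the same OS and R version, as described above. Running .libPaths() in R shows where the library folder actually lives.

```shell
#!/bin/sh
# Sketch only: pack an R library folder on the connected machine and unpack
# it on the off-line server. Function names and paths are hypothetical.

# pack_library <library_dir> <archive.tar.gz>
pack_library() {
    # -C switches into the library folder so the archive holds relative paths
    tar -czf "$2" -C "$1" .
}

# unpack_library <archive.tar.gz> <target_library_dir>
unpack_library() {
    mkdir -p "$2"
    tar -xzf "$1" -C "$2"
}
```

On the connected machine you would run something like pack_library "$HOME/R/library" /tmp/r-library.tar.gz, move the archive to the server by whatever transfer method your institution allows, and then run unpack_library on the server with the server’s library folder as the target.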
Thank you @t_abdul_basser
With the help of @NEONKID, I made the CPU-version Docker images of ‘cohortmethod’ and ‘patientlevelprediction’ tidier.
In the PLP Docker image, I installed Python 3, TensorFlow, Keras, and Torch.
Again, you can run the Docker images for the CohortMethod and PLP packages as below:
$ docker run -e USER=user -e PASSWORD=password1 -d -p 8787:8787 chandryou/cohortmethod:3.0.2
$ docker run -e USER=user -e PASSWORD=password1 -d -p 8787:8787 chandryou/patientlevelprediction:3.0.0
I checked that I can run a study package based on the CohortMethod package in the Docker image.
I checked that I can run the keras package in the PLP Docker image.
I’m now struggling with OhdsiRTools issues in the PLP Docker image.
I’ve never used Docker before, so I tried to install it. Since I don’t have Windows 10 (apparently a requirement for the Windows Docker Desktop), I installed the ‘Docker Toolbox’. After installation, I get a “Looks like something went wrong in step ‘looking for vboxmanage.exe’” message. Googling that message hints at problems with environment variables.
So my first impression is that Docker makes installation harder, not easier.
Of course it would be harder for you, @schuemie, because it’s not hard for you to install CohortMethod yourself. Using a Docker image is cumbersome for me, too, because I already have the OHDSI tool environment on the server.
I should admit that I don’t know whether Docker can be the ultimate solution for the installation problems with OHDSI tools, either.
Still, another reason I insist on Docker is reproducibility.
Recently I tried to run the Coxib vs NSAIDs package, and I failed even though I reinstalled almost all the packages as described on GitHub. Besides the OHDSI tools, other packages and R itself evolve. I think one Docker image per OHDSI study could be a better way to ensure reproducibility.
As always, I don’t know much about computer science or Docker, so I want to hear others’ opinions. Thank you @schuemie for your comment!
Well, the Coxib vs NSAIDs package is truly ancient, so I’m not surprised there are problems there.
Just so you know, I’m also experimenting with this new function in OhdsiRTools. You can point it at an OHDSI GitHub repo of a study that has the environment snapshot stored in its inst/settings folder (using the insertEnvironmentSnapshotInPackage function). It will automatically install all the required versions specified in the snapshot, both those from Github and CRAN. For example:
I also realize there’s no central spot with instructions on how to set up R to run the OHDSI packages. I’ve created a first version here. Hopefully this will grow over time, and it is something we can reference from the various packages.
The OhdsiRTools PLP issue about logging has been solved in the efficient branch on GitHub.
Now that PLP has been mentioned: PLP has a more complicated environment because of Python and GPUs. Again, I believe Docker can support PLP, too.
late to this discussion but here…
The ARACHNE Execution Engine (EE) was created with the idea that someone needs to execute an R package (or other code) in a clean environment that is often disconnected from the internet, where dependencies cannot be easily downloaded (typical in many enterprise environments where internet traffic is blocked). The EE itself is Docker-based and deployed as part of ARACHNE on each data node participating in the network. Inside, it has a copy of all core OHDSI libraries required to execute an OHDSI R package. Every time a new request is sent to execute R code, it creates a new clean copy of the R environment, executes the R code, and grabs and sends the execution results back. The EE supports the PLP and PLE methods.
Right now the ARACHNE EE component has an API (used by the ARACHNE Data Node) that can be used to submit code and receive results back. In an upcoming release we will be adding a small UI so that folks can simply upload R code and receive results back. We are also planning to introduce versioning of the OHDSI libraries sometime later this year.
Another thing that we have had good success with is packrat (the original idea came from @schuemie, actually). Using packrat allows you to create a self-contained R package, including ALL required dependencies, so that folks in internet-shielded environments can still execute it without needing to connect to the internet to download and resolve dependencies. I would propose that we ask all OHDSI studies to publish a versioned packrat bundle so that it can be executed not only everywhere but also at any time in the future.
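As a rough illustration of the packrat workflow, the two steps could be wrapped like this. It is a sketch, not how ARACHNE does it: the function names and paths are hypothetical, it assumes R and the packrat package are installed on both machines, and it assumes the study project is not yet under packrat control. As far as I understand, the bundle carries package sources by default, so the off-line machine still needs whatever build tools those packages require.

```shell
#!/bin/sh
# Sketch only: thin wrappers around packrat calls made through Rscript.
# Function names and paths are hypothetical. Use absolute paths.

# bundle_study <project_dir> <bundle.tar.gz>
# Run on a machine with internet access: snapshots the project's
# dependencies into a private library, then packs project and sources.
bundle_study() {
    Rscript \
        -e 'args <- commandArgs(trailingOnly = TRUE)' \
        -e 'packrat::init(project = args[1], enter = FALSE)' \
        -e 'packrat::bundle(project = args[1], file = args[2])' \
        "$1" "$2"
}

# unbundle_study <bundle.tar.gz> <target_dir>
# Run on the internet-shielded machine: unpacks the bundle and restores
# the private library from the bundled sources.
unbundle_study() {
    Rscript \
        -e 'args <- commandArgs(trailingOnly = TRUE)' \
        -e 'packrat::unbundle(bundle = args[1], where = args[2])' \
        "$1" "$2"
}
```

A study could then publish the resulting bundle file under a version number, which is the "versioned packrat bundle" proposed above.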
Would be happy to connect you to our development team if someone wants more details.
It is really awesome, @gregk!
I’m working inside the Korean CMS (HIRA), which has an internet-shielded environment. I didn’t know that packrat could be a solution.
If the R version itself is upgraded after one or two years, can packrat still recreate the same environment?
Well, that is a good question. In my mind, I always try to draw a parallel to the Java world, which is a bit more familiar to me and more advanced when it comes to packaging and distribution:
- Java run-time -> R Engine
- Java code (logic) -> R Code
- Jar/War file -> Packrat (packaging code and all private dependencies)
Similarly to Java (or other languages), R code has dependencies on the R engine and the system libraries that come with it. So, if the R engine version is upgraded but all functions used in the R code are still there, the code works. However, if someone tries to run the R code on a newer version of the R engine and something it relies on was deprecated, it will fail.
But this problem is easy to fix since R distributions are well versioned and you can always get the older version (or the right version) of R. For example https://ftp.harukasan.org/CRAN/
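To illustrate, here is a small sketch that builds the download URL for a specific R source release. The helper name r_source_url is made up, and the src/base/R-<major>/ layout is what the main CRAN site uses; I am assuming mirrors such as the one above follow the same layout.

```shell
#!/bin/sh
# Sketch only: r_source_url is a hypothetical helper. The directory layout
# src/base/R-<major>/R-<version>.tar.gz is assumed to hold on the mirror used.

# r_source_url <version> [mirror_base_url]
r_source_url() {
    version="$1"
    mirror="${2:-https://cran.r-project.org}"
    major="${version%%.*}"    # "3.5.1" -> "3"
    # ${mirror%/} drops a trailing slash so the URL has no double slash
    printf '%s/src/base/R-%s/R-%s.tar.gz\n' "${mirror%/}" "$major" "$version"
}
```

For example, curl -fLO "$(r_source_url 3.5.1)" would fetch the R 3.5.1 sources, which is the version the original question in this thread was about.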