Hey all, I think this is a really cool and important conversation! Dependency management is often overlooked. I think if we can get this right as the community grows, we will find ourselves dealing with far fewer growing pains!
I just wanted to add a couple of thoughts about Docker images. I like the idea of self-contained analysis + dependencies, but I worry a bit about needing to manage a store of image layers in perpetuity. Docker images built from a Dockerfile are not reproducible by default: running apt-get install inside a Dockerfile fetches the most recent version of a package at build time, so building the same image twice can produce two different binary images.
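To make that drift concrete, here is a toy Python sketch. It does not call Docker or apt; it only models the idea. The names `INDEX_JANUARY`, `INDEX_JUNE`, `resolve`, and `image_digest` are hypothetical, and the version strings are made up for illustration. An unpinned install is modeled as "take whatever the index offers today", and the digest is just a hash of the resolved package set.

```python
import hashlib

# Hypothetical snapshots of a package index at two points in time.
INDEX_JANUARY = {"curl": "7.88.1", "libpq": "15.1"}
INDEX_JUNE = {"curl": "8.1.2", "libpq": "15.3"}


def resolve(recipe, index):
    """Resolve each requested package; a pin of None means 'latest in the index'."""
    return {name: (pinned or index[name]) for name, pinned in recipe.items()}


def image_digest(resolved):
    """Hash the resolved package set as a stand-in for an image digest."""
    canonical = ",".join(f"{name}={version}" for name, version in sorted(resolved.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()


# Same recipe text, built at two different times.
unpinned = {"curl": None, "libpq": None}
pinned = {"curl": "7.88.1", "libpq": "15.1"}

unpinned_jan = image_digest(resolve(unpinned, INDEX_JANUARY))
unpinned_jun = image_digest(resolve(unpinned, INDEX_JUNE))
pinned_jan = image_digest(resolve(pinned, INDEX_JANUARY))
pinned_jun = image_digest(resolve(pinned, INDEX_JUNE))

print(unpinned_jan == unpinned_jun)  # False: the recipe did not change, the result did
print(pinned_jan == pinned_jun)      # True: every input is fixed, so the result is too
```

Pinning versions narrows the gap, though on its own it does not guarantee bit-identical images.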
Luckily, (and be prepared for me to shill this a whole lot more in the OHDSI community), there is a super cool open source tool that looks to resolve this exact issue. Nix is a "package manager" specializing in "reproducible, declarative, and reliable" software builds and deployments. It is even able to replace "Dockerfiles" in a process to create docker images in a reproducible manner.
For example, here is a docker image for the OHDSI GIS project, specified in nix: https://github.com/OHDSI/GIS/blob/master/docker/loader/default.nix. It looks a lot like a Dockerfile, but is specified based on reproducible, built-from-source, functional, declarative package definitions.
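As a rough mental model of what "functional" buys you here, the following Python sketch derives a package's location from a hash of all of its declared inputs, so the same inputs always map to the same path and any change to a dependency shows up as a new path. This is a simplification and not Nix's actual hashing scheme; `store_path`, the `/toy-store` prefix, and the package names and hashes are all hypothetical.

```python
import hashlib


def store_path(name, version, source_hash, deps=()):
    """Toy input-addressed path: the hash covers every declared input."""
    payload = "|".join([name, version, source_hash, *sorted(deps)])
    digest = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return f"/toy-store/{digest}-{name}-{version}"


# Two versions of a dependency get two distinct paths.
openssl_a = store_path("openssl", "3.0.8", "src-hash-aaa")
openssl_b = store_path("openssl", "3.0.9", "src-hash-bbb")

# The same package definition, evaluated against each dependency.
loader_a = store_path("loader", "1.0", "src-hash-ccc", deps=[openssl_a])
loader_a_again = store_path("loader", "1.0", "src-hash-ccc", deps=[openssl_a])
loader_b = store_path("loader", "1.0", "src-hash-ccc", deps=[openssl_b])

print(loader_a == loader_a_again)  # True: identical inputs, identical path
print(loader_a == loader_b)        # False: a changed dependency changes the path
```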
I've been super impressed with the flexibility of Nix and am looking to leverage it in management of dependencies for the OHDSI GIS and Vocabulary projects. I think it could also be valuable for OHDSI study packages. I hope others also see the value of reproducible, side-effect-free builds, and that Nix becomes a useful dependency management tool in the OHDSI community! To anyone who is interested in experimenting with this, please feel free to reach out. I'm more than happy to teach what I know and collaborate on architecting a solution.