
Sharing client components across applications

All,
There’s been recent talk about taking code that has been copied across our client apps (specifically the D3 charting library that is used in Atlas Data Sources and AchillesWeb) and pulling it into a shared repository. This thread is to discuss the technical details of an approach.

Some context: we’re only talking about client-side (JavaScript) libraries. There have been two recent pull requests to factor out code here and here; the first involves the UI tier, so that’s the context I’d like to focus on. We already have an example of a shared component on the Java tier with SqlRender, but that approach involves Maven builds and Nexus repos, so it’s out of scope for this discussion.

For sharing client-side code, I think there are two main approaches to organizing the codebase: a single shared repository sub-divided into directories, one per component; or one repo per library.

One advantage of a dedicated repository is that we can have issues and feature requests focused on the particular library. In the case of a single repo, you’d have unrelated issues mixed in together, which may make it harder to organize bugfixes.

Also, with per-library repos, we can release each library independently under its own version. I’m not sure what releases would look like under a single-repo approach (if we would utilize releases at all), but I think it would act like an ‘OHDSI UI package’ where all updates to client UI libraries are released as one unit.

Another thing to consider is how the OHDSI git repo list would be impacted by this. With a single repository approach, it’s just a new OHDSI repo (“Shared UI Components”). Using a lib-per-repo approach, we have N repos for N libraries. If N gets large, it could be a headache.

The initial libraries that I can think of that would make good candidates for sharing are:

  • jnj_chart (which we’d rename)
  • facet tables
  • cohort editors (I have an implementation that optimizes this library into a single AMD .js file that can be imported into an application)

I think these three libraries would make a good first step toward formalizing JavaScript code sharing.

Technical Details

Regardless of how we organize the codebase (repo-per-lib or single repo), I propose that the top-level folder of a library contain src, test, and build folders. The src folder contains the source code of the library, test contains testing code, and build contains the optimized build of the library that is referenced by the client applications. I don’t propose that our apps reference a raw.git URL to fetch the code; rather, we should come up with a CDN-like approach where versions of the libraries can be pushed up and referenced by version number. Example:

// require.js configuration
"jnj_chart": "https://cdn.ohsi.org/jnj_chart/1.0.1/jnj_chart.min"

This example assumes that the file jnj_chart.min.js has been pushed up to a CDN called cdn.ohdsi.org, into a folder dedicated to the library (jnj_chart), under version 1.0.1. jQuery uses similar semantics, as do most libraries posted to CDNs such as Cloudflare.
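Putting it together, the full require.js configuration and a consuming module might look like the following (the CDN host and version are the hypothetical ones from above):

// require.js configuration pointing at the versioned CDN path
requirejs.config({
    paths: {
        "jnj_chart": "https://cdn.ohdsi.org/jnj_chart/1.0.1/jnj_chart.min"
    }
});

// an application module consuming the shared library
define(["jnj_chart"], function (jnj_chart) {
    // use the charting API here
});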

We are using Require.js as our AMD module loader, which comes with an optimizer that I’ve applied to the Cohort UI components. I’ve invoked the build directly from the command line and haven’t implemented a grunt/gulp build script for this, but there are many examples out there that we could leverage.
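For reference, a minimal r.js build profile looks something like this (the module name and paths are illustrative, not the actual Cohort UI settings):

// build.config.js - example r.js build profile
({
    baseUrl: "src",
    name: "cohortbuilder",
    out: "build/cohortbuilder.min.js",
    optimize: "uglify"
})

// invoked from the command line with:
// node r.js -o build.config.js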

I do not want to turn this conversation into a require.js vs. {insert other js library framework here} debate. Making these components shared should simply be a matter of extracting one of the copies of the library out into a shared location and then updating the client apps to reference the shared library. We should avoid API changes that would break existing code.

By formalizing the approach, everyone can have the same expectations about how the code should function, and we’ll be less prone to tripping over each other as we build out the shared code repository. So before making any specific changes, let’s come to a consensus on what the approach will be and how it will be applied.

My preference is for one repository per library.

I agree that the charting library is a good candidate for a shared library, in much the same way that SqlRender has been leveraged across multiple projects.

I also agree that the initial structure should contain src and test folders; I’m less certain about the need for the build folder. Some development work I’ve seen leverages npm and webpack, which handle the build/optimization separately from the library itself. That is something to consider, though it is beyond the scope of this discussion.

I also like the idea of having a CDN, and it is something we have discussed with @lee_evans. Once we have a shared component or two out there, we can set up the CDN so that they can be referenced.

I see reducing the number of repositories as the only plus mentioned for the multi-library-per-repo option. Apart from the problems already raised (unclear versioning and issue assignment), it would also complicate publishing the libraries as npm packages (and to my mind, npm/yarn is the de facto standard for dependency resolution in modern front-end apps).

Speaking of folder structure, dist or lib is a more common folder name than build in front-end development. But, agreeing with @Frank and repeating my previous point, npm and webpack usage should be the final target, not manual building and storing on a CDN.

I’ve done some digging on this, and there are many different conventions for where built files are stored:

  • jquery: has a dist folder that is referenced in the Gruntfile.js, but the contents are ignored except for a lint configuration.
  • bootstrap: has a build and a dist folder, nothing is ignored, and the built output is stored there.
  • d3 (latest version): no dist folder, but uses rollup to bundle the sub-modules, and the package.json file refers to a build/d3.js path.
  • react: it’s using rollup, a build folder is referenced, but it’s not clear to me (from reading the package.json) what gets installed when you do an npm install.
  • svgeezy: build output is put right into the root of git repo, and package.json refers to the built file directly off the root.
  • underscore: this lib appears to be one big file written as a module that is either stand-alone or node-aware (but not AMD-aware, oddly). The ‘build’ is just an npm script that minifies and places the resulting output in the root of the repo. package.json refers to this file as main.

So, as you can see, there’s a lot of variation, and it probably depends on the library. jnj_chart and facet table are just single-file modules that can be referenced in a package.json “main” entry, which will be downloaded when npm install is run.
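As a sketch, assuming a hypothetical @ohdsi npm scope and a single built file, the package.json could be as minimal as:

{
    "name": "@ohdsi/jnj_chart",
    "version": "1.0.1",
    "main": "jnj_chart.js"
}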

For more complicated modules like cohort builder, I’d probably go with grunt/gulp to invoke the require.js optimizer, because those components are AMD-formatted and use require.js configuration. Other libraries can use whatever build tool they deem appropriate (rollup, webpack, raw npm scripts, etc.), but from an npm package perspective, it seems you just need to point the package.json at the assets from the build.
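For example, a gulp task could drive the optimizer through the requirejs npm package; the module names and paths below are illustrative:

// gulpfile.js - sketch of a gulp task wrapping the require.js optimizer
var gulp = require('gulp');
var requirejs = require('requirejs');

gulp.task('build', function (done) {
    requirejs.optimize({
        baseUrl: 'src',
        name: 'cohortbuilder',
        out: 'build/cohortbuilder.min.js'
    }, function (buildResponse) {
        done();     // build succeeded
    }, done);       // pass optimizer errors to gulp
});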

Interestingly, I didn’t find webpack used in the repos of the popular JavaScript libraries I reviewed, but if you have specific examples of how it’s implemented, please provide them.

Unless someone can correct me, the difference between an npm-based install and a CDN reference is this: the npm-based approach requires the node tools to be installed and copies the library locally (and there is a little complication in consuming node-based, non-AMD-style modules in a browser environment), while the CDN is just a pointer to a JavaScript file that is downloaded at runtime (and then cached) and doesn’t require any special steps to get installed. You can have a CDN reference to a library independent of a package.json file, but both could be supported. I’ve been trying to use CDNs for most of our libraries just for the sake of simplicity.
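To illustrate that complication: a non-AMD module copied locally via npm can still be consumed in the browser with a require.js shim, something like this (paths are hypothetical):

// require.js shim for a non-AMD, npm-installed library
requirejs.config({
    paths: {
        "underscore": "node_modules/underscore/underscore-min"
    },
    shim: {
        "underscore": { exports: "_" }  // expose the library's global as a module
    }
});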

@Chris_Knoll thanks for laying this out. I’d also vote for one repository per library, and the charting library seems like a good candidate. I also like the idea of making this repo more npm “aware”, as I think this would be helpful for managing dependencies (such as d3) in a single place. We should also consider publishing these packages to the npm registry, similar to the way the R packages are now being pushed to CRAN.

Leaving these here for reference:

https://docs.npmjs.com/getting-started/publishing-npm-packages
https://docs.npmjs.com/misc/orgs
https://docs.npmjs.com/misc/scope

Thanks for this. One repository per library is fine with me.

@anthonysena I fully support making the repo npm-friendly and preparing to publish to npm registry.

Hi all, I came across this discussion and thought that, if it’s still relevant, it could be very useful to use Bit.
It’s built exactly for this job: it will enable you to rather easily share and sync components directly between applications without having to add 3rd-party libraries or maintain additional repos. You can give it a try.
