OHDSI Home | Forums | Wiki | Github

Managing Database Migrations

All,
I think the time has come for us to start considering frameworks to manage our database migrations. As the WebAPI begins to support more features, there is going to be service-specific tables that are going to be required to support the functionality, and as new releases come out and features change, having something in place to handle schema updates or data migrations (in the case of re-factoring causing tables to be split or merged) between releases will save us a lot of time.

I did a little reading on this, and it seems that two potential options is Liquibase and Flyway. This article talks about one person’s experience with Flyaway (after consdiering using Liquibase):
http://www.tikalk.com/managing-your-db-changes/

Links to these tools:
http://www.liquibase.org/
http://flywaydb.org/

Would anyone consider doing some feasibility assessment on using these tools?

One thing that I noticed in common with both of these tools is that they seem to lean towards doing a migration check during application startup. I’m not such a fan of this since I was thinking that it means that the account that normally runs the application would have to have special admin rights to execute the DDL, but looking further, I think it’s possible that we could configure WebAPI to have a special ‘migration’ database connection context (with a specific user/password that has rights required to perform migrations) that gets used during the startup-migration check, and then the normal ‘restricted’ account would be used for the normal operations of the app. I think I could live with this arrangement.

On the other hand, there does seem to be some maven plugin support with these frameworks allowing you to issue a migration action outside of running the application. So that could be an option as well.

I think I’d like to have this issue resolved before we have so many tables created to support so many service operations that it will be difficult to incorporate this feature later. Up to this point, we haven’t actually needed to add new tables to the cdm schema for anything, but this will change when we introduce spring batch, and anything else that wants to store data…

-Chris

Both liquibase and flyway would work. Liquibase seems like it would be more appropriate for OHDSI, at first glance, as the schema files (changesets) are vendor-agnostic. With flyway, we would need to add vendor-specific sql in separate dirs/files. We could follow the approach at the following link (flyway.locations=common-dir/oracle-dir) but I have to think there would be very minimal commonalities of the DDL between vendors. Are we ok with creating sql for all “supported” DBs and then let the user craft “unsupported” files?

I’m on board with a separate DataSource to execute execute DB migrations/inits if that’s the way we want to go.

Also, can we confirm that we are only talking about the ohdsi/results schema and that we do NOT plan to migrate/maintain/version CDM schema. I thought this was a given but am not sure given the last paragraph of your comment Chris. A point of clarification in that paragraph regarding the Spring Batch tables, is that the intent is to create the respective tables in the ohdsi/results schema.

Other options…

We could add OHDSI template sql in a file and use Spring to facilitate initializing a database. However, this doesn’t help with the versioning or upgrades. We might be able to extend Flyway in such a way that uses OHDSI template sql…

Ofbiz entityengine - comparable to liquibase in that a table is represented as an entity with vendor-specific sql generated. Varies slightly in that the entityengine.xml keeps track of a current schema and when run against a database, the changesets are automatically applied. Been a while since I used this (back in 2001). Ofbiz is now an Apache project. Atlassian tools use this to migrate their databases during upgrades.

I have a good test bed to pursue any of these options and add to the current Spring Batch (Job Server) pull request if desired…

Hi, Alex,
Yes, I’m only talking about tables that we create to support service operations in the results schema. I misspoke when i said cdm schema, sorry. I’m not in love with the syntax that appears in liquibase because the agnostic nature of it makes it a bit more cumbersome to construct, and I like the idea of custom java-implemented handlers that can run to do data migration (not sure if liquibase has that sorta thing). It would be a HUGE help if you did a little real-world test on that, and i think the spring batch tables would be a perfect thing to try it on. I know Hermes needs persistant storage for cocept sets, and hereclese will most likely need it as well. if we ever want to support user profiles in OHDSI apps, having tables that support that will be needed. I just forsee a long road of incremental additions to the results schema and figuring out how to manage it now … would be good.

-Chris

t