
Let's talk about Persistence

The exception chain was:
javax.servlet.ServletException:
Caused by: javax.persistence.PersistenceException:
Caused by: org.hibernate.exception.ConstraintViolationException: could not execute statement
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Cannot insert the value NULL into column ‘ID’, table ‘CDM_TRUVEN_CCAE_6k_V5_OHDSI.dbo.EXAMPLEAPP_WIDGET’; column does not allow nulls. INSERT fails.

Re: flyway logs. I just pushed a SQL Server-specific migration update (version 1.0.0.2.1) that sets the identity column (while preserving any existing data) and does nothing at all if the column is already an identity. I also dropped the HIBERNATE_SEQUENCE table because I don't think we need it (and the script only tries to drop it if it exists).
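For reference, a minimal sketch of the shape that migration takes (table and ID column names are from the exception above; the NAME column and everything else about the table is hypothetical, and a real script would also carry over constraints and foreign keys). SQL Server can't alter a column to IDENTITY in place, so the usual trick is to copy through a replacement table:

```sql
-- Only act if ID is not already an identity column (makes the script re-runnable).
IF COLUMNPROPERTY(OBJECT_ID('dbo.EXAMPLEAPP_WIDGET'), 'ID', 'IsIdentity') = 0
BEGIN
  CREATE TABLE dbo.EXAMPLEAPP_WIDGET_TMP (
    ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    NAME VARCHAR(255)  -- hypothetical non-key column
  );

  -- Preserve existing rows, including their original ID values.
  SET IDENTITY_INSERT dbo.EXAMPLEAPP_WIDGET_TMP ON;
  INSERT INTO dbo.EXAMPLEAPP_WIDGET_TMP (ID, NAME)
    SELECT ID, NAME FROM dbo.EXAMPLEAPP_WIDGET;
  SET IDENTITY_INSERT dbo.EXAMPLEAPP_WIDGET_TMP OFF;

  DROP TABLE dbo.EXAMPLEAPP_WIDGET;
  EXEC sp_rename 'dbo.EXAMPLEAPP_WIDGET_TMP', 'EXAMPLEAPP_WIDGET';
END

-- Drop the Hibernate sequence table only if it exists.
IF OBJECT_ID('dbo.HIBERNATE_SEQUENCE', 'U') IS NOT NULL
  DROP TABLE dbo.HIBERNATE_SEQUENCE;
```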

As far as subsequent flyway migrations go, I'm wondering if it makes sense to approach database updates the same way as feature branches (assuming your feature requires db changes):

  1. Make your feature branch.
  2. Make a pull request out of the new branch. Put in the comments what the expected database changes are, and ‘claim’ a schema version number (I’d vote for a numbering scheme of {major}.{minor}.{release}.{feature_id}). Include in this pull request a blank file with the associated schema revision number. (This is so that other devs working on other feature branches who also need schema changes can pick a migration file name that doesn’t conflict.)
  3. Start working on the feature. As more and more schema updates are made, the developer with the feature branch will need to remember to delete the schema_version row for this feature so that all the database changes stay up to date between pulls. Make the migration script idempotent so that you don’t have to roll back all the current feature branch schema changes just to ‘amend’ the new ones.
  4. Once the feature branch is complete and it gets merged in, the single feature DDL script will be folded into master, and other devs will only see 1 schema migration update for the new feature. I think this is better than having multiple files.
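To make the ‘claimed placeholder’ in step 2 concrete: it could simply be an empty Flyway migration file following Flyway’s V{version}__{description}.sql naming convention (the feature id and description below are made up for illustration):

```sql
-- File: V1.0.0.2.2__cohort_definition_tables.sql
-- Placeholder claimed by the cohort-definition feature branch.
-- The branch's DDL accumulates in this file; other feature branches
-- claim a different version number so file names never collide at merge time.
```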

So, I sort of demonstrated this with my most recent check-in (introduced a new migration, version 1.0.0.2.1, to address this update and made it re-runnable). There are 2 units of work in this 1 migration script: migrate the ID column to IDENTITY(1,1) and drop the HIBERNATE_SEQUENCE table. For feature branches like the cohort definition tables that Charity and I will work on, I’ll put that under 1.0.0.2.2 (or 1.0.0.3.1 if we think this is a different release…version numbering schemes are very stylistic…)

One last thing: I also enabled ‘out of order’ migrations, so that when feature branches do come in at any order, we will still be able to apply the migration scripts.
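For anyone following along, out-of-order support is a single Flyway setting; in a flyway.conf/properties file it would look something like this:

```
# Allow migrations whose version number is below the current schema
# version to be applied when they are discovered later.
flyway.outOfOrder=true
```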

Here’s the article I read about this technique:

-Chris

I wasn’t thinking you would add an alter script to fix the bad column and drop the table, since I thought you might be the only ones using SQL Server and could just “repair” your “failed” migration - but it brings up an interesting topic… and with that said, probably the right call and a good habit to get into.

Not sure I grasp yet your comment about the ‘blank file with the associated schema revision number’.

Interesting article. I’m not sure I understand the use case for re-running the migration scripts and making them idempotent. I thought the idea was to run once and only once (roll back manually if necessary and reapply). This seems like unnecessary overhead and less concise, but that’s my opinion…

I understand why you added the ‘out of order’ flag, but I thought I would refer readers to this page so that the consequences of running with ‘out of order’ support are understood.

Ok, I’ll read it. thanks!

The use case is if you have a complicated feature build that adds, say, 3 tables, alters 2 columns in some other tables, and adds a couple of indexes. Let’s say everything gets done perfectly except 1 stupid typo in the last index statement. Then, what you have to do in order to get your 3 tables, 2 columns and 1 index to ‘re-run’ from migration is drop 3 tables, un-alter 2 columns and drop the 1 index that did get created.

Instead, if each part of the migration said ‘if the table does not exist, create it’, ‘if the column isn’t X, alter it’, and ‘if the index doesn’t exist, create it’, then you wouldn’t have to unroll the other 6 changes just to fix that last one.
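In T-SQL, those guards can be written with existence checks like the following (all object names here are hypothetical; it’s just the pattern):

```sql
-- Create the table only if it doesn't already exist.
IF OBJECT_ID('dbo.COHORT_DEFINITION', 'U') IS NULL
  CREATE TABLE dbo.COHORT_DEFINITION (
    ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    NAME VARCHAR(255) NOT NULL
  );

-- Add the column only if it's missing.
IF COL_LENGTH('dbo.COHORT_DEFINITION', 'DESCRIPTION') IS NULL
  ALTER TABLE dbo.COHORT_DEFINITION ADD DESCRIPTION VARCHAR(1000);

-- Create the index only if it doesn't already exist.
IF NOT EXISTS (SELECT 1 FROM sys.indexes
               WHERE name = 'IX_COHORT_DEFINITION_NAME'
                 AND object_id = OBJECT_ID('dbo.COHORT_DEFINITION'))
  CREATE INDEX IX_COHORT_DEFINITION_NAME ON dbo.COHORT_DEFINITION (NAME);
```

With each statement guarded like this, re-running the whole script after fixing the one typo is a no-op for everything that already succeeded.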

I’m speaking from very recent experience trying to get the identity column to convert without losing data in the table in my last push: without making it idempotent, I had to open SQL Server Management Studio, alter the table back to its original form, then delete the schema_version record to have the migration apply again. Then there was something else I messed up (but it did get to the part about altering the table), so to ‘try again’ I had to unroll the part of the migration that did work so that I could re-apply it. It’s just an idea; if it doesn’t seem helpful, no one is forcing anyone to write the migration scripts that way. Just a suggestion…

Out of order isn’t without its pitfalls. I understand that if the order of migrations matters, you could wind up in a world of hurt. That’s just something that we’d all need to be aware of. Or we just disable out of order migrations and everyone works within a single ‘database migration branch’ so that everyone is synced to the same order.

Re: blank file: that was just a way to put in a ‘placeholder’ so that people know which feature branch maps to which migration file. If two branches accidentally used the same file, you’d have a merge, and I don’t know what that would do (maybe it would be fine and both migrations would get smooshed together, but I don’t trust the merging algorithm to understand that the files merged together should be considered ‘appended’ to each other and not intermingled).

-Chris

mmmmhmmmmm … this is me rubbing my temples. Why did you have to show me that article? :flushed:

So, they make a good point. Perhaps the answer is that we specify outOfOrder = FALSE as the default, but in hotfix deploys you specify outOfOrder = TRUE. I believe that if Flyway sees a new migration script whose version number is behind the ‘latest’ version in schema_version, it ignores it.

I’ll do some experiments with it and see what I come up with.

Entity Framework is going to become part of PCORnet architecture. Just an FYI.

t