I would like to share, and get others' ideas and opinions on, best practices for ongoing refreshes and versioning of an OMOP CDM database. A refresh affects any analysis that is currently in flight, as well as any statistics that have already been generated, so agreed practices should help minimize the impact on active studies.
I see that the following has emerged as a pattern:
- Full refresh (quarterly / bi-annual) - includes new raw data, potentially raw schema changes, ETL scripts and OMOP CDM schema upgrades, OMOP vocab. updates: create a new OMOP CDM instance.
- Light refresh (monthly) - includes only updated OMOP vocabs and data updates but without raw data schema changes: perform refresh in existing schema.
Also, the pattern is to keep at least two instances of the data: the latest version and the one before it. For example, Q2 2018 and Q1 2018.
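To make the pattern concrete, here is a rough Python sketch of how I read it. It is only an illustration, not an established OHDSI convention: the `RefreshRequest` fields, the function names, and schema names like `cdm_2018q2` are all hypothetical, and it assumes that schema names sort chronologically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefreshRequest:
    """Hypothetical description of what changed since the last refresh."""
    raw_schema_changed: bool = False
    etl_changed: bool = False
    cdm_version_changed: bool = False


def refresh_kind(req: RefreshRequest) -> str:
    """Return 'full' when anything structural changed, otherwise 'light'.

    'full'  -> build a new OMOP CDM instance (new schema).
    'light' -> refresh data and vocabularies in the existing schema.
    """
    structural = (
        req.raw_schema_changed or req.etl_changed or req.cdm_version_changed
    )
    return "full" if structural else "light"


def schemas_to_drop(schemas: list[str], keep: int = 2) -> list[str]:
    """Return the schemas older than the newest `keep` instances.

    Assumes names such as 'cdm_2018q1' sort in chronological order.
    """
    ordered = sorted(schemas)
    return ordered[:-keep] if keep > 0 else ordered


if __name__ == "__main__":
    print(refresh_kind(RefreshRequest(etl_changed=True)))
    print(schemas_to_drop(["cdm_2018q2", "cdm_2017q4", "cdm_2018q1"]))
```

With `keep=2`, only the latest instance and the one before it survive, which matches the "latest and minus one" rule above; studies pinned to an older instance would need `keep` raised before that instance is dropped.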
Please let me know how you tackle this problem in your organization.
@Christian_Reich, @mvanzandt - would we consider this to be a THEMIS topic?