ETL strategies for multiple OMOP CDM versions

lrasmussen · January 29, 2019, 10:26pm

I was wondering whether groups have any recommendations, best practices, or warnings for managing ETL from a canonical EDW to multiple OMOP CDM endpoints, where the version of the OMOP CDM can change (that is, assume some structural differences between versions).

Is it better/easier to manage multiple versions of the ETL that are tied to the version of the CDM (possible duplication of ETL code), or to create a canonical instance of the CDM that is able to feed the different versions of the CDM (possible increased complexity)?

Many thanks for any pointers.

DTorok · January 29, 2019, 10:41pm

I would opt for creating one CDM that is able to feed the different versions, but I would code the most complex model, on the assumption that it has all the information needed to feed the other CDMs. For example, if you need to support v5.1 and v5.3, I would code for v5.3, and I bet you could then put views on top of the v5.3 CDM to resemble v5.1.
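
To make the views-on-top idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration, not a tested migration: the table names (`condition_occurrence_v53`, `condition_occurrence_v51`), the heavily simplified column list, and the choice of `visit_detail_id` as the column that the older version lacks are all assumptions. Check the real CDM specifications for the actual differences between versions.

```python
import sqlite3

# Hypothetical, heavily simplified column list for one table in the newer CDM.
# The real table has many more columns; consult the CDM specification.
V53_COLUMNS = [
    "condition_occurrence_id",
    "person_id",
    "condition_concept_id",
    "visit_occurrence_id",
    "visit_detail_id",  # assumed here to be absent from the older version
]

# The older-version shape is the newer shape minus the assumed extra column.
V51_COLUMNS = [c for c in V53_COLUMNS if c != "visit_detail_id"]


def build_view_sql(view_name, base_table, columns):
    """Return a CREATE VIEW statement exposing only the given columns."""
    return f"CREATE VIEW {view_name} AS SELECT {', '.join(columns)} FROM {base_table}"


def demo():
    """Load one row into the newer-shaped table and read it back via the view."""
    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f"{c} INTEGER" for c in V53_COLUMNS)
    conn.execute(f"CREATE TABLE condition_occurrence_v53 ({column_defs})")
    # Made-up sample values, for illustration only.
    conn.execute(
        "INSERT INTO condition_occurrence_v53 VALUES (1, 100, 201826, 5000, 7000)"
    )
    conn.execute(
        build_view_sql("condition_occurrence_v51", "condition_occurrence_v53", V51_COLUMNS)
    )
    cursor = conn.execute("SELECT * FROM condition_occurrence_v51")
    names = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    conn.close()
    return names, rows
```

The ETL is written once against the newer shape, and the older shape is just a projection of it. Differences that are more than dropped columns (renamed fields, restructured tables) would need expressions or joins in the view definition, or a separate ETL step.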

MPhilofsky · January 30, 2019, 6:36pm

Colorado writes one ETL for the most complex version (full PHI) and then either adds views on top of it or runs ETLs from it to the other models.