I think the rule in vocab 4.5 and beyond, that a concept’s domain should determine which OMOP table the medical event is stored in, is a game changer for those writing ETL code. I plan to change my approach. I am thinking of creating a ‘mash-up’ table that combines attributes from the Condition, Procedure, Drug and Observation tables, and loading the source data into this ‘mash-up’ table. I would then write views on top of the ‘mash-up’ to express the various OMOP tables based on the concept domain.
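To make the idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the `event_mashup` table, the `_v` view names, the cut-down `concept` table with a `domain_id` column, and the concept IDs are all made up and far simpler than the real OMOP DDL. It only shows the pattern of loading every event once and letting the concept domain decide which view it shows up in.

```python
import sqlite3

# Hypothetical, heavily simplified schema. This is NOT the real OMOP DDL;
# table, view and column names are illustrative assumptions only.
SCHEMA = """
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    domain_id  TEXT NOT NULL            -- e.g. 'Condition', 'Procedure'
);

-- The 'mash-up' table: one row per source event, whatever its final domain.
CREATE TABLE event_mashup (
    event_id     INTEGER PRIMARY KEY,
    person_id    INTEGER NOT NULL,
    concept_id   INTEGER NOT NULL REFERENCES concept(concept_id),
    start_date   TEXT NOT NULL,
    source_table TEXT NOT NULL,         -- which source table the row came from
    source_value TEXT
);

-- Each view exposes only the rows whose concept belongs to its domain.
CREATE VIEW condition_occurrence_v AS
SELECT m.event_id     AS condition_occurrence_id,
       m.person_id,
       m.concept_id   AS condition_concept_id,
       m.start_date   AS condition_start_date,
       m.source_value AS condition_source_value
FROM event_mashup m
JOIN concept c ON c.concept_id = m.concept_id
WHERE c.domain_id = 'Condition';

CREATE VIEW procedure_occurrence_v AS
SELECT m.event_id     AS procedure_occurrence_id,
       m.person_id,
       m.concept_id   AS procedure_concept_id,
       m.start_date   AS procedure_date,
       m.source_value AS procedure_source_value
FROM event_mashup m
JOIN concept c ON c.concept_id = m.concept_id
WHERE c.domain_id = 'Procedure';
"""


def build_demo():
    """Create an in-memory database with two made-up concepts and events."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Made-up concept IDs: 1 is in the Condition domain, 2 in Procedure.
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?)",
        [(1, "Condition"), (2, "Procedure")],
    )
    # Both events come from the same source diagnosis table.
    conn.executemany(
        "INSERT INTO event_mashup VALUES (?, ?, ?, ?, ?, ?)",
        [
            (10, 100, 1, "2014-01-05", "diagnosis", "dx-a"),
            (11, 100, 2, "2014-01-05", "diagnosis", "dx-b"),
        ],
    )
    return conn


def rows(conn, view):
    """Return all rows of a view, ordered by its first column."""
    return conn.execute(f"SELECT * FROM {view} ORDER BY 1").fetchall()


if __name__ == "__main__":
    conn = build_demo()
    print(rows(conn, "condition_occurrence_v"))
    print(rows(conn, "procedure_occurrence_v"))
```

In this sketch the ETL only ever writes to `event_mashup`; the two source diagnosis rows end up in different views purely because of the domain of their mapped concept. Drug and Observation views would follow the same pattern, with whatever extra columns those tables need added to the mash-up.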
My question is: is it worth it for the community to spec out this mash-up table? As a developer, I typically loathe it when a specification tries to force an implementation.
However, I see an upside if there is an agreed-upon ‘mash-up’ table:
- The ETL definition will be simpler: instead of documenting how an event from the source diagnosis table can be mapped into all the possible tables based on the domain, it will only have to define the rules into the ‘mash-up’ table.
- Greater consistency in the ETL implementation will likely lead to more consistent results in the OMOP tables.
- We can develop one set of views to convert the mash-up into the OMOP representation.
- I’ve seen two rule-based systems that attempt to make it easier to develop ETLs into OMOP, and they will no longer function as they stand. Instead of defining a rule from the source diagnosis table to the Condition Occurrence table, they will now need to accept rules into multiple tables and then pick which rule to use based on the concept domain.
Downside:
- Another community documentation effort.
- Less direct understanding of source to final destination tables in OMOP CDM.
- Standard will be too late for those already developing ETLs based on vocab version 4.5 and beyond.
I doubt that I am the only one to think about implementing ETLs in this manner. I am looking for comments on this approach from anyone who has already tried it and, if it was successful, for them to share their ‘mash-up’ table definition, maybe as a de facto standard.