File-based OMOP implementation in a multinational PASS study

petjensen · October 3, 2018, 1:06pm

Hi
I am relatively new to the CDM domain in general and OMOP in particular. I am involved in setting up a Post-Authorization Safety Study (PASS) covering a handful of European countries. We will be looking at specific drug exposures and outcomes. The data sources in each country vary a lot in number, structure, coding and internal linkage. A common data model is definitely needed, and OMOP may be a great choice. Due to legal restrictions, data for some countries will have to be analyzed locally, while other countries' data can be gathered centrally. In either case, the analytics infrastructure in most places dictates that data storage, representation and analytics will have to be based on data files rather than a database implementation.

So while it is probably possible to do a file-based transformation into the OMOP CDM table structure, I am uncertain how much benefit I will get from it. I am especially thinking of the deep integration with the OMOP vocabulary. My alternative would be to use the OMOP table definitions and relations as a guideline for a simpler and more study-focused CDM.

Any experience or thoughts you can share are appreciated, including experience with estimating the amount of work involved. The transformation will be performed in collaboration with the local data experts, with some degree of local data management expertise.

Regards
Peter Bjødstrup Jensen
University of Southern Denmark

mkwong · October 3, 2018, 4:49pm

Hi Peter - many years ago I did a proof-of-concept exercise in transforming a complex database model to XML for a web-based application, to avoid the construction/decomposition of relational database tables as data is added/deleted/updated. It took about 3 days to write the XML schema (based on the database tables) and 1 week to write the database-to/from-XML conversion. So from my experience it isn't a huge undertaking to support both a database CDM and its file-based equivalent, even if the concepts used are expanded in the XML to include both the concept_id and the concept_name. Everything was written in Java using standard XML libraries, with JDBC on the database side.
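
To illustrate the idea of expanding each concept_id with its concept_name in a file export, here is a minimal sketch. It is in Python with delimited text files rather than the Java/XML setup described above, and it is not that implementation. The concept IDs, concept names and the `drug_concept_name` output column are made up for illustration; `drug_concept_name` is not part of the OMOP CDM.

```python
import csv
import io

# Hypothetical stand-ins for flat-file extracts; concept IDs and names are made up.
CONCEPT_TSV = (
    "concept_id\tconcept_name\n"
    "1001\tExample drug 10 mg oral tablet\n"
    "1002\tExample drug 20 mg oral tablet\n"
)
DRUG_EXPOSURE_TSV = (
    "drug_exposure_id\tperson_id\tdrug_concept_id\n"
    "1\t10\t1001\n"
    "2\t11\t1002\n"
    "3\t12\t9999\n"
)


def read_tsv(text):
    """Parse tab-delimited text into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def add_concept_names(rows, concepts, id_column, name_column):
    """Return copies of rows with the concept name added next to the concept ID.

    IDs that are missing from the concept table get an empty name.
    """
    lookup = {c["concept_id"]: c["concept_name"] for c in concepts}
    return [{**row, name_column: lookup.get(row[id_column], "")} for row in rows]


def to_csv(rows):
    """Serialize a non-empty list of dicts to comma-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


expanded = add_concept_names(
    read_tsv(DRUG_EXPOSURE_TSV),
    read_tsv(CONCEPT_TSV),
    id_column="drug_concept_id",
    name_column="drug_concept_name",  # hypothetical column, not in the OMOP CDM
)
flat_file_text = to_csv(expanded)
print(flat_file_text)
```

In a real setting the two inputs would be read from files on disk, and the same lookup could be applied to any `*_concept_id` column.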

petjensen · October 4, 2018, 12:25pm

I was not even planning to create an XML representation, but simply to map my source data into a file-based representation of each of the OMOP data tables. Those files would be, e.g., Stata files or similar. But would I still be able to leverage the power of the vocabulary? I guess the vocabulary could be represented in a similar way, allowing me to write data management scripts and analytical scripts on top of it all.
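
One way to picture using the vocabulary from flat files is the rough sketch below. It assumes the `concept_ancestor` table and a `drug_exposure` extract are both available as tab-delimited text, and it selects all exposures whose drug concept descends from a chosen ancestor concept. The concept IDs and rows are made up for illustration, and only the columns needed for the join are included. Python is used here only for the sketch; the same join could be written in Stata or any other tool that can merge two files.

```python
import csv
import io

# Hypothetical stand-ins for flat files; all concept IDs and rows are made up.
# As in OMOP, each concept also appears as its own ancestor.
CONCEPT_ANCESTOR_TSV = (
    "ancestor_concept_id\tdescendant_concept_id\n"
    "1000\t1000\n"
    "1000\t1001\n"
    "1000\t1002\n"
    "2000\t2000\n"
    "2000\t2001\n"
)
DRUG_EXPOSURE_TSV = (
    "drug_exposure_id\tperson_id\tdrug_concept_id\tdrug_exposure_start_date\n"
    "1\t10\t1001\t2017-03-01\n"
    "2\t11\t2001\t2017-04-15\n"
    "3\t12\t1002\t2017-05-20\n"
    "4\t10\t1000\t2017-06-02\n"
)


def read_tsv(text):
    """Parse tab-delimited text into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def descendants_of(ancestor_id, concept_ancestor_rows):
    """Collect every descendant concept ID recorded for the given ancestor."""
    return {
        row["descendant_concept_id"]
        for row in concept_ancestor_rows
        if row["ancestor_concept_id"] == str(ancestor_id)
    }


def exposures_for_ancestor(ancestor_id, drug_rows, concept_ancestor_rows):
    """Keep the exposure rows whose drug concept is a descendant of ancestor_id."""
    wanted = descendants_of(ancestor_id, concept_ancestor_rows)
    return [row for row in drug_rows if row["drug_concept_id"] in wanted]


ancestor_rows = read_tsv(CONCEPT_ANCESTOR_TSV)
drug_rows = read_tsv(DRUG_EXPOSURE_TSV)

exposed = exposures_for_ancestor(1000, drug_rows, ancestor_rows)
exposed_ids = [row["drug_exposure_id"] for row in exposed]
print(exposed_ids)  # ['1', '3', '4']
```

The point of the sketch is that the hierarchy lookup is just a filter plus a join on two files, so it does not strictly require a database, although the full vocabulary files are large and may need a tool that handles big files comfortably.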

chifundok · January 23, 2021, 11:38am

Hi @petjensen, I wanted to check what you ended up doing in your situation. We are facing a similar situation in our project, where we are integrating population health survey data into OMOP. We want to be able to dynamically give flat files to some of our users who are not comfortable with databases.