Exploring the Strengths and Limitations of OMOP CDM

Just another beer debate between OHDSI colleagues

Argument: Standardize all health care data to OMOP

  • OMOP CDM can standardize a wide range of longitudinal, person-level health event data, with a few exceptions: (1) data with intraday granularity; (2) data involving terminology not covered by the OMOP standard vocabularies, such as images and genomics; and (3) datasets heavy on measurement values, which pose challenges because of limited experience with the standard tools.
  • Standardizing data requires significant expertise; errors in standardization can lead to non-representative data, impacting analysis quality.

Counterargument: Standardization should be done to support generalizable analytic tools

  • Data standardization, such as with OMOP CDM, serves specific purposes, like leveraging standardized analysis tools or ensuring that the same analytic code is applied consistently across databases.
  • OMOP CDM can handle datetime granularity, but the absence of standardized tools for the optional datetime fields necessitates custom analytics. This is why CDM 6.0 has not been adopted.
  • OMOP CDM supports various data domains and has extensions for imaging; challenges arise from mapping when source vocabularies are not in OHDSI standardized vocabularies.
  • OMOP CDM effectively models measurement values, and tools like ATLAS can use these values; however, standardization issues persist across concepts, units, and coded values (value_as_concept_id).
  • OMOP CDM also models procedures well, though the lack of a procedure hierarchy often requires intensive curation and context-specific code lists.
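
To make the measurement point above concrete, here is a minimal sketch of one check you might run after an ETL: for each measurement concept, list the distinct units that show up. The rows and concept ids below are hypothetical and only mimic a few columns of the MEASUREMENT table; the helper names are mine, not part of any OHDSI tool.

```python
from collections import defaultdict

# Hypothetical rows shaped like a subset of the OMOP MEASUREMENT table.
# All concept ids are made up for illustration, not real vocabulary ids.
measurements = [
    {"measurement_concept_id": 1001, "unit_concept_id": 501, "value_as_number": 5.4},
    {"measurement_concept_id": 1001, "unit_concept_id": 502, "value_as_number": 97.0},
    {"measurement_concept_id": 1002, "unit_concept_id": 503, "value_as_number": 140.0},
    {"measurement_concept_id": 1002, "unit_concept_id": 503, "value_as_number": 138.0},
]


def units_per_concept(rows):
    """Map each measurement concept to the set of unit concepts recorded with it."""
    units = defaultdict(set)
    for row in rows:
        units[row["measurement_concept_id"]].add(row["unit_concept_id"])
    return dict(units)


def inconsistent_concepts(rows):
    """Return only the concepts recorded with more than one unit."""
    return {c: u for c, u in units_per_concept(rows).items() if len(u) > 1}


# Concept 1001 appears with two different units, so it is flagged.
flagged = inconsistent_concepts(measurements)
```

A concept that comes back from `inconsistent_concepts` is not necessarily wrong, but its values cannot be compared or thresholded in a generic tool until the units are harmonized.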

Considering the history of the CDM, its original design predates some major advancements in databases:

  • Increase in variety of data types
  • Horizontal over vertical scaling
  • Polyglot persistence to support more data domains and types of analysis, where relational, graph, and NoSQL databases are each used according to their strengths.
  • Improved privacy techniques like differential privacy.

It would be worthwhile to examine how these advancements would address some of these points in any redesign.

Also, this is an older paper, but I think the points are still valid today regarding the need for alignment to ontologies to improve analytic tools.

Ceusters, Werner, and Jonathan Blaisure. “A Realism-Based View on Counts in OMOP’s Common Data Model.” In pHealth, pp. 55-62. 2017.

@Gowtham_Rao:

What’s the discrepancy? Why not “Standardize all health care data to OMOP to support generalizable analytical tools”?

I like that idea. That means you are not done with your ETL once you have put the data in OMOP. You are only done when ATLAS (for example) fully understands the data (conformity of vocabulary and ETL conventions) and is able to use all the data intended to be analyzed (completeness).
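
One small piece of that "done" definition can be measured directly. By OHDSI convention, a concept_id of 0 means the source code could not be mapped to a standard concept, so the share of rows with 0 is a rough signal of how much data the standard tools will not be able to use. Below is a minimal sketch using an in-memory SQLite table; the rows and non-zero concept ids are hypothetical, `unmapped_fraction` is my own helper name, and this is not what ATLAS itself checks.

```python
import sqlite3

# In-memory stand-in for a few columns of CONDITION_OCCURRENCE.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition_occurrence ("
    "condition_occurrence_id INTEGER, person_id INTEGER, "
    "condition_concept_id INTEGER, condition_source_value TEXT)"
)
# Hypothetical rows; non-zero concept ids are placeholders, 0 means unmapped.
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (1, 1, 111, "E11.9"),
        (2, 1, 222, "I10"),
        (3, 2, 0, "LOCAL-42"),
        (4, 3, 111, "E11.9"),
    ],
)


def unmapped_fraction(conn, table, concept_column):
    """Fraction of rows in `table` whose `concept_column` is 0 (unmapped).

    Table and column names are interpolated directly, so pass only
    trusted identifiers.
    """
    total, unmapped = conn.execute(
        f"SELECT COUNT(*), "
        f"COALESCE(SUM(CASE WHEN {concept_column} = 0 THEN 1 ELSE 0 END), 0) "
        f"FROM {table}"
    ).fetchone()
    return unmapped / total if total else 0.0


share = unmapped_fraction(conn, "condition_occurrence", "condition_concept_id")
```

This only covers vocabulary mapping. Conformity to ETL conventions and completeness of the intended data would need their own checks.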