Is OMOP really a common data model?

Hello everyone,

I am new to OMOP and have been trying to map our data to it. One question keeps coming to mind: how is this a common data model when anyone can use any concept_id for the same value, and when we cannot keep to the same vocabulary in the mappings we are creating?

For example, if I am working on visit types, I can use the Visit vocabulary to link to the different visit types. But there might be some visit types that are not part of that vocabulary. How do we combine these?

We don't even have a consistent vocabulary within our own OMOP database. Some use SNOMED, others use LOINC.

Another example is gender, which I am trying to map to OMOP using SNOMED as the vocabulary. If I used ICD, I would get a different concept_id. So when we share data with other OMOP models, how would these values match?

Nancy, OMOP is as CDM as it gets. You need to look at the standard_concept column in the concept table. If the value is not ‘S’ then you shouldn’t use that concept in your OMOP dataset. Non-standard concepts are there to help you map into OMOP.
If a concept is non-standard, it should have a “Maps to” relationship in the concept_relationship table, and you should use the concept it maps to instead. There are only two standard gender concepts: Male and Female. Everything else should have a 0 (zero) in the concept_id field.
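
To make the lookup above concrete, here is a minimal sketch in Python using an in-memory SQLite database. It assumes heavily simplified concept and concept_relationship tables (the real ones have more columns), and the rows and concept_ids are made-up toy values, not real vocabulary content. The helper names build_toy_vocab and resolve_standard are hypothetical.

```python
import sqlite3


def build_toy_vocab():
    """Create simplified vocabulary tables filled with made-up toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE concept ("
        "concept_id INTEGER PRIMARY KEY, concept_name TEXT, standard_concept TEXT)"
    )
    conn.execute(
        "CREATE TABLE concept_relationship ("
        "concept_id_1 INTEGER, concept_id_2 INTEGER, relationship_id TEXT)"
    )
    # Toy ids only: 1001 is standard, 2001 maps to it, 2002 has no mapping.
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?, ?)",
        [
            (1001, "Toy standard concept", "S"),
            (2001, "Toy source code with a mapping", None),
            (2002, "Toy source code without a mapping", None),
        ],
    )
    conn.executemany(
        "INSERT INTO concept_relationship VALUES (?, ?, ?)",
        [(2001, 1001, "Maps to")],
    )
    return conn


def resolve_standard(conn, concept_id):
    """Return the standard concept_id(s) to store, or [0] if there is none."""
    row = conn.execute(
        "SELECT standard_concept FROM concept WHERE concept_id = ?",
        (concept_id,),
    ).fetchone()
    if row is not None and row[0] == "S":
        return [concept_id]  # already standard, use it as-is
    # Otherwise follow 'Maps to' and keep only targets that are standard.
    targets = conn.execute(
        """
        SELECT cr.concept_id_2
        FROM concept_relationship cr
        JOIN concept c ON c.concept_id = cr.concept_id_2
        WHERE cr.concept_id_1 = ?
          AND cr.relationship_id = 'Maps to'
          AND c.standard_concept = 'S'
        ORDER BY cr.concept_id_2
        """,
        (concept_id,),
    ).fetchall()
    return [t[0] for t in targets] or [0]


conn = build_toy_vocab()
print(resolve_standard(conn, 2001))
print(resolve_standard(conn, 2002))
```

Two sites that start from different source vocabularies should end up with the same standard concept_id after this step, as long as both follow the "Maps to" relationships.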

Thanks Guy. Our source system has 4 values for gender: M / F / not known / doesn't want to specify. If I use only the standard values, I would lose the other 2 options we have. Is there a way around that?

Ooh. That is a very old topic indeed. The “official” answer is:
<channelling @Christian_Reich >
Does it matter? Would any analysis of the data be impacted in any way if you could distinguish “unknown” from “did not specify” (or “refused to disclose” or a number of other such concepts)?
If the answer is ‘no’ then you can safely put a 0 in place of gender.
If the answer is ‘yes’ (maybe you are studying how people disclose information about their gender) then the place to put a more accurate description is in the observation table. It is an unwritten rule that you can be more lenient in that table as it’s the catch-all.
</ channelling @Christian_Reich >
I assume this problem hasn’t been solved yet because most of the time it doesn’t matter, and because there is neither a well-accepted standard vocabulary for all the different gender possibilities nor a clear differentiation between biological sex, sex at birth and gender.
You can see in this thread that you are neither the first nor the last to ask, and that the community probably intends to address this, but so far no better solution has been accepted.
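
For the four source values mentioned above, the "put a 0" approach could be sketched like this in Python. This is only an illustration: the source strings are assumed from the earlier post, map_gender is a hypothetical helper, and 8507 and 8532 are the standard Male and Female concept_ids in the Gender vocabulary. The original value is kept in gender_source_value so nothing is lost from the source.

```python
# Standard gender concepts; every other source value gets concept_id 0.
GENDER_CONCEPT_IDS = {"M": 8507, "F": 8532}


def map_gender(source_value):
    """Return the gender fields for one person row (hypothetical helper)."""
    key = (source_value or "").strip().upper()
    return {
        "gender_concept_id": GENDER_CONCEPT_IDS.get(key, 0),
        # Keep the raw source value so the distinction is still recoverable.
        "gender_source_value": source_value,
    }


for value in ["M", "F", "not known", "doesnt want to specify"]:
    print(map_gender(value))
```

If the distinction between the two non-standard values matters for a study, the more detailed description would go in the observation table as described above; this sketch does not cover that part.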

@Nan, welcome to the OHDSI community, and thank you for your question. When encountering challenges with mapping concepts to the OMOP Common Data Model (CDM), it’s understandable to question how the model achieves its goal of standardization. Your concerns about using different concept_ids for the same value and the lack of consistent vocabularies within the OMOP database are valid. However, the OMOP CDM is designed to address these issues through several key principles.

Firstly, regarding the use of different concept_ids for the same value, the flexibility in assigning concept_ids allows for accommodating various vocabularies and local terminologies. While it may seem counterintuitive to have multiple identifiers for the same concept, this approach allows researchers to leverage their preferred vocabularies and ensures that diverse data sources can be harmonized into a common format. The goal is not strict uniformity but rather interoperability, enabling data sharing and collaboration across different studies and institutions.

To address the challenge of combining concepts from different vocabularies, the OMOP CDM provides tools and guidelines for mapping concepts to standardized vocabularies whenever possible. While it’s true that some concepts may not have direct mappings to the same vocabulary, efforts are made to harmonize similar concepts across vocabularies. Additionally, mappings can be supplemented with additional metadata to clarify the relationship between concepts from different sources.

Regarding the inconsistency in vocabularies within the OMOP database, this reflects the diversity of real-world healthcare data. Different institutions may use different coding systems based on their preferences, available resources, or specific clinical workflows. The OMOP CDM acknowledges this reality and provides mechanisms for integrating diverse data sources while maintaining semantic interoperability.

When sharing data with other OMOP models, it’s essential to document the mappings and transformations applied to ensure transparency and reproducibility. Data partners can exchange metadata about the vocabularies used, mappings performed, and any transformations applied to facilitate cross-study comparisons and analyses. Collaboration within the OHDSI community and adherence to best practices for data standardization further support data harmonization efforts across different OMOP instances.

In conclusion, while achieving complete standardization across all aspects of healthcare data is challenging, the OMOP CDM provides a framework for harmonizing diverse data sources and promoting interoperability. By embracing flexibility, transparency, and collaboration, the OMOP community strives to overcome the complexities of real-world data and enable robust observational health research.

Thank you so much for your reply. It all makes sense.

Thank you for such a detailed reply. I am glad that the OMOP community is so strong and active, even for newbies and silly questions. Being new to the OMOP world, understanding its possibilities and capabilities is important for getting the most out of OMOP.