Incorporating omics into the OMOP CDM

Hi all, I’m working with a group that wants to extend the OMOP CDM to include omics (primarily genetics, transcriptomics, and proteomics) by developing custom concepts linked to the measurement table. Does anyone know if others are working on the same?

Hi @Mette_Peters!
The way OMOP handles this is a closed-world model with pre-coordinated entities, called OMOP Genomic. Currently, it covers somatic mutations and is relevant only to cancer, with very limited proteomics and no transcriptomics. The advantage is that you don’t need to connect anything: just record the concept, and the OMOP model and methods do the rest for you.

Oncology WG (@agolozar) is in charge of it, and they have a dedicated meeting.
Other teams are also working towards an extension (A-STAR in Singapore).

Welcome to the community, Mette!

If only you had posted this yesterday, you could have joined us at the -omics subgroup meeting earlier today :smile:. At the Oncology WG, we have a dedicated subgroup focused on bringing -omics data into OMOP. We have developed conventions and standard vocabularies (i.e., OMOP Genomics) to represent variation at the gene, transcript, and protein levels. Most of the work so far has focused on somatic variation, but the underlying principles are the same, and there are ongoing discussions with folks interested in non-cancer use cases on how to align and organize things without reinventing the wheel. We are also in the early stages of kicking off an update to the OMOP Genomics vocabulary, so this is very timely.

Please join us. We would love to hear your wish list and plan together. We meet on the 2nd and 4th Tuesdays of the month from 9 to 10 am EST.


Thank you for the quick and comprehensive feedback! I should have provided a few more details. Our primary need is adding assay metadata (assay type, platform, etc.), linked both to the specimen the assay was run on and to the resulting assay data files, which may include variant calls, gene expression counts, or raw sequencing data.
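To make the shape of that requirement concrete, here is a minimal sketch of the links involved: an assay record that points at a specimen and carries its data files. The `Assay` and `AssayFile` classes, their field names, and the example URIs are all hypothetical and are not part of the OMOP CDM; the only assumption about the CDM is that `specimen_id` would reference the existing SPECIMEN table.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AssayFile:
    file_uri: str       # hypothetical: where the data file lives
    content_type: str   # e.g. "variant_calls", "expression_counts", "raw_reads"


@dataclass
class Assay:
    assay_id: int       # hypothetical key for the assay record
    specimen_id: int    # assumed to reference SPECIMEN.specimen_id in the CDM
    assay_type: str     # e.g. "RNA-seq", "WGS"
    platform: str
    files: List[AssayFile] = field(default_factory=list)


def files_for_specimen(assays: List[Assay], specimen_id: int,
                       content_type: Optional[str] = None) -> List[AssayFile]:
    """Collect the data files of every assay run on one specimen,
    optionally keeping only one content type."""
    return [
        f
        for a in assays if a.specimen_id == specimen_id
        for f in a.files
        if content_type is None or f.content_type == content_type
    ]


# Made-up example data, for illustration only.
assays = [
    Assay(1, 101, "RNA-seq", "example-sequencer",
          [AssayFile("s3://example-bucket/101.fastq.gz", "raw_reads"),
           AssayFile("s3://example-bucket/101.counts.tsv", "expression_counts")]),
    Assay(2, 102, "WGS", "example-sequencer",
          [AssayFile("s3://example-bucket/102.vcf.gz", "variant_calls")]),
]

for f in files_for_specimen(assays, 101, "expression_counts"):
    print(f.file_uri)
```

Whether this ends up as an extension table, a set of custom concepts, or something else is exactly the kind of design question the thread is asking about; the sketch only pins down which links need to exist.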

Thanks for the additional detail, @Mette_Peters!

Please come to the WG meetings so we can discuss this in more detail. We are looking specifically at gaps in OMOP Genomics that block or limit analytic use cases, with the goal of prioritizing them for the next release. We would love to hear more about your use cases and discuss how they could be addressed in the upcoming update.

Hi @Mette_Peters. It’s great to know that you are working on this. We have been working on it for the past year. For transcriptomics and proteomics, we introduced custom vocabularies that include HGNC, NCBI RefSeq, ENSEMBL GENCODE, and UniProt. We would be happy to discuss how we can move forward on this together.
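For anyone new to custom vocabularies, here is a rough sketch of how rows for the CONCEPT table might be generated from an external gene list. It is not the approach described in this thread, just an illustration. It relies on the OMOP convention that locally defined concepts use `concept_id` values above 2,000,000,000; the `vocabulary_id` "LOCAL_HGNC", the `concept_class_id` "Gene", the choice of the Measurement domain, and the helper name `make_custom_concepts` are all assumptions made for the example.

```python
from datetime import date

# OMOP convention: concept_ids above 2 billion are reserved for local concepts.
CUSTOM_CONCEPT_ID_START = 2_000_000_001


def make_custom_concepts(codes, vocabulary_id, concept_class_id,
                         domain_id="Measurement",
                         start_id=CUSTOM_CONCEPT_ID_START):
    """Build CONCEPT-table rows (as dicts) from (source_code, name) pairs."""
    rows = []
    for offset, (code, name) in enumerate(codes):
        rows.append({
            "concept_id": start_id + offset,
            "concept_name": name,
            "domain_id": domain_id,
            "vocabulary_id": vocabulary_id,
            "concept_class_id": concept_class_id,
            "standard_concept": None,          # custom concepts are non-standard
            "concept_code": code,
            "valid_start_date": date(1970, 1, 1),
            "valid_end_date": date(2099, 12, 31),
            "invalid_reason": None,
        })
    return rows


# Two illustrative HGNC entries; the vocabulary and class ids are hypothetical.
genes = [("HGNC:1100", "BRCA1"), ("HGNC:11998", "TP53")]
concepts = make_custom_concepts(genes, "LOCAL_HGNC", "Gene")

for row in concepts:
    print(row["concept_id"], row["concept_code"], row["concept_name"])
```

A MEASUREMENT record could then carry one of these ids in `measurement_concept_id` or `measurement_source_concept_id`, depending on the conventions the group settles on.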