Where to include genomic data in our CDM


We have genomic information on patients that we want to include in our OMOP instance, but it is not clear to us where all of this information is supposed to be stored. There is an OMOP Genomic extension, but we are not sure whether it has moved forward.

We start from an annotated VCF with information about a patient's variants, and we want to store at least the following information in our OMOP CDM:

Information about the variant itself:
- variant id
- chromosome, position, reference, alternative
- gene or region where it is found
- genotype

Annotation of the variant:
- clinical information (clinvar, OMIM)
- pathogenicity
- HGVS nomenclature
- effect on protein
- population frequency information in different human populations
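
For concreteness, the variant-level fields above could be pulled from one VCF data line roughly as follows. This is only a sketch: `parse_vcf_line` is a hypothetical helper, it relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, samples), and it reads only the first sample column. The annotation keys inside INFO (gene, ClinVar, frequencies, and so on) depend on the annotator, so they are returned as-is rather than interpreted.

```python
def parse_vcf_line(line):
    """Split one tab-separated VCF data line into the basic variant fields."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt, _qual, _filter, info = fields[:8]

    # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info_dict[key] = value if sep else True

    # Genotype (GT) comes from the FORMAT column plus the first sample column.
    genotype = None
    if len(fields) >= 10:
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        genotype = dict(zip(format_keys, sample_values)).get("GT")

    return {
        "variant_id": variant_id,
        "chromosome": chrom,
        "position": int(pos),
        "reference": ref,
        "alternatives": alt.split(","),  # ALT may list several alleles
        "info": info_dict,               # annotator-specific keys, untouched
        "genotype": genotype,
    }
```

The `info` dictionary is where the annotation items (clinical information, pathogenicity, HGVS, protein effect, population frequencies) would be found, under whatever keys the annotation tool writes.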

Is there any specific consensus on how this data is supposed to be stored in OMOP?

Thanks in advance!


Hi Carlos,

Please see this post:

You can likely learn about the current state of work on including genomic/molecular diagnostic data in OMOP by joining or following this workgroup.

Where are you currently storing these data?


@Carlos: Please come to the Oncology Genomic WG meetings (2nd and 4th Tuesdays 9-10 am EST). We can give you an overview of the conventions for representing genomic variants in OMOP.

In oncology, we treat genomic variants similarly to other tumor attributes and record them as MEASUREMENT records. All clinically relevant genomic variant concepts are defined through the OMOP Genomic vocabulary, which is constructed by consolidating genomic variants from public somatic cancer variant knowledgebases.
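
As a rough illustration of what recording a variant as a MEASUREMENT row could look like, here is a minimal sketch. The column names are those of the OMOP CDM MEASUREMENT table; everything else is an assumption for illustration: `variant_to_measurement` is a hypothetical helper, the concept IDs are left at 0 (the OMOP placeholder for "no matching concept") until they are looked up in the vocabulary, and putting the HGVS string in `measurement_source_value` is one plausible choice, not a statement of the workgroup's convention.

```python
from datetime import date


def variant_to_measurement(person_id, measurement_date, hgvs, variant_concept_id=0):
    """Build a dict shaped like one MEASUREMENT row for a genomic variant."""
    return {
        "person_id": person_id,
        # Standard concept for the variant; 0 means not yet mapped.
        "measurement_concept_id": variant_concept_id,
        "measurement_date": measurement_date.isoformat(),
        # Provenance concept; placeholder here, to be set per your ETL rules.
        "measurement_type_concept_id": 0,
        # Source representation of the variant (assumed: the HGVS string).
        "measurement_source_value": hgvs,
    }


# Illustrative values only.
row = variant_to_measurement(1, date(2024, 1, 15), "NM_004333.4:c.1799T>A")
```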

We store the following information from a VCF panel or an HGVS-formatted variant list:

  • gene name
  • genome build
  • chromosome
  • position in the chromosome
  • definition of the mutation (insertion, deletion, substitution, duplication, deletion-insertion)
  • the affected bases or amino acids
  • HGVS nomenclature
  • synonyms (HGVS nomenclature from other genome builds)
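
To make the "definition of the mutation" item concrete, here is a minimal sketch of deriving the mutation type from the REF and ALT alleles of a VCF record. It is an illustration under assumptions, not the tooling mentioned in this thread: `classify_mutation` is a hypothetical helper, it assumes VCF-style alleles where insertions and deletions share a leading anchor base, and it cannot detect duplications, because that requires comparing the inserted bases with the flanking reference sequence.

```python
def classify_mutation(ref: str, alt: str) -> str:
    """Classify a variant by comparing its REF and ALT allele strings."""
    if ref == alt:
        raise ValueError("REF and ALT are identical; not a variant")
    if len(ref) == 1 and len(alt) == 1:
        return "substitution"
    # VCF-style insertion: ALT is REF's anchor base plus the inserted bases.
    if len(ref) == 1 and alt.startswith(ref):
        return "insertion"
    # VCF-style deletion: REF is ALT's anchor base plus the deleted bases.
    if len(alt) == 1 and ref.startswith(alt):
        return "deletion"
    # Everything else replaces one stretch of bases with another.
    return "deletion-insertion"


print(classify_mutation("A", "G"))    # substitution
print(classify_mutation("A", "ATG"))  # insertion
print(classify_mutation("ATG", "A"))  # deletion
print(classify_mutation("AT", "GC"))  # deletion-insertion
```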

We are now working on an update to the OMOP Genomic vocabulary and on a tool (KOIOS) that converts VCF or HGVS data into OMOP concepts. Come and bring your use case.