Concept ids for proteomics, lipidomics and other x-omics measurements

Hello OHDSI,

At SIB we actively use OMOP for multiple projects.

The current set of official concept ids served via Athena is well suited to recording what is part of current clinical routine, e.g. biochemical measurements. We need to go further and record large volumes of results from proteomics, lipidomics and other x-omics measurements. Does Athena have an official vocabulary for those?

If not, we can of course generate new concept ids. We were thinking of creating one id for each combination of measurement type [e.g. proteomics] and particular target [e.g. albumin, UniProt id P02768]. If we go this way, should we constrain the new concept ids to a certain range of integers to make sure they never duplicate official Athena ids?

Thanks for your answers.

@Alexdavv Could this be a new vocabulary, like OMOP Extension, with concepts created in the 2-billion range?


Thanks for reaching out.

We actually already started on that path, with variants (genomic, transcript and protein). Happy to add the ones you mention, but do you have an analytic use case in which they would be used?

Concept IDs: We could reserve a block for you, if that makes sense. Alternatively, you submit the concepts and we use the standard machine to assign the IDs. Makes no difference, the IDs have no meaning.

Hello @Christian_Reich,

Glad to know you have recognized and addressed this already.

It is certainly of value nowadays to analyse the entirety of the data generated using various techniques, including the ones you mentioned. Metabolomics [and lipidomics in particular] is a key approach for seeing how metabolic pathways are affected by treatment, notably in diabetes. These assays are basically just an extension of the traditional analysis of blood and urine samples routinely done in clinics. In our current project, SOPHIA, we have datasets with results of multiple assays done on the same subjects, and we are going to put them into OMOP and then analyse them.

For concept ids, the only concern is obviously that they remain unique and do not conflict with existing or forthcoming ones. The final goal is that measured values [in various datasets of different origin] with the same concept id are all directly comparable. The problem, of course, is units, which might be totally different for different measurement techniques.

In practice, I would be glad to know whether you could reserve and publicly declare a block of integers that OHDSI will never use, so that developers can use them for their own concepts. It is the same idea as the well-known ranges of IP addresses reserved for private use in local subnets.


@Christian_Reich Is there an OHDSI source that maps genomic variants to phenotype data?

Oh, those exist. They are above 2,000,000,000. So-called “2 Billionaires”. All yours.
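
For anyone who wants to try this, here is a minimal sketch of how local ids could be handed out in that range, one per combination of measurement type and target as proposed earlier in the thread. The names `LocalConceptRegistry`, `concept_id_for` and `is_local_concept_id` are hypothetical, and the upper bound assumes `concept_id` is stored as a 32-bit signed integer, which may not hold for every database setup.

```python
from dataclasses import dataclass, field

# Local concept ids live above 2,000,000,000, the range mentioned in this thread.
LOCAL_CONCEPT_ID_FLOOR = 2_000_000_000
# Assumption: concept_id is a 32-bit signed integer column, so this is the ceiling.
INT32_MAX = 2_147_483_647


@dataclass
class LocalConceptRegistry:
    """Hypothetical helper: one stable local id per (measurement type, target)."""

    next_id: int = LOCAL_CONCEPT_ID_FLOOR + 1
    ids: dict = field(default_factory=dict)

    def concept_id_for(self, measurement_type: str, target: str) -> int:
        # Normalise the key so "Proteomics " and "proteomics" share one id.
        key = (measurement_type.strip().lower(), target.strip().upper())
        if key not in self.ids:
            if self.next_id > INT32_MAX:
                raise OverflowError("local concept id range exhausted")
            self.ids[key] = self.next_id
            self.next_id += 1
        return self.ids[key]


def is_local_concept_id(concept_id: int) -> bool:
    """True if the id falls in the local range above 2,000,000,000."""
    return concept_id > LOCAL_CONCEPT_ID_FLOOR


registry = LocalConceptRegistry()
albumin_proteomics = registry.concept_id_for("proteomics", "P02768")
print(albumin_proteomics, is_local_concept_id(albumin_proteomics))
```

In a real project the mapping would need to be persisted (for example as rows in a local copy of the concept table) so that ids stay stable across ETL runs; the in-memory dictionary here only illustrates the numbering.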

You mean, like in OMIM? There isn’t. Those would be the result of OHDSI research, not the input. We are trying to stay away from becoming a knowledgebase.

Having said that, we do have some of these, for example indication for drug, or anatomical focus of procedure, when somebody gives them to us (e.g. NDF-RT or SNOMED). But we have neither the resources nor the need to go in that direction for our use cases.

@Christian_Reich No, I meant whether there’s an OMOP vocabulary for genomics data.
I see some work has already been done: