
How to give concept_ids to a new vocabulary?


We are several institutions in Finland working together to ETL our databases into the OMOP-CDM.
We are creating (and mapping) local vocabularies that we would eventually like to add to Athena.

However, it may take some time to reach consensus on these vocabularies, and we would like to start using them now to compare our databases not only at the concept_id level but also at the source_concept_id level.

Is it, for example, possible to reserve an unused range of concept_ids? That way, when our vocabs are ready, we can add them to Athena without having to change them.

Or how do you handle these situations? I'm sure we are not the first ones with this question.

We haven’t done that, @Javier. Right now, a vocabulary is either out (no concept_ids) or in (concept_ids forever). If it is in we will maintain it, with your help of course. We could just take your vocabs without releasing to Athena, but we have no way of releasing concept_ids.

You have the 2 billionaires: concept_ids above 2 billion (2B) are available to you, but only locally. Why don’t you do your comparisons in that space? When you are ready, you get new concept_ids. The source_values (codes) are still there.
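
As a rough illustration of working in that local space, here is a minimal Python sketch that hands out concept_ids above 2 billion to a list of local source codes. The function name, the starting id and the example codes are made up for illustration, and the upper bound assumes concept_id is stored as a 32-bit signed integer, which may not hold for every database.

```python
LOCAL_ID_START = 2_000_000_001  # assumption: first id used in the local-only range
INT32_MAX = 2_147_483_647       # assumption: concept_id is a 32-bit signed integer column


def assign_local_concept_ids(source_codes, start=LOCAL_ID_START):
    """Give each distinct source code a concept_id above 2 billion."""
    # Sorted so the same set of codes always gets the same ids.
    unique_codes = sorted(set(source_codes))
    mapping = {code: start + offset for offset, code in enumerate(unique_codes)}
    if mapping and max(mapping.values()) > INT32_MAX:
        raise ValueError("ran out of local concept_ids in the assumed integer range")
    return mapping


# Hypothetical local codes; duplicates collapse to one concept_id.
local_ids = assign_local_concept_ids(["FIN-B17", "FIN-A01", "FIN-B17"])
print(local_ids)
```

Under this scheme, adding codes later would shift the ids, so in practice you would probably store the mapping once it has been assigned rather than recompute it.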


Thanks @Christian_Reich,
Using concept_ids over 2B seems the best option.

I actually noticed that this is how others are doing it (@MPhilofsky and @ufuoma): Implementing & adopting a customized OMOP Common Data Model – OHDSI

Maybe we don't even need to add them to Athena and can maintain them locally.


Hello @Javier and welcome to the journey!

My poster from last year, Mapping Custom Source Codes to Standard Concepts: A Comparison of Two Approaches, might also be of interest to you. It discusses the 2 different methods for mapping local, custom codes to standard concept_ids.

I’d also like to invite you to the EHR working group meeting, where we discuss all things EHR, OHDSI, OMOP, healthcare and data conversions. To join any OHDSI working group, please follow these steps:

  1. Sign up for a MS Teams account here
  2. Pick working groups to join here
  3. This web page shows all the upcoming working group meetings

Thanks so much,
We have been doing the second approach the whole time (plus adding info to the CONCEPT_CLASS and VOCABULARY tables).
Nice to see that others have the same approach and that we were not so crazy about this :smile:.
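
For readers wondering what registering a local vocabulary might look like, here is a minimal sketch using Python's built-in sqlite3 module. The table definitions are heavily simplified versions of the OMOP VOCABULARY, CONCEPT_CLASS and CONCEPT tables (several real columns are left out), and the vocabulary name, concept class, code and ids are all hypothetical. It is not meant to reproduce either approach from the poster exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the OMOP tables; the real ones have more columns.
conn.executescript("""
CREATE TABLE vocabulary (
    vocabulary_id TEXT PRIMARY KEY,
    vocabulary_name TEXT NOT NULL,
    vocabulary_concept_id INTEGER NOT NULL
);
CREATE TABLE concept_class (
    concept_class_id TEXT PRIMARY KEY,
    concept_class_name TEXT NOT NULL,
    concept_class_concept_id INTEGER NOT NULL
);
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT NOT NULL,
    domain_id TEXT NOT NULL,
    vocabulary_id TEXT NOT NULL,
    concept_class_id TEXT NOT NULL,
    standard_concept TEXT,
    concept_code TEXT NOT NULL
);
""")

# Hypothetical local vocabulary and concept class, with ids in the 2B+ range.
# A fuller setup would also add CONCEPT rows for these two ids; omitted here.
conn.execute("INSERT INTO vocabulary VALUES (?, ?, ?)",
             ("FIN_LOCAL", "Finnish local codes (hypothetical)", 2_000_000_001))
conn.execute("INSERT INTO concept_class VALUES (?, ?, ?)",
             ("FIN Local Code", "Finnish local code (hypothetical)", 2_000_000_002))

# One local, non-standard concept, assigned to the domain that fits it.
conn.execute("INSERT INTO concept VALUES (?, ?, ?, ?, ?, ?, ?)",
             (2_000_000_003, "Example local diagnosis", "Condition",
              "FIN_LOCAL", "FIN Local Code", None, "A01"))

row = conn.execute("""
    SELECT c.concept_id, c.domain_id, v.vocabulary_name
    FROM concept AS c
    JOIN vocabulary AS v ON v.vocabulary_id = c.vocabulary_id
    WHERE c.concept_code = ?
""", ("A01",)).fetchone()
print(row)
```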


You’re welcome!


Thank you @Javier for bringing this topic to discussion and thank you, everyone, for the useful information in every reply.

We are currently working on a retrospective oncology research project with several hospitals in Europe. In an initial phase, all the information filled in eCRFs will be saved within each hospital using OMOP-CDM.

We were able to find standard concepts for almost every entry, but we have a need to add custom concepts for some isolated missing terms. Our questions are:

1 - Regarding the 2 billionaires approach, can we assign the concept a domain and store it in the table we find pertinent, or should we store it somewhere else, like the NOTE table?

2 - As most of the concepts we need to include are already defined according to international guidelines (for example PI-RADS categories for prostate cancer), we were wondering if 2 billionaires would still be a good way for us to proceed, or which process we should follow in order to propose new terms to oncology-related vocabularies. If this is a possibility, I understand it might take some time. Could we still do both things in parallel?

Thank you all in advance!


If these are part of a maintained or semi-maintained vocabulary, you can put a request in with the OHDSI vocabulary team on GitHub. They will let you know if adding them to the OHDSI vocabularies and Athena is feasible.

If the vocabulary team doesn’t add them, then my 2020 Symposium poster explains the two different approaches to custom mapping. It also answers your question #1.

More details in my poster from the 2021 Symposium about customizing your CDM.

I think this is possible if you know the vocabulary_id when writing the ETL SQL. Or you could update the code when the vocabulary is added to Athena. The second option is probably the best. The OMOP CDM will need updates and other maintenance activities anyway.
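
One way to picture the update step once a vocabulary lands in Athena: because the source codes stay the same, local 2B+ concept_ids can be matched to the newly released ones on (vocabulary_id, concept_code). The sketch below is only an illustration under that assumption. The function name is hypothetical, the "released" id is a made-up placeholder, and it assumes the released vocabulary keeps the same vocabulary_id as the local one.

```python
def build_id_remap(local_concepts, released_concepts):
    """Return {local_concept_id: released_concept_id}, matched on vocabulary and code."""
    released_by_code = {
        (c["vocabulary_id"], c["concept_code"]): c["concept_id"]
        for c in released_concepts
    }
    remap = {}
    for c in local_concepts:
        key = (c["vocabulary_id"], c["concept_code"])
        if key in released_by_code:
            remap[c["concept_id"]] = released_by_code[key]
    return remap


# Hypothetical local concepts in the 2B+ range.
local_concepts = [
    {"concept_id": 2_000_000_001, "vocabulary_id": "FIN_LOCAL", "concept_code": "A01"},
    {"concept_id": 2_000_000_002, "vocabulary_id": "FIN_LOCAL", "concept_code": "B17"},
]
# Made-up placeholder for what a released vocabulary might contain.
released_concepts = [
    {"concept_id": 111, "vocabulary_id": "FIN_LOCAL", "concept_code": "A01"},
]

remap = build_id_remap(local_concepts, released_concepts)

# Codes not yet released keep their local id.
source_concept_ids = [2_000_000_001, 2_000_000_002]
updated = [remap.get(cid, cid) for cid in source_concept_ids]
print(updated)
```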

Thank you so much for your answer, this was really helpful.
