
Rare diseases / Orphan diseases

Dear all,
Does anyone know of any sites that have rare disease (orphan disease) data represented in OMOP? This can be either clinical or registry data; it may or may not include information on newborn genetic screening or on “controls” (i.e., non-cases), and it may focus on specific types of rare diseases or be broadly defined. Happy to hear what already exists and what interest there is in this topic.


Hi, I wanted to revive this topic/question. Are there any OMOP sites that have rare diseases mapped to standard concepts? We’ve been looking at Orphanet and see that they have mappings to ICD10 and ICD11.


Hello, @PriyaDesai

ICD10 codes are currently mapped to standard concepts and available in Athena, so you can take any condition you want. ICD11 is not incorporated yet.

Unfortunately, I can’t contribute to the original question on the topic.

Hi @PriyaDesai and @ronaldcornet ,
There is actually an interesting initiative at Dresden University, Germany, led by @Michele_Zoch .
We recently had a discussion with the good people from Orphadata and are considering how we can make the Orphadata codes available as a source vocabulary, together with their mappings to SNOMED.
If we can collect more use cases internationally, we can better justify the need for this as a new Athena vocabulary. Please share yours with us, and let us know whether having the Orphanet codes in Athena would help you!
Thanks ~ mik

Hello @PriyaDesai and @ronaldcornet,

As Mik already mentioned, we are in contact with Orphadata. They provide mappings of OrphaCodes to ICD-10-WHO, ICD-11, and SNOMED. A new version will be released in October 2022.
The mapping to SNOMED is a good basis for bringing OrphaCodes into ATHENA and for enabling mappings to standard concepts.
Our team in Dresden is currently evaluating the mapping provided by Orphadata against other mappings (for example, the one provided by the German Federal Institute for Drugs and Medical Devices). We will then report the results back to the OHDSI community, and especially to the Vocabulary Work Group.

Regarding the use cases: we need the mapping in order to represent rare diseases in OMOP in general. Unfortunately, rare diseases cannot be adequately coded with ICD-10 (see Aymé 2015). In addition, mapping to international terminologies is important for enlarging the small cohorts of rare cases through international collaboration. Specifically, we want to map, for example, pediatric patients with Kawasaki disease and Multisystem Inflammatory Syndrome in Children and Adults (PIMS), which emerged during the COVID-19 pandemic.

Thank you @Michele_Zoch !
One question that I have: an Orphadata vocabulary would be a classic “source” vocabulary that helps in your ETL process. That obviously requires that your data source actually provides those codes in relevant numbers. Can you tell us about that?
Thanks - mik

So far, there are individual efforts, especially by centers for rare diseases, which already document with OrphaCodes. In Germany, coding with OrphaCodes (and the German Alpha-ID-SE) is expected to become mandatory in 2023. The codes will then also be available in the source data.

In addition, a minimum data set for rare diseases is used in the European Union through projects such as EUCERD Joint Action, EPIRARE, and RD-Connect (see Set of Common Data Elements). This data set also focuses on coding with OrphaCodes, which means that many centers for rare diseases and other clinics already have these codes in their source data.

Update

We are preparing the OrphaCodes according to the community contribution guidelines.
For this purpose, we are using Usagi to map approx. 2,300 OrphaCodes that are not yet covered by any mapping.
We hope to complete this in August.


Hello @Michele_Zoch – any luck with those orpha codes?

Thank you for asking.

We are in the process of comparing the different mappings to close possible gaps.
We are mapping the remaining Orpha codes (about 2,350) manually with the help of Usagi; about 1,470 codes are left. We hope to finish this by the end of October and then release the mapping for discussion.


While waiting for an official release that incorporates Orphanet into Athena, my current workaround is to obtain MRCONSO.RRF from the Unified Medical Language System (UMLS), filter the “SAB” column (equivalent to the vocabulary_id column in the CONCEPT table) on “ORPHANET”, and convert the result to the same format as the CONCEPT table, as follows:

| OMOP CONCEPT table column | What to use for newly added Orphanet terms |
| --- | --- |
| concept_id | “ORPHANET_” + CODE from MRCONSO |
| concept_name | STR from MRCONSO |
| domain_id | Condition |
| vocabulary_id | ORPHANET |
| concept_class_id | Disorder |
| standard_concept | no value |
| concept_code | CODE from MRCONSO |
| valid_start_date | ‘2024-04-01’ |
| valid_end_date | ‘2099-12-31’ |
| invalid_reason | no value |
| CUI | CUI from MRCONSO |

I then concatenated this with my existing CONCEPT table.

For my purposes, it was useful to keep the Concept Unique Identifier (CUI) from MRCONSO. For the start date, I used the date on which I downloaded MRCONSO.RRF from the UMLS. The documentation here provides guidance on what each of these fields should contain.
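A minimal Python sketch of the conversion described above. It assumes the standard pipe-delimited MRCONSO.RRF layout without a header row (CUI in field 0, SAB in field 11, CODE in field 13, STR in field 14). The function name `orphanet_concepts`, the placeholder sample rows, and the choice to keep only the first row seen per code are illustrative assumptions, not part of the original workaround.

```python
# Field positions in MRCONSO.RRF (pipe-delimited, no header row).
CUI, SAB, CODE, STR = 0, 11, 13, 14


def orphanet_concepts(lines, start_date="2024-04-01"):
    """Yield CONCEPT-style dicts for the ORPHANET rows of MRCONSO.RRF."""
    seen = set()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= STR or fields[SAB] != "ORPHANET":
            continue
        code = fields[CODE]
        if code in seen:
            # Assumption: a code may appear on several rows (e.g. synonyms);
            # this sketch keeps only the first row it sees for each code.
            continue
        seen.add(code)
        yield {
            "concept_id": "ORPHANET_" + code,
            "concept_name": fields[STR],
            "domain_id": "Condition",
            "vocabulary_id": "ORPHANET",
            "concept_class_id": "Disorder",
            "standard_concept": "",   # no value
            "concept_code": code,
            "valid_start_date": start_date,
            "valid_end_date": "2099-12-31",
            "invalid_reason": "",     # no value
            "CUI": fields[CUI],
        }


# Placeholder rows for illustration only; these are not real UMLS content.
SAMPLE_LINES = [
    "C0000000|ENG|P|L0000000|PF|S0000000|Y|A0000000||||ORPHANET|PT|000000|Placeholder disorder|0|N||",
    "C0000000|ENG|S|L0000001|PF|S0000001|Y|A0000001||||ORPHANET|SY|000000|Placeholder synonym|0|N||",
    "C0000000|ENG|P|L0000002|PF|S0000002|Y|A0000002||||OTHERSAB|PT|X1|Other source term|0|N||",
]

rows = list(orphanet_concepts(SAMPLE_LINES))
```

With a real download, the same function can be fed an open file handle, for example `open("MRCONSO.RRF", encoding="utf-8")`, and the resulting rows appended to the existing concept table.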

I’ll add Orphanet as a vocabulary to our vocabulary table as follows:

| OMOP VOCABULARY table column | What to use for Orphanet |
| --- | --- |
| vocabulary_id | ORPHANET |
| vocabulary_name | Orphanet |
| vocabulary_reference | UMLS Knowledge Sources: File Downloads |
| vocabulary_version | UMLS version 2023AB released November 6, 2023 |
| vocabulary_concept_id | ORPHANET |

I hope that’s helpful to someone else! If you see improvements to this, I’d love to hear about it.

Hello @Tim_McLerran and welcome to OHDSI!

Thank you for sharing this! I have a few comments on the attributes:

The concept_id for all custom concepts (that is, concepts created by users) should be greater than 2 billion to ensure that it does not collide with any official concept_id produced by the OHDSI vocabulary team. You will want to update the concept_id and the vocabulary_concept_id to be integers, since this is a requirement of the CDM. And if you haven’t already, create a record in the Concept table for your vocabulary_concept_id, since there is a PK-FK relationship between all *_concept_id fields and the Concept table.
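The ID advice above could be sketched in Python roughly as follows. The helper names `assign_custom_ids` and `vocabulary_concept_row` are hypothetical. The upper bound assumes that concept_id is stored as a 32-bit signed integer, and the metadata values in the vocabulary row (domain “Metadata”, class “Vocabulary”) are an assumption modelled on how existing vocabularies appear to be registered, so check them against your own Concept table.

```python
CUSTOM_ID_FLOOR = 2_000_000_000  # custom concept_ids must be greater than this
MAX_INT32 = 2_147_483_647        # assumption: concept_id is a 32-bit integer


def assign_custom_ids(codes, first_id=CUSTOM_ID_FLOOR + 1):
    """Return (vocabulary_concept_id, {source code: integer concept_id})."""
    if first_id <= CUSTOM_ID_FLOOR:
        raise ValueError("custom concept_ids must be greater than 2 billion")
    vocabulary_concept_id = first_id  # reserved for the vocabulary itself
    code_to_id = {}
    next_id = first_id + 1
    for code in sorted(set(codes)):   # sorted so repeated runs give the same ids
        code_to_id[code] = next_id
        next_id += 1
    if next_id - 1 > MAX_INT32:
        raise OverflowError("ran out of 32-bit integer concept_ids")
    return vocabulary_concept_id, code_to_id


def vocabulary_concept_row(vocabulary_concept_id, start_date="1970-01-01"):
    """CONCEPT record that the VOCABULARY row's vocabulary_concept_id points to."""
    return {
        "concept_id": vocabulary_concept_id,
        "concept_name": "Orphanet",
        "domain_id": "Metadata",           # assumption, see lead-in
        "vocabulary_id": "Vocabulary",     # assumption, see lead-in
        "concept_class_id": "Vocabulary",  # assumption, see lead-in
        "standard_concept": "",
        "concept_code": "ORPHANET",        # hypothetical value
        "valid_start_date": start_date,
        "valid_end_date": "2099-12-31",
        "invalid_reason": "",
    }


# "b" and "a" stand in for source codes; duplicates are collapsed.
vocab_id, ids = assign_custom_ids(["b", "a", "a"])
```

Because the IDs are derived from the sorted code list, adding new codes later can shift existing IDs; persisting the code-to-ID mapping once it has been assigned avoids that.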

Instead of using the date you downloaded the information as the start date, you might want to use a default date of “01-Jan-1970”, since the codes were in use before you downloaded the code set. Or, if your code set includes the date when each code was first put into use, use that date.
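As a small illustration of the date rule, here is a hedged Python sketch; `valid_start_date` and its `first_used` argument are hypothetical names, and the ISO output format is just one convenient choice for loading.

```python
from datetime import date

DEFAULT_START = date(1970, 1, 1)  # fallback when the first-use date is unknown


def valid_start_date(first_used=None):
    """Return the first-use date if the code set provides one, else 1970-01-01."""
    chosen = first_used if first_used is not None else DEFAULT_START
    return chosen.isoformat()
```

For example, `valid_start_date()` falls back to the 1970 default, while `valid_start_date(date(2015, 3, 1))` keeps the supplied date.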
