Race and Ethnicity in the OMOP CDM

@clairblacketer you may want to consider adding the NCIT resource to the conversation above.

Hi @linikujp you're right, we would need to add these. I'll add that to my list.

Hi, has there been any solution to this issue? In addition to the ontologies shared by @Christian_Reich, other consortia are working on this very same issue. For example, the ClinGen consortium has a dedicated working group (Ancestry and Diversity - ClinGen | Clinical Genome Resource). I am wondering whether OHDSI should start a group devoted only to this issue. I think we should consider not only race and ethnicity, but also ancestry, religion, nationality, etc.

Is there a reason that race and ethnicity are in the person table and not in the observation table? I ask because:

  1. These are usually patient-reported, and patients may be asked on admission even if they already have a record.
  2. People are increasingly multi-racial and multi-ethnic, which is obvious to anyone who has ever seen a 23andMe result.
  3. We see that a number of patients change their ethnicity between visits. We are still exploring why, but the working hypothesis is that multi-ethnic patients report whichever ethnicity they see as most beneficial for that visit. If they are admitted through emergency, minorities might not want to be slowed down by biased triage and declare themselves as white, but when admitted directly to a ward they may choose to declare their minority group to get more financial aid.
  4. Similar to the previous point, but as insurance fraud, when the patient has no legitimate claim to the ethnicity they are stating.

If these were observations I would feel more comfortable using non-standard concepts.

Would it violate any CDM rules to leave the ethnicity and race as 0 and create observations? Instead of 0 we could also put the first observed ethnicity and race in the person table and still add the observations.

We have client data where patients have more than one race. What we did was put one of the values into the Person table and the remaining race values into the Observation table. This does not violate OMOP rules, as there are standard observation concepts for race and ethnicity. I am listing them below:

  • 4013886 Race
  • 44803968 Ethnicity

These concepts are loaded into the observation_concept_id column, and the actual race / ethnicity values are put into the value_as_concept_id field of the Observation table.
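
A minimal Python sketch of this split may help, under stated assumptions. The function name `split_race_values`, the `ObservationRow` class, and the "earliest reported value goes to Person" rule are illustrative choices, not an agreed OHDSI convention. Only a subset of the Observation columns is shown, and the value concept IDs in the usage lines (8527, 8515) are placeholders to verify against your own vocabulary tables.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

# Observation concepts quoted in the post above.
RACE_OBSERVATION_CONCEPT_ID = 4013886
ETHNICITY_OBSERVATION_CONCEPT_ID = 44803968


@dataclass(frozen=True)
class ObservationRow:
    """Hypothetical, partial Observation record (not the full CDM row)."""
    person_id: int
    observation_concept_id: int
    value_as_concept_id: int
    observation_date: date


def split_race_values(
    person_id: int,
    reported: List[Tuple[date, int]],
    observation_concept_id: int = RACE_OBSERVATION_CONCEPT_ID,
) -> Tuple[int, List[ObservationRow]]:
    """Return (value for the Person table, rows for the Observation table).

    `reported` holds (date reported, value concept ID) pairs.
    Assumption: the earliest reported value goes to Person and the
    remaining values become Observation rows. With no reports, the
    Person value is 0.
    """
    if not reported:
        return 0, []
    # Sort by date only; values reported on the same date keep input order.
    ordered = sorted(reported, key=lambda pair: pair[0])
    person_value = ordered[0][1]
    observations = [
        ObservationRow(person_id, observation_concept_id, concept_id, reported_on)
        for reported_on, concept_id in ordered[1:]
    ]
    return person_value, observations


# Usage: two race values reported on different dates for person 1.
person_race, race_observations = split_race_values(
    1, [(date(2020, 5, 1), 8515), (date(2019, 1, 1), 8527)]
)
```

Passing `ETHNICITY_OBSERVATION_CONCEPT_ID` as the third argument would apply the same split to ethnicity values. A site that prefers to keep every reported value in Observation, including the one copied to Person, could build rows from `ordered` instead of `ordered[1:]`.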

@QI_omop: We need to ratify this, by the way. We need to tell @clairblacketer.

The whole thing violates the OMOP idea: Creating a standard that everybody adheres to, so that data no longer need the context in which they were generated to be correctly analyzed. This standard should be objectively defined. Race and ethnicity however are not objectively definable. Worse, they are self-assigned, which means they are not even defined within a data asset.

So, I think the solution is what you guys laid out: Standard simple self-assignment in PERSON, and details (3/7th of an Inuit) into the OBSERVATION table. We do a similar distinction between crude and detailed with Location.

Has this been established? Where can we go to learn the current standard?

We have just begun to OMOP our data at my institution and the race ethnicity problem immediately confounded us. We have mapped our data (we are in the US) to the CDC standard codes. Are the CDC codes represented in the OMOP standard vocabulary? If this is not the forum for these questions please direct me to the appropriate place.

Thank you!

There isn't an established convention for adding more than one race or ethnicity to the CDM.

I will mark this thread as a Themis issue and create an issue in the Themis GitHub. Stay tuned!

Edit to add link to GitHub issue.

We have a request at our institution to bring in the patients' ethnic background variable (from the Clarity zc_ethnic_bkgrnd table). The concepts mostly line up with those described here, but, as described in the posts above, we too have some patients with >2 ethnicities. Should we just pick one race for the person table, and add the others to the observation table?
Is there still no resolution on the Themis GitHub issue?

Also going back to this conversation as there is an interest in adding races (such as in this post). Apologies if these questions have been resolved, but:

  1. Do we have a consensus on how to store multiple races?
  2. Do we have a consensus on whether races within the context of a country should be represented as different entities or as one entity? Such as White - US, White - British, White - Australian and so on?

I'm specifically interested in the latter question (Vocabularies perspective). I couldn't find a convention, but I may be missing something.

There is an open Themis issue here, but it lacks a sponsor. As an open source community, we rely on community members to contribute and drive the evolution of the standards, methods and research. The Themis process is found on the Themis GitHub home page here. Who would like to sponsor this topic?