I have been exploring the concept_relationship table to find mappings of the NAACCR Race1 values to standard OMOP Race concepts. For example:
OMOP Race: 8527 - White to NAACCR Meas Value: 35911604 - White or 40198884 - White.
I assume I simply need a pointer to the relevant documentation on this topic.
Thank you @DTorok for confirming that there is no mapping. I looked at the reference documentation but could not find any rationale as to why this particular mapping doesn’t exist. I am trying to find out whether I am “asking the wrong question”, i.e. whether there is a good reason why that mapping does not exist, which I would like to understand. At first glance at least, the mapping looks rather straightforward. Given the tremendous work that went into the Oncology extension, including substantial additions to the vocabulary, I wonder whether the omission was deliberate.
The reason is, in part, that the NAACCR code for ‘White’ only refers to race when used in combination with the NAACCR code Race 2. That is, White is the answer to Race 2. But it looks like the code for White also has a bunch of ‘Value to Schema’ relationships with other NAACCR codes. So you cannot map the NAACCR code for White to OHDSI race White without understanding how the NAACCR code ‘White’ is being used.
Thank you. The field I am sourcing the CDM data from is NAACCR Race 1 (‘White’). I assume the correct value is Athena. If true, would the following be correct for a PERSON table mapping?
We are going to publish the NAACCR mapping soon. But because of the variable-value problem in the source (sometimes even variable-variable-value), the simple “Maps to” relationship does not work. We need the Wide Mapping table. Stay tuned, please.
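To make the variable-value problem concrete, here is a minimal Python sketch, not the forthcoming official mapping. It assumes a hypothetical lookup keyed on the (variable, value) pair rather than on the value concept alone. The concept IDs are the ones quoted in this thread; the names `PAIR_TO_STANDARD` and `map_pair` and the placeholder variable ID `-1` are illustrative assumptions.

```python
from typing import Optional

# NAACCR variable concept for Race 1, as cited in this thread.
NAACCR_RACE_1 = 35917103

# Hypothetical lookup: (variable_concept_id, value_concept_id) -> standard concept_id.
# The value concept alone is not a sufficient key, because the same value
# can be attached to several NAACCR variables.
PAIR_TO_STANDARD = {
    (NAACCR_RACE_1, 719376): 38003579,  # Chinese -> Chinese
    (NAACCR_RACE_1, 719372): 38003598,  # Black -> Black
}


def map_pair(variable_id: int, value_id: int) -> Optional[int]:
    """Return the standard concept_id for a variable-value pair, or None."""
    return PAIR_TO_STANDARD.get((variable_id, value_id))


if __name__ == "__main__":
    print(map_pair(NAACCR_RACE_1, 719376))  # pair is known
    print(map_pair(-1, 719376))             # same value, unknown variable
```

The second call returns `None`: the same value concept under a different variable must not silently inherit the race mapping, which is what a plain value-level “Maps to” would do.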
Yes Race concept_id 8527 is the correct mapping for NAACCR Race (White).
This query will give you the possible NAACCR race values.
SELECT c1.concept_id, c1.concept_code, c1.concept_name
     , cr.relationship_id
     , c2.concept_id, c2.concept_code, c2.concept_name
FROM concept_relationship cr
JOIN concept c1 ON c1.concept_id = cr.concept_id_1
JOIN concept c2 ON c2.concept_id = cr.concept_id_2
WHERE cr.concept_id_1 = 35917100
  AND cr.relationship_id = 'Has Answer';
And this is the query to get Standard OHDSI concepts for race.
SELECT *
FROM concept
WHERE domain_id = 'Race' AND standard_concept = 'S' AND invalid_reason IS NULL;
You will have to determine what is the best mapping from NAACCR to OHDSI.
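As a starting point for that manual review, the two queries can be combined to propose candidates by exact name match. This is a sketch under stated assumptions, not a definitive method. It runs against a toy in-memory SQLite database seeded with a few concepts quoted in this thread, uses 35917103 (Race 1) as the variable, and the name and domain given to the variable row are placeholders.

```python
import sqlite3

# Toy in-memory stand-in for the OMOP vocabulary tables (only the columns used here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    domain_id TEXT,
    standard_concept TEXT,
    invalid_reason TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,
    concept_id_2 INTEGER,
    relationship_id TEXT
);
""")

conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
    [
        (35917103, "Race 1", "Measurement", None, None),  # placeholder name/domain
        (719376, "Chinese", "Meas Value", None, None),
        (35940912, "Kampuchean (Cambodian)", "Meas Value", None, None),
        (38003579, "Chinese", "Race", "S", None),
        (38003578, "Cambodian", "Race", "S", None),
    ],
)
conn.executemany(
    "INSERT INTO concept_relationship VALUES (?, ?, ?)",
    [(35917103, 719376, "Has Answer"), (35917103, 35940912, "Has Answer")],
)

# For each answer of the variable, look for a standard Race concept with the same name.
CANDIDATE_SQL = """
SELECT ans.concept_id, ans.concept_name, std.concept_id, std.concept_name
FROM concept_relationship cr
JOIN concept ans ON ans.concept_id = cr.concept_id_2
LEFT JOIN concept std
       ON std.domain_id = 'Race'
      AND std.standard_concept = 'S'
      AND std.invalid_reason IS NULL
      AND lower(std.concept_name) = lower(ans.concept_name)
WHERE cr.concept_id_1 = ?
  AND cr.relationship_id = 'Has Answer'
ORDER BY ans.concept_id
"""

candidates = conn.execute(CANDIDATE_SQL, (35917103,)).fetchall()
for row in candidates:
    print(row)
```

In this toy data, ‘Chinese’ finds its standard counterpart automatically, while ‘Kampuchean (Cambodian)’ comes back with no match and still needs a human decision.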
@Christian_Reich Thank you. I figured a mapping would be forthcoming. I will work on a temporary solution and update once the official solution is available.
@DTorok Thank you. I arrived at a very similar query, except that I constrain on 35917103 since I am sourcing from NAACCR Race 1. According to Athena, 35917100 is for Race 2.
Happy New Year to both of you! I truly appreciate your support over the Holidays. Amazing!
Here is a preliminary mapping for NAACCR Race Vocabulary I came up with. Maybe it can contribute to the effort @Christian_Reich hinted at (despite the flaws it invariably has). This is for ‘Has Answer’ NAACCR Race 1 (Athena).
| NAACCR_Concept_ID | NAACCR_Concept_Name | Standard_Concept_ID | Standard_Concept_Name |
|---|---|---|---|
| 719371 | American Indian, Aleutian, or Eskimo (includes all indigenous populations of the Western hemisphere) | 8657 | American Indian or Alaska Native |
| 35943624 | Asian Indian | 38003574 | Asian Indian |
| 35941279 | Asian Indian or Pakistani, NOS (code 09 prior to Version 12) | 38003574 | Asian Indian |
| 719372 | Black | 38003598 | Black |
| 35940300 | Chamorro/Chamoru | 38003611 | Micronesian |
| 719376 | Chinese | 38003579 | Chinese |
| 35911258 | Fiji Islander | 38003610 | Polynesian |
| 719375 | Filipino | 38003581 | Filipino |
| 35940822 | Guamanian, NOS | 38003611 | Micronesian |
| 719374 | Hawaiian | 8557 | Native Hawaiian or Other Pacific Islander |
| 35941363 | Hmong | 38003582 | Hmong |
| 719370 | Japanese | 38003584 | Japanese |
| 35940912 | Kampuchean (Cambodian) | 38003578 | Cambodian |
| 719373 | Korean | 38003585 | Korean |
| 35940871 | Laotian | 38003586 | Laotian |
| 35940633 | Melanesian, NOS | 38003612 | Melanesian |
| 35941269 | Micronesian, NOS | 38003611 | Micronesian |
| 35941199 | New Guinean | 38003612 | Melanesian |
| 35940933 | Other | 0 | Other |
| 35941596 | Other Asian, including Asian, NOS and Oriental, NOS | | |
Except: We have to make the decision at the community level, but I am leaning toward abolishing all the ethnic “races” and keeping only the five standard ones. Ethnicities are impossible to figure out, as in this case. Hierarchical relationships to races are even more ridiculous. As a consequence, studies involving ethnicities will have to use source concepts, like NAACCR’s.
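One way to read this proposal, as a rough Python sketch rather than a community decision: `race_concept_id` carries only a top-level race, while the original NAACCR concept is preserved in `race_source_concept_id` so that studies can still reach the finer-grained source value. The lookup below contains only the two top-level pairs that appear in the table posted in this thread and is deliberately incomplete; the helper names and the use of the concept name as `race_source_value` are assumptions.

```python
from dataclasses import dataclass

# NAACCR Race 1 answer -> top-level standard race.
# Only pairs posted in this thread; every other roll-up still has to be decided.
NAACCR_TO_TOP_LEVEL_RACE = {
    719371: 8657,  # American Indian, Aleutian, or Eskimo -> American Indian or Alaska Native
    719374: 8557,  # Hawaiian -> Native Hawaiian or Other Pacific Islander
}


@dataclass
class PersonRace:
    """Race-related columns of a PERSON row."""
    race_concept_id: int
    race_source_concept_id: int
    race_source_value: str


def map_race(naaccr_concept_id: int, source_value: str) -> PersonRace:
    """Roll up to a top-level race if known (else 0) and always keep the source concept."""
    return PersonRace(
        race_concept_id=NAACCR_TO_TOP_LEVEL_RACE.get(naaccr_concept_id, 0),
        race_source_concept_id=naaccr_concept_id,
        race_source_value=source_value,
    )


if __name__ == "__main__":
    print(map_race(719374, "Hawaiian"))
    print(map_race(719376, "Chinese"))  # no roll-up defined in this sketch, so 0
```

Falling back to 0 for values without an agreed roll-up keeps the gap visible instead of guessing, and the source concept remains available either way.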
@Christian_Reich I agree and I am just trying to deal with the data reality at my disposal.
Unnecessary complexity should be avoided, and granularity should be limited to a level where values can be objectively determined with reasonable means. Ethnicity might just be an inferior proxy for cultural, socio-economic, genetic, etc. determinants of health. It will take a community with much richer experience than mine to come up with something useful.
I have done the same for ethnicity and gender. Maybe less flawed than the mapping for race.
I am posting it here in case it is useful for somebody else. Of course also to get help in case the mapping is bad. Thank you in advance.
Ethnicity:
| naaccr_concept_id | naaccr_concept_name | standard_concept_id | standard_concept_name |
|---|---|---|---|
| 35914498 | Unknown whether Spanish or not | 0 | Unknown |
| 35940262 | Dominican Republic | 38003563 | Hispanic or Latino |
| 35940448 | Mexican (includes Chicano) | 38003563 | Hispanic or Latino |
| 35940906 | Spanish, NOS | 38003563 | Hispanic or Latino |
| 35941390 | Other specified Spanish/Hispanic origin (includes European; excludes Dominican Republic) | 38003563 | Hispanic or Latino |
| 35941643 | Non-Spanish; non-Hispanic | 38003564 | Not Hispanic or Latino |
| 35941814 | Puerto Rican | 38003563 | Hispanic or Latino |
| 35941961 | Cuban | 38003563 | Hispanic or Latino |
| 35941988 | South or Central American (except Brazil) | 38003563 | Hispanic or Latino |
| 35943612 | Spanish surname only (Code 7 is ordinarily for central registry use only, hospital registrars may use code 7 if using a list of Hispanic surnames provided by their central registry; otherwise, code 9 ‘unknown whether Spanish or not’ should be used.) The | | |
Hang on a second. This is trickier. The ethnicities we have imported from the OMB are part of the race_concept_id. And second, ethnicity as in ethnicity_concept_id right now follows the US system, in which it just means Latino or not (because the Latinos have the same or similar race composition as the non-Latinos). So, being a US citizen or a Mexican citizen does not specify if you are Latino or not. However, either may have an ethnicity as part of the race_concept_id.
I know. We need to change that system. And soon. People don’t get it in the US, but they certainly don’t get it outside the US.
The sex concepts I think you got right.
Possible, but I am getting the feeling that there isn’t anything. It would have materialized by now. The race/ethnicity/socio-economic-genetic-cultural backgrounds are not precisely defined. They are wishy-washy concepts. You cannot unequivocally define them using criteria. Instead, they are self-defined, without clear criteria. As such, it is very difficult to use them for unbiased research.
Sure thing (race and ethnicity are highly subjective measures tainted by a host of motivations). It’s time we moved on to more objective measures in the age of precision medicine.
Maybe I should not try and map them at all?
Hi @hannes , in the upcoming call of the Vocabulary Subgroup (separate meeting but to be found as part of the Common Data Model Workgroup) on Jan 18th, @Jake will present his findings around Race and Ethnicity and I expect to see a summary as well as maybe some new perspectives on this topic.
Cheers ~ Mik
Sorry, Hannes - missed that reply. Check out the Common Data Model Workgroup and find the Vocabulary Subgroup. If you have an account in the OHDSI teams, I can add you to the invite.
We’re planning on using the mappings you’ve created above, possibly with modifications. I’ll take a look at the CDM subgroup from 2022 to see what the group decided regarding race/ethnicity. However, since you’ve instantiated the mappings, how have they been working for you with regard to the use case you were trying to solve (possibly for end users)?
I’m planning to ingest these (or similar) mappings as custom concept_relationship entries, and just wanted to see if you found the ingestion useful.
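For what it is worth, a minimal Python sketch of how such custom entries could be generated is shown below. It assumes the value concepts are only ever used in the Race 1 context (otherwise the variable-value caveat raised earlier in this thread applies), uses the standard `concept_relationship` columns, and takes three rows from the race table posted above. The start date, the decision to skip rows mapped to 0, and the helper names are all assumptions.

```python
import csv
import io
from datetime import date

# (NAACCR value concept_id, standard concept_id), taken from the table in this thread.
RACE_MAPPINGS = [
    (719376, 38003579),  # Chinese -> Chinese
    (719372, 38003598),  # Black -> Black
    (35940933, 0),       # Other -> no standard concept
]

COLUMNS = [
    "concept_id_1", "concept_id_2", "relationship_id",
    "valid_start_date", "valid_end_date", "invalid_reason",
]


def build_rows(mappings, valid_start: date):
    """Build 'Maps to' rows; pairs mapped to 0 are skipped (an assumption of this sketch)."""
    rows = []
    for source_id, standard_id in mappings:
        if standard_id == 0:
            continue
        rows.append({
            "concept_id_1": source_id,
            "concept_id_2": standard_id,
            "relationship_id": "Maps to",
            "valid_start_date": valid_start.isoformat(),
            "valid_end_date": "2099-12-31",
            "invalid_reason": "",
        })
    return rows


def to_csv(rows) -> str:
    """Serialize the rows as CSV text ready for a bulk load."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(build_rows(RACE_MAPPINGS, date(2023, 1, 1))))
```

Keeping the custom rows in a separate file like this makes it easy to drop or replace them once an official mapping is published.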