Dealing with multiple races and other exceptions

aostropolets · March 4, 2024, 10:56pm

Filling you in on what’s been decided by community voting:

You put one race in PERSON and the rest in OBSERVATION. Which one goes where is up to you. The only exception is case #4, where you replace an unknown race with a known race. They don’t like flavors of NULL here.

Of course researchers may miss the other races, and hopefully good ETL specs and conventions that say “Also look in OBSERVATION” will help. The thread above provides more reasons and thoughts on why the community didn’t vote for modifying the PERSON table (for example, it would break all of the tools).
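To make the voted rule concrete, here is a minimal Python sketch of how an ETL step might split a patient's race concepts between PERSON and OBSERVATION. It is an illustration only, not part of the voted text. The function and class names are hypothetical, treating concept_id 0 as "unknown" is an assumption, and picking the first known race for PERSON is an arbitrary choice, since the convention leaves that decision to the implementer.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Assumption: unknown race is represented by concept_id 0 in the source mapping.
UNKNOWN_RACE = 0


@dataclass
class RaceAssignment:
    """Hypothetical container for where each race concept should be written."""
    person_race_concept_id: int
    observation_race_concept_ids: List[int] = field(default_factory=list)


def split_races(race_concept_ids: Iterable[int]) -> RaceAssignment:
    """Put one race in PERSON and the rest in OBSERVATION.

    Unknown values are dropped whenever at least one known race exists
    (the "replace unknown with known" exception). Duplicates are removed
    while keeping the original order.
    """
    known = list(dict.fromkeys(c for c in race_concept_ids if c != UNKNOWN_RACE))
    if not known:
        # Nothing known: PERSON gets the unknown value, OBSERVATION gets nothing.
        return RaceAssignment(UNKNOWN_RACE)
    # Arbitrary choice for this sketch: the first known race goes to PERSON.
    return RaceAssignment(known[0], known[1:])
```

For example, `split_races([0, 8527, 8515])` places 8527 in PERSON and 8515 in OBSERVATION, and the unknown value is discarded. The concept IDs here are illustrative inputs only.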

apitkus · March 4, 2024, 11:43pm

Thanks @aostropolets. I wanted to make sure I didn’t miss any discussions in the thread, as I saw support for both approaches.

It’d be good practice to treat each race the same way, which is why I favored the PERSON table. With the OBSERVATION approach, it appears one race would have an associated date and the other may not.

@Christian_Reich asked for use cases.

Similar to Andrew’s, there are a variety of research projects looking at patient outcomes related to race. It depends on the type of research. Some of these are related to Social Determinants of Health too. Others are related to disease prevalence in certain races. Others yet are related to genetic changes that are more prevalent in one race or another.
Genetic testing for genes associated with one’s heritage is common.
The NIH has acknowledged a lack of diversity in a number of studies and is encouraging studies to be performed on more diverse populations. Thus researchers may search OMOP data sets to determine whether they have sufficient numbers of patients with X race, Y race, Z race in their study population/institution, or whether they need to recruit patients from multiple academic centers/sites.

I also favor a post-coordinated approach treating race and ethnicity as distinct items. They are collected separately (in the US) and usually mapped to different code systems depending on the country. The CDC code system Davera mentioned would be used in the US, but other countries could map to their designated code system too. Many examples were given of different country needs.

I’m glad to see the group has taken up the topic and is working on a solution.

With warm regards,
Andrea

Christian_Reich · March 7, 2024, 2:03pm

@apitkus:

Thanks for the use cases. The reason race information is spread over 2 locations is pragmatic: multiple race data are rare, and we don’t want to make the relatively rare use case (effect of multiple or changing race designations) easier at the cost of the much more common use case (using race as a single covariate for things). Queries also get computationally slower if you have to scan the entire OBSERVATION table for each patient.

We did debate whether, instead of tossing more than one race record into OBSERVATION, we should pre-coordinate multiple races and keep them in PERSON. But that idea was deemed infeasible, for the simple reason that nobody has those combinations ready, particularly if you go into fractions (7/16th Asian). Plus, some folks felt there are use cases about the dynamic character of that information (people changing their race designation).

But your use cases should be well covered, even though you need to do the extra step and screen two tables (PERSON and OBSERVATION). Let’s see what evidence you guys can detect.
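The "screen two tables" step could look roughly like the sketch below, which uses Python's built-in `sqlite3` with a tiny in-memory stand-in for the CDM. Several things here are assumptions for illustration: the tables carry only the columns needed for the query, the helper names are hypothetical, and `RACE_OBSERVATION_CONCEPT_ID` is a made-up placeholder for whatever observation concept a site uses to flag additional race records. The thread does not specify that concept.

```python
import sqlite3

# Hypothetical placeholder: the observation_concept_id a site uses to mark
# additional race records. Not a real concept ID.
RACE_OBSERVATION_CONCEPT_ID = 999000001


def build_demo_db() -> sqlite3.Connection:
    """Create a minimal in-memory stand-in for PERSON and OBSERVATION."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            race_concept_id INTEGER
        );
        CREATE TABLE observation (
            observation_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            observation_concept_id INTEGER,
            value_as_concept_id INTEGER
        );
        """
    )
    # Illustrative race concept IDs; person 2 has a second race in OBSERVATION.
    conn.executemany("INSERT INTO person VALUES (?, ?)",
                     [(1, 8527), (2, 8515), (3, 8516)])
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?, ?)",
        [
            (1, 2, RACE_OBSERVATION_CONCEPT_ID, 8527),  # extra race for person 2
            (2, 3, 12345, 8527),  # unrelated observation, must not match
        ],
    )
    return conn


def persons_with_race(conn: sqlite3.Connection, race_concept_id: int) -> list:
    """Return person_ids that carry the race in PERSON or in OBSERVATION."""
    sql = """
        SELECT person_id FROM person WHERE race_concept_id = ?
        UNION
        SELECT person_id FROM observation
        WHERE observation_concept_id = ? AND value_as_concept_id = ?
        ORDER BY person_id
    """
    params = (race_concept_id, RACE_OBSERVATION_CONCEPT_ID, race_concept_id)
    return [row[0] for row in conn.execute(sql, params)]
```

With the demo data, `persons_with_race(build_demo_db(), 8527)` picks up person 1 from PERSON and person 2 from OBSERVATION. The `UNION` removes duplicates, so a patient found in both tables is counted once.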

MPhilofsky · March 12, 2024, 2:18am

We shouldn’t use date of birth because the current guidance for creating the Observation Period from EHR data says to use the first event in the data. And unless you have pediatric data, the DOB predates the use of EHRs. Using the date of the last visit record might be most accurate since race/ethnicity is usually recorded at every visit and updated accordingly. But we should discuss race & ethnicity conventions after April Olympians. There are many considerations to debate.

aostropolets · May 21, 2024, 7:35pm

@MPhilofsky I referenced the recommendations that the community voted for in the Themis post here: Convention need for how/where to store > 1 race or ethnicity concept_id · Issue #71 · OHDSI/Themis · GitHub.

Would be great to see a Themis convention for this long standing issue soon! Thanks for all your work.

Piper-Ranallo · January 11, 2025, 4:43pm

@MPhilofsky,
I see the resolution for race and ethnicity in Themis issue #71. We are updating our ETLs in accordance.

To @DaveraG’s point about CDC REC… was a decision ever made about adding these? I’m not seeing the codes in the vocabulary files.

Best,
Piper

MPhilofsky · January 13, 2025, 7:30pm

Hello @Piper-Ranallo!

The conclusion was for community members to contribute via the community contribution process. I don’t know if anyone has contributed the CDC codes. I checked the OHDSI Vocab GitHub, but I don’t see an issue for it. This issue was formally ratified after the August 2024 vocab release, so I wouldn’t expect them in any vocabulary files you currently have. The next release is February 2025.

Tagging Vocab guru @aostropolets for additional insight!

Piper-Ranallo · January 13, 2025, 8:11pm

Thanks @MPhilofsky.

Follow up question -

As I’m sure you know, SNOMED International recently inactivated all concepts in the | Ethnic group (ethnic group) | and all but 8 concepts in the | Racial group (racial group) | subhierarchies. We mapped some of our content to these concepts, including several we requested based on our data.

Once OMOP updates the vocabulary files with SNOMED changes, many of our source concept ids will be invalid.

Would it be possible for OHDSI to incorporate at least some of the SNOMED concepts that were active as of the release immediately prior to the mass inactivation of ethnicity and race concepts into the OMOP-specific ‘Race’ and ‘Ethnicity’ vocabularies?

Or would it be better for us to just request the concepts we are currently using?

I’m just not sure whether other sites were using SNOMED concepts for races and ethnicities that don’t exist in any other terminology… and will therefore be in the same boat as we are.

Best,
Piper

Christian_Reich · January 13, 2025, 8:24pm

Hi @Piper-Ranallo:

No need to worry. We will put together a list of all ethnicities and races anybody has come across, including SNOMED. That thing will go into the next release. No de-duplication, as is. So, all your concepts will be there.

Will it be proper Closed World, i.e. unique and comprehensive? No, it won’t, for reasons discussed multiple times. The poor analyst wretch will have to make those highly redundant conceptsets. Is that a good situation? No, it isn’t. Can we fix it? No, we cannot. But it will be better than the current situation and good enough.
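For what such a redundant concept set might look like on the analyst's side, here is a small Python sketch. Every concept ID in it is a made-up placeholder rather than a real OMOP concept, and the labels and function names are hypothetical. The idea is only that several overlapping concepts from different vocabularies get grouped under one analysis category.

```python
from typing import Dict, Iterable, Set

# Made-up placeholder IDs: each set groups redundant concepts (for example,
# one from the Race vocabulary and one from SNOMED) under one category.
CONCEPT_SETS: Dict[str, Set[int]] = {
    "asian": {1001, 2001, 3001},
    "white": {1002, 2002},
}


def build_lookup(concept_sets: Dict[str, Set[int]]) -> Dict[int, str]:
    """Invert the concept sets into a concept_id -> category lookup."""
    lookup: Dict[int, str] = {}
    for label, concept_ids in concept_sets.items():
        for concept_id in concept_ids:
            if lookup.get(concept_id, label) != label:
                # A concept listed under two categories is a curation error.
                raise ValueError(f"concept {concept_id} is in more than one set")
            lookup[concept_id] = label
    return lookup


def categorize(concept_ids: Iterable[int], lookup: Dict[int, str]) -> Set[str]:
    """Map a patient's race concepts to categories, ignoring unlisted ones."""
    return {lookup[c] for c in concept_ids if c in lookup}
```

A patient whose records carry placeholder concepts 2001 and 1002 would fall into both the "asian" and "white" categories, and any concept not listed in a set is silently ignored.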

Piper-Ranallo · January 13, 2025, 9:21pm

Thank you @Christian_Reich!