Dealing with multiple races and other exceptions

Filling you in on what’s been decided by community voting:

You put one race in PERSON and the rest in OBSERVATION. Which one goes where is up to you. The only exception is case #4, where you replace the unknown race with the known race. They don’t like flavors of NULL here :slightly_smiling_face:

Of course, researchers may miss the other races, and hopefully good ETL specs and conventions that say “Also look in OBSERVATION” will help. The thread above provides more reasons and thoughts on why the community didn’t vote for modifying the PERSON table (for example, it would break all of the tools).
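As a rough illustration of the rule above, here is a minimal Python sketch of how an ETL step might split a patient's source races between PERSON and OBSERVATION. Everything in it is an assumption for illustration: the `split_races` function, the `RaceSplit` container, the use of concept_id 0 for an unknown race, and the choice to send the first known race to PERSON (the convention leaves that choice to you).

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Assumption: concept_id 0 represents an unknown / unmapped race.
UNKNOWN_RACE = 0


@dataclass
class RaceSplit:
    """Hypothetical container for the result of the split."""
    person_race_concept_id: int
    observation_race_concept_ids: List[int] = field(default_factory=list)


def split_races(source_race_concept_ids: Iterable[int]) -> RaceSplit:
    """Send one race to PERSON and any remaining races to OBSERVATION.

    Unknown values are dropped whenever a known race exists, so a known
    race replaces an unknown one instead of sitting next to it.
    """
    known: List[int] = []
    for concept_id in source_race_concept_ids:
        # Skip unknowns and duplicates, keeping the source order.
        if concept_id != UNKNOWN_RACE and concept_id not in known:
            known.append(concept_id)

    if not known:
        # Nothing usable: PERSON gets the unknown value, OBSERVATION nothing.
        return RaceSplit(UNKNOWN_RACE)

    # Assumption: the first known race goes to PERSON; any other
    # deterministic rule would satisfy the convention equally well.
    return RaceSplit(known[0], known[1:])
```

For example, `split_races([0, 8527])` puts 8527 in PERSON and writes nothing to OBSERVATION, while `split_races([8515, 8527])` puts 8515 in PERSON and 8527 in OBSERVATION. The concept IDs here are illustrative values only.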

Thanks @aostropolets. I wanted to make sure I didn’t miss any discussions in the thread, as I saw support for both approaches.

It’d be good practice to treat each race the same way, which is why I favored the PERSON table. With the OBSERVATION approach, it appears one race would have an associated date and the other may not.

@Christian_Reich asked for use cases.

  1. Similar to Andrew’s, there are a variety of research projects looking at patient outcomes related to race. It depends on the type of research. Some of these are related to Social Determinants of Health too. Others are related to disease prevalence in certain races. Others yet are related to genetic changes that are more prevalent in one race or another.

  2. Genetic testing for genes associated with one’s heritage is common.

  3. The NIH has acknowledged a lack of diversity in a number of studies and is encouraging studies to be performed on more diverse populations. Thus researchers may search OMOP data sets to determine whether they have sufficient numbers of patients with X race, Y race, Z race in their study population/institution, or whether they need to recruit patients from multiple academic centers/sites.

I also favor a post-coordinated approach treating race and ethnicity as distinct items. They are collected separately (in the US) and usually mapped to different code systems depending on the country. The CDC code system Davera mentioned would be used in the US, but other countries could map to their designated code system too. Many examples were given of different country needs.

I’m glad to see the group has taken up the topic and is working on a solution.

With warm regards,
Andrea

@apitkus:

Thanks for the use cases. The reason race information is spread over 2 locations is pragmatic: multiple race data are rare, and we don’t want to make the relatively rare use case (effect of multiple or changing race designations) easier at the cost of the much more common use case (using race as a single covariate for things). Queries also get computationally slower if you have to scan the entire OBSERVATION table for each patient.

We did debate whether, instead of tossing more than one race record into OBSERVATION, we should pre-coordinate multiple races and keep them in PERSON. But that idea was deemed infeasible, for the simple reason that nobody has those combinations ready, particularly if you go into fractions (7/16th Asian). Plus, some folks felt there are use cases about the dynamic character of that information (people changing their race designation).

But your use cases should be well covered, even though you need to do the extra step and screen two tables (PERSON and OBSERVATION). Let’s see what evidence you guys can detect.
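The "screen two tables" step could look roughly like the Python sketch below, which uses an in-memory SQLite database with a heavily trimmed version of PERSON and OBSERVATION. Several things are assumptions made for illustration: that additional races are stored in `value_as_concept_id` under a single observation concept, the placeholder `RACE_OBSERVATION_CONCEPT_ID`, the helper names `build_demo_db` and `persons_with_race`, and the demo rows and concept IDs. Adjust these to whatever your ETL specs and the eventual convention prescribe.

```python
import sqlite3

# Hypothetical placeholder: the observation_concept_id your ETL uses to flag
# "additional race" rows in OBSERVATION. Not an official concept.
RACE_OBSERVATION_CONCEPT_ID = 9000001


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with only the columns used here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (person_id INTEGER, race_concept_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE observation ("
        "person_id INTEGER, observation_concept_id INTEGER, "
        "value_as_concept_id INTEGER)"
    )
    # Illustrative rows: person 2 has one race in PERSON and one in OBSERVATION.
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, 8527), (2, 8515), (3, 8516)],
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?)",
        [(2, RACE_OBSERVATION_CONCEPT_ID, 8527)],
    )
    return conn


def persons_with_race(conn: sqlite3.Connection, race_concept_id: int) -> list:
    """Return person_ids that carry the race in PERSON or in OBSERVATION."""
    sql = """
        SELECT person_id FROM person
        WHERE race_concept_id = :race
        UNION
        SELECT person_id FROM observation
        WHERE observation_concept_id = :obs
          AND value_as_concept_id = :race
        ORDER BY person_id
    """
    params = {"race": race_concept_id, "obs": RACE_OBSERVATION_CONCEPT_ID}
    return [row[0] for row in conn.execute(sql, params)]
```

With the demo rows, `persons_with_race(build_demo_db(), 8527)` picks up person 1 from PERSON and person 2 from OBSERVATION. Filtering OBSERVATION by a specific observation concept, rather than scanning every row per patient, keeps the extra lookup narrow.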

We shouldn’t use date of birth because the current guidance for creating the Observation Period from EHR data says to use the first event from the data. And unless you have pediatric data, the date of birth falls before the EHR was in use. Using the date of the last visit record might be most accurate, since race/ethnicity is usually recorded at every visit and updated accordingly. But we should discuss race & ethnicity conventions after April Olympians. There are many considerations to debate.

@MPhilofsky I referenced the recommendations that the community voted for in the Themis post here: Convention need for how/where to store > 1 race or ethnicity concept_id · Issue #71 · OHDSI/Themis · GitHub.

Would be great to see a Themis convention for this long-standing issue soon! Thanks for all your work.
