
Using USAGI to map gender, race, and ethnicity?

I’m just starting to implement OMOP and am beginning with the person table. I’d like to verify that I am doing things correctly before I get too far.

First of all, am I correct that USAGI is the correct tool for mapping custom codes like race, gender, and ethnicity? Is this what people are doing generally, or do most just map these in their ETL?

Secondly, for a value such as “Asian or Pacific Islander” (in race), can I map to two different concepts in the Target Concepts, or should I map to only one? (see below)


At present, I am limiting it to one (in this case, Other Pacific Islander).

Third, I am assuming that values like “Other” and “Multiracial”, which don’t have a matching concept, should be left unmapped.

Fourth, I assume that because they are unmapped, I also should leave them unapproved, so they won’t be included in the Export source_to_concept_map output.

Fifth, I discovered that before I could append the data to “source_to_concept_map”, I had to create a related entry in “vocabulary”, which I did, but I could not figure out what value, if any, should go into vocabulary_concept_id. Is this needed? I just entered 0 (zero).

Sixth, I was then able to append the exported files from USAGI into source_to_concept_map. However, to use that table, I had to JOIN it in the SQL statement as follows:

INNER JOIN PATIENT_RACE AS z_race
    ON z_race.PATIENT_RACE_C = p_race.PATIENT_RACE_C
LEFT OUTER JOIN OMOP.source_to_concept_map AS source_to_concept_map_race
    ON source_to_concept_map_race.source_code = z_race.PATIENT_RACE_C
    AND source_to_concept_map_race.source_vocabulary_id = 'SH_race'

The final “AND” above is necessary to match correctly to race.
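To show why that extra condition seems to matter, here is a minimal sketch using an in-memory SQLite database from Python. The table layouts are simplified, the patient rows are made up, and the 'SH_ethnicity' vocabulary and the reuse of code '1' across vocabularies are assumptions for illustration only. The COALESCE to 0 for unmapped codes is likewise an assumption about how unmapped values would be handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_race (patient_id INTEGER, patient_race_c TEXT);
CREATE TABLE source_to_concept_map (
    source_code TEXT,
    source_vocabulary_id TEXT,
    target_concept_id INTEGER
);
-- Hypothetical data: code '1' exists in both the race and ethnicity
-- vocabularies; code '9' (e.g. "Other") was left unmapped.
INSERT INTO patient_race VALUES (100, '1'), (101, '9');
INSERT INTO source_to_concept_map VALUES
    ('1', 'SH_race', 8527),
    ('1', 'SH_ethnicity', 38003563);
""")


def race_concepts(filter_vocabulary):
    """Join patients to the map, with or without the vocabulary filter."""
    extra = "AND m.source_vocabulary_id = 'SH_race'" if filter_vocabulary else ""
    sql = f"""
        SELECT p.patient_id, COALESCE(m.target_concept_id, 0)
        FROM patient_race AS p
        LEFT OUTER JOIN source_to_concept_map AS m
            ON m.source_code = p.patient_race_c {extra}
        ORDER BY p.patient_id, m.target_concept_id
    """
    return conn.execute(sql).fetchall()


unfiltered = race_concepts(False)  # patient 100 matches two map rows
filtered = race_concepts(True)     # one row per patient
print(unfiltered)
print(filtered)
```

Without the filter, patient 100 picks up the ethnicity row as well as the race row. With it, each patient gets exactly one row, and the unmapped code falls back to 0.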

Am I doing this correctly? Or am I missing something obvious?

Thanks.

Roger Carlson

If you are new to OMOP and/or USAGI, trying things out using gender, race, and ethnicity is probably a good start. Given the relatively low complexity of these three patient data elements, I would scan a one-year dataset and dump out all the variations contained in the EHR. Then do a simple ETL mapping as part of the person table loader.
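As a rough sketch of what such a simple mapping could look like in a Python-based loader: the source values on the left-hand side and the row field names are hypothetical and would come from your own dump of the EHR. The concept IDs on the right are the standard OMOP gender, race, and ethnicity concepts. Anything not found falls back to 0, which is an assumption about how you want to treat unmapped values.

```python
from collections import Counter

# Hypothetical local values -> standard OMOP concept IDs.
GENDER_MAP = {"M": 8507, "F": 8532}
RACE_MAP = {
    "White": 8527,
    "Black or African American": 8516,
    "Asian": 8515,
    "American Indian or Alaska Native": 8657,
    "Native Hawaiian or Other Pacific Islander": 8557,
}
ETHNICITY_MAP = {"Hispanic": 38003563, "Not Hispanic": 38003564}


def distinct_values(rows, field):
    """Count every variation of a field, to see what needs mapping."""
    return Counter(row[field] for row in rows)


def to_concept_id(value, mapping):
    """Look up a source value; unmapped or missing values become 0."""
    if value is None:
        return 0
    return mapping.get(value.strip(), 0)


def map_person(row):
    """Return the three person-table concept IDs for one source row."""
    return {
        "gender_concept_id": to_concept_id(row.get("gender"), GENDER_MAP),
        "race_concept_id": to_concept_id(row.get("race"), RACE_MAP),
        "ethnicity_concept_id": to_concept_id(row.get("ethnicity"), ETHNICITY_MAP),
    }


sample_rows = [
    {"gender": "F", "race": "Asian", "ethnicity": "Not Hispanic"},
    {"gender": "M", "race": "Multiracial", "ethnicity": None},
]
race_counts = distinct_values(sample_rows, "race")
mapped = [map_person(row) for row in sample_rows]
```

Running distinct_values first over the real data shows which source values still need an entry in the dictionaries.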

I tend to think of using tools like USAGI for more complex data like DrugExposure, DeviceExposure, Observations, … from structured and unstructured data sources where the complexity of the EHR data is much higher and tools like USAGI would be more helpful.

Regarding mapping a local value to multiple concepts: I personally don’t like to throw information away, so in general, where possible, I would define concept_relationship entries that relate a local concept “A/B” to the two standard concepts “A” and “B”. Others might take a more conservative view and map “A/B” to the single most dominant standard concept instead (“A/B” -> “A”).
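A minimal sketch of the “A/B” -> “A” and “B” idea, using an in-memory SQLite table from Python. The table is a simplified stand-in for concept_relationship, the local concept ID 2000000001 is hypothetical (chosen in the range above 2 billion conventionally used for local concepts), and using 'Maps to' as the relationship_id is an assumption.

```python
import sqlite3

LOCAL_ASIAN_OR_PI = 2000000001  # hypothetical local concept for "Asian or Pacific Islander"

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE concept_relationship (
           concept_id_1 INTEGER,
           concept_id_2 INTEGER,
           relationship_id TEXT
       )"""
)
# One local concept related to two standard race concepts:
# 8515 (Asian) and 8557 (Native Hawaiian or Other Pacific Islander).
conn.executemany(
    "INSERT INTO concept_relationship VALUES (?, ?, ?)",
    [
        (LOCAL_ASIAN_OR_PI, 8515, "Maps to"),
        (LOCAL_ASIAN_OR_PI, 8557, "Maps to"),
    ],
)


def standard_targets(connection, local_concept_id):
    """Return every standard concept a local concept is related to."""
    rows = connection.execute(
        """SELECT concept_id_2
           FROM concept_relationship
           WHERE concept_id_1 = ? AND relationship_id = 'Maps to'
           ORDER BY concept_id_2""",
        (local_concept_id,),
    ).fetchall()
    return [row[0] for row in rows]


targets = standard_targets(conn, LOCAL_ASIAN_OR_PI)
print(targets)
```

Note that race_concept_id in the person table holds a single value, so the loader would still have to pick one concept for that column. The relationship rows only preserve the fuller picture for later queries.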
