UK Biobank vocabulary release


(Oleg Zhuk) #1

At the end of 2020, an important vocabulary release took place: UK Biobank. It was made possible by a joint effort of the OHDSI UK Biobank Working Group and the UKB Pharma consortium, led by the Regeneron Genetics Center.

UK Biobank is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants.

So far we have introduced 18438 new concepts (15733 Observations, 2701 Measurements and 4 Meas Values). This release excludes Genomics data, some Additional exposures (Cardiac monitoring concepts), and Health-related outcomes (except for a few fields coming from the HES dataset).

Of the 18438 concepts, 4787 (26%) are Standard because they’re Questions/Answers or Variables/Values and, according to the current Survey data conventions, should be stored in the Standard “area”.

Work on UK Biobank influenced vocabulary practice for Survey and other EAV-type data and highlighted the need to evolve the whole mapping approach.

We introduced a new concept class for Question-Answer and Variable-Value pairs, with concept_class_id “Precoordinated Pair”.

Under this approach, these concepts represent combinations that were mapped to trivial event-like concepts in the existing vocabularies. Hence, they are Non-standard, with mappings to Standard concepts. They’re also linked back to the initial entities they were derived from via “Has precoord pair”/“Precoord pair of” links. This approach was applied in the following UKB categories:

However, a substantial number of concepts were mapped in a generic way (directly to Standard targets) via “Maps to” links. These categories are:

It’s important to know which mapping approach was used in each UKB category, and to handle it accordingly in the ETL.
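To illustrate why the ETL needs to distinguish the two approaches, here is a minimal Python sketch. It is only an illustration under stated assumptions: all concept IDs are invented, the relationship labels are taken as written in this post, and I assume “Has precoord pair” points from the Question/Answer to the pair concept. Real rows would come from CONCEPT_RELATIONSHIP.

```python
# Hypothetical concept IDs for illustration only; none of these are real.
QUESTION, ANSWER, OTHER_ANSWER = 1001, 2001, 2002
PAIR = 3001                      # Precoordinated Pair concept (Non-standard)
STANDARD_EVENT = 4001            # Standard target in an existing vocabulary
GENERIC_SOURCE, GENERIC_TARGET = 5001, 6001

# (concept_id_1, relationship_id, concept_id_2), shaped like CONCEPT_RELATIONSHIP.
# The direction of "Has precoord pair" is an assumption.
RELATIONSHIPS = [
    (QUESTION, "Has precoord pair", PAIR),
    (ANSWER, "Has precoord pair", PAIR),
    (PAIR, "Maps to", STANDARD_EVENT),
    (GENERIC_SOURCE, "Maps to", GENERIC_TARGET),
]


def build_index(rows):
    """Index targets by (source concept, relationship)."""
    index = {}
    for c1, rel, c2 in rows:
        index.setdefault((c1, rel), set()).add(c2)
    return index


def map_pair(index, question_id, answer_id):
    """Pair approach: find the pair shared by both parts, then follow 'Maps to'."""
    q_pairs = index.get((question_id, "Has precoord pair"), set())
    a_pairs = index.get((answer_id, "Has precoord pair"), set())
    targets = set()
    for pair_id in q_pairs & a_pairs:
        targets |= index.get((pair_id, "Maps to"), set())
    return sorted(targets)


def map_generic(index, source_id):
    """Generic approach: follow 'Maps to' directly from the source concept."""
    return sorted(index.get((source_id, "Maps to"), set()))


INDEX = build_index(RELATIONSHIPS)
print(map_pair(INDEX, QUESTION, ANSWER))        # [4001]
print(map_pair(INDEX, QUESTION, OTHER_ANSWER))  # [] (no pair exists)
print(map_generic(INDEX, GENERIC_SOURCE))       # [6001]
```

The point of the sketch is that a pair-based category needs both the question and the answer to find a target, while a generically mapped category needs only the source concept.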

UKB Categories, together with the hierarchy between them, are isolated into a separate concept class and linked to their associated concepts by “Has Category”/“Category of” links.
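As a rough sketch of how the category links could be used, the Python below resolves a concept's category and then walks up the category hierarchy. The concept IDs are invented, the direction of “Has Category” (concept to category) is my assumption, and the post does not name the relationship used between categories, so “Is a” here is purely a placeholder.

```python
# Hypothetical concept IDs for illustration only.
FIELD = 9001
CHILD_CATEGORY = 9101
PARENT_CATEGORY = 9102

# (concept_id_1, relationship_id, concept_id_2)
CATEGORY_LINKS = [
    (FIELD, "Has Category", CHILD_CATEGORY),       # direction is assumed
    (CHILD_CATEGORY, "Is a", PARENT_CATEGORY),     # placeholder relationship
]


def category_path(rows, concept_id):
    """Return the concept's category followed by its ancestor categories."""
    lookup = {(c1, rel): c2 for c1, rel, c2 in rows}
    path = []
    current = lookup.get((concept_id, "Has Category"))
    # Stop at the top of the hierarchy, or if a cycle is detected.
    while current is not None and current not in path:
        path.append(current)
        current = lookup.get((current, "Is a"))
    return path


print(category_path(CATEGORY_LINKS, FIELD))  # [9101, 9102]
```

An ETL could use such a path to decide which mapping approach applies to a given field, since the approach is defined per category.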

Since the unit is a property of the Variable in the UKB vocabulary, Variables are mapped to the corresponding Standard units by “Maps to unit” relationships.
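A minimal Python sketch of how an ETL might pick up the unit from the Variable, assuming invented concept IDs and the relationship label as written above. Falling back to 0 when no unit link exists follows the usual OMOP convention for "no matching concept"; the record layout is just a simplified stand-in for a MEASUREMENT row.

```python
# Hypothetical concept IDs for illustration only.
VARIABLE = 7001        # UKB Variable concept
STANDARD_UNIT = 8001   # Standard unit concept

# (concept_id_1, relationship_id, concept_id_2)
UNIT_LINKS = [(VARIABLE, "Maps to unit", STANDARD_UNIT)]


def unit_for(rows, variable_id):
    """Return the Standard unit concept_id for a Variable, or None if absent."""
    for c1, rel, c2 in rows:
        if c1 == variable_id and rel == "Maps to unit":
            return c2
    return None


def to_measurement(rows, variable_id, value):
    """Build a simplified measurement record with the unit taken from the Variable."""
    unit_id = unit_for(rows, variable_id)
    return {
        "measurement_source_concept_id": variable_id,
        "value_as_number": value,
        "unit_concept_id": unit_id if unit_id is not None else 0,
    }


print(to_measurement(UNIT_LINKS, VARIABLE, 1.5)["unit_concept_id"])  # 8001
```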

The existing relationships between concepts can be summarized in the schema:

