Which dictionaries to download for CPRD?

Hi,

I am new to OMOP but am currently interested in mapping to Clinical Practice Research Datalink (CPRD). I downloaded the pre-selected vocabularies on Athena, along with any others that seem relevant to my dataset (for example dm+d).

How do I know which vocabularies I need?

After mapping, I still have various CPRD codes that do not match any of the vocabularies I downloaded. Are there guidelines on which vocabularies I need? Is there anything stopping me from downloading all of them, or is it a matter of best guess?

Thanks in advance for your help,
Lizzie

Hi Lizzie!

Are you mapping to CPRD, or is CPRD the source dataset you’re converting to the OMOP CDM format? Or do you already have access to an OMOP CDM instance?

Usually, the coding systems are indicated in the data dictionaries and other documentation that comes together with the dataset. Sometimes you get a clear list; sometimes your decisions are based on tests and assumptions.
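
If it helps, one way to run such a test is to count how many of your distinct source codes appear as a concept_code in each vocabulary you downloaded. Below is a minimal Python sketch, not a definitive implementation. It assumes the tab-delimited CONCEPT.csv from an Athena download, with concept_code and vocabulary_id columns; the function names are made up for illustration.

```python
import csv
from collections import Counter


def read_concept_file(path):
    """Yield rows of an Athena CONCEPT.csv as dicts (assumed tab-delimited)."""
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE: concept names can contain quote characters.
        yield from csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def vocabulary_coverage(concept_rows, source_codes):
    """Return (matches per vocabulary_id, set of codes found in no vocabulary).

    Matching is exact and case-sensitive, which matters for Read codes.
    """
    wanted = {code.strip() for code in source_codes if code.strip()}
    matched_by_vocab = {}
    for row in concept_rows:
        code = row["concept_code"]
        if code in wanted:
            matched_by_vocab.setdefault(row["vocabulary_id"], set()).add(code)
    matched = set().union(*matched_by_vocab.values())
    counts = Counter({vocab: len(codes) for vocab, codes in matched_by_vocab.items()})
    return counts, wanted - matched


if __name__ == "__main__":
    # Made-up rows standing in for CONCEPT.csv; real use would pass
    # read_concept_file("CONCEPT.csv") instead.
    sample_rows = [
        {"concept_code": "X1", "vocabulary_id": "Read"},
        {"concept_code": "Y2", "vocabulary_id": "dm+d"},
    ]
    counts, unmatched = vocabulary_coverage(sample_rows, ["X1", "Y2", "Z3"])
    for vocab, n in counts.most_common():
        print(vocab, n)
    print("unmatched:", sorted(unmatched))
```

A vocabulary that matches most of the codes in a given source column is a good candidate for that column. Keep in mind that the same code string can exist in several vocabularies, so a match is a hint rather than proof, and the unmatched set is where to look for formatting differences or a coding system that is not in your download.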

It may also be that a coding system is still missing from the OMOP vocabularies. For example, the Read vocabulary covers Read v.2, so if your source data only contains v.3, you’d need a custom vocabulary implementation.

Nothing should stop you from downloading and exploring all the vocabularies. You might also want to look at what has been done with CPRD in the past:
https://healthdatagateway.org/en/dataset/1100
https://www.sciencedirect.com/science/article/pii/S2352914823002538
https://ohdsi.github.io/ETL-LambdaBuilder/docs/CPRD