I am new to OMOP and am currently interested in mapping to the Clinical Practice Research Datalink (CPRD). I downloaded the pre-selected vocabularies from Athena, along with any others that seemed relevant to my dataset (for example dm+d).
How do I know which vocabularies I need?
Once mapped, I still have various CPRD codes that do not match any of the vocabularies I downloaded. Are there guidelines on which vocabularies I need? Is there anything stopping me from downloading all of them, or is it a best guess?
Are you mapping to CPRD, or is CPRD the source dataset you’re converting to OMOP CDM format? Or do you already have access to an OMOP CDM instance?
Usually, the coding systems are indicated in the data dictionaries and other documentation that comes together with the dataset. Sometimes you get a clear list; sometimes your decisions are based on tests and assumptions.
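One way to run such a test is to check how many of your source codes appear as `concept_code` in each vocabulary of the Athena download. The sketch below assumes the tab-delimited `CONCEPT.csv` layout that Athena exports; the function name `vocabulary_coverage` and the sample rows are made up for illustration, and the codes are placeholders rather than real Read or dm+d codes.

```python
import csv
import io
from collections import Counter


def vocabulary_coverage(concept_file, source_codes):
    """Return (matches per vocabulary_id, set of codes found in no vocabulary)."""
    wanted = set(source_codes)
    matched = {}  # vocabulary_id -> set of source codes found in that vocabulary
    # Athena files are tab-delimited and unquoted, so disable quote handling.
    reader = csv.DictReader(concept_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        code = row["concept_code"]
        if code in wanted:  # exact, case-sensitive comparison
            matched.setdefault(row["vocabulary_id"], set()).add(code)
    counts = Counter({vocab: len(codes) for vocab, codes in matched.items()})
    unmatched = wanted - set().union(*matched.values())
    return counts, unmatched


# Placeholder rows standing in for a real CONCEPT.csv (hypothetical data).
HEADER = ["concept_id", "concept_name", "domain_id", "vocabulary_id",
          "concept_class_id", "standard_concept", "concept_code",
          "valid_start_date", "valid_end_date", "invalid_reason"]
ROWS = [
    ["1", "placeholder A", "Condition", "Read", "Read", "", "A001", "19700101", "20991231", ""],
    ["2", "placeholder B", "Condition", "Read", "Read", "", "B002", "19700101", "20991231", ""],
    ["3", "placeholder C", "Drug", "dm+d", "VMP", "", "111", "19700101", "20991231", ""],
    ["4", "placeholder D", "Condition", "SNOMED", "Clinical Finding", "S", "111", "19700101", "20991231", ""],
]
SAMPLE_TSV = "\n".join("\t".join(r) for r in [HEADER] + ROWS) + "\n"

counts, unmatched = vocabulary_coverage(io.StringIO(SAMPLE_TSV), ["A001", "B002", "111", "ZZZ9"])
for vocab, n in counts.most_common():
    print(vocab, n)
print("unmatched:", sorted(unmatched))
```

With a real download you would pass `open("CONCEPT.csv", newline="", encoding="utf-8")` instead of the in-memory sample. A high match count for a vocabulary is only a hint: the same code string can exist in several vocabularies (as `111` does in the sample), so the result still needs checking against the data dictionary.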
A coding system may also be missing from OMOP altogether. For example, the Read vocabulary in OMOP is Read v.2, so if you only have v.3 in the source data, you’d need a custom vocabulary implementation.
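As a rough sketch of what a custom vocabulary could involve, the snippet below builds CONCEPT-style rows for codes that have no counterpart in Athena. It relies on the OMOP convention that concept IDs above 2,000,000,000 are reserved for local concepts. The function name `build_custom_concepts`, the vocabulary ID `MY_LOCAL_VOCAB`, the default domain and concept class, and the sample codes are all assumptions for illustration, not something taken from CPRD or Athena.

```python
from itertools import count

# OMOP convention: concept_id values above 2 billion are reserved for local use.
CUSTOM_ID_START = 2_000_000_000


def build_custom_concepts(unmapped, vocabulary_id, domain_id="Observation",
                          start_id=CUSTOM_ID_START + 1):
    """Turn {code: description} into CONCEPT-style rows with local concept IDs.

    domain_id is an assumed default; choose it per code in a real implementation.
    """
    ids = count(start_id)
    rows = []
    for code, name in sorted(unmapped.items()):  # sorted for reproducible IDs
        rows.append({
            "concept_id": next(ids),
            "concept_name": name,
            "domain_id": domain_id,
            "vocabulary_id": vocabulary_id,
            "concept_class_id": "Undefined",   # assumed placeholder class
            "standard_concept": None,           # source concepts are non-standard
            "concept_code": code,
            "valid_start_date": "1970-01-01",
            "valid_end_date": "2099-12-31",
            "invalid_reason": None,
        })
    return rows


# Hypothetical unmapped source codes and descriptions.
custom_rows = build_custom_concepts(
    {"X2": "placeholder description two", "X1": "placeholder description one"},
    vocabulary_id="MY_LOCAL_VOCAB",
)
for row in custom_rows:
    print(row["concept_id"], row["concept_code"], row["concept_name"])
```

These rows only register the source codes; each one would still need a mapping to a standard concept (for example through CONCEPT_RELATIONSHIP or a source-to-concept map) before it is useful in analyses.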