I know I'm more guilty than most of being sometimes loose with my language
about this topic, but for clarity's sake, I'll try to be precise here: I
think it'd be more proper to think about it as: 'you should only map your
source codes to standard concepts, and each standard concept belongs to one
domain, which thereby tells you where to put the source code'. That is,
don't think about it as each domain has one standard vocabulary, but rather
each domain has a set of standard concepts, which together may come from
one or more vocabularies. Also, don't assume you know the domain of your
source code until you've mapped it to a standard concept: ICD9 diagnosis
codes do not all go to CONDITION_OCCURRENCE, because some of the codes are
in fact procedures (e.g. cancer screening) or observations (e.g. family
history of ...). Likewise, some CPT/HCPCS codes actually reflect drug use, not
procedures. Instead, you can consistently apply the following steps
as part of your ETL process:
1) find the source concept_id associated with your source code (look it up
in the CONCEPT table, typically by matching the source code string against
CONCEPT_CODE within the source vocabulary),
2) map your source concept to a standard concept using a valid
CONCEPT_RELATIONSHIP record,
3) look up the domain_id of the standard concept,
4) ETL your source data into the standard domain, storing the source value,
source concept_id, and standard concept_id along with any domain-specific
attributes.
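
As a rough illustration of those four steps, here is a minimal Python sketch against a toy in-memory copy of the CONCEPT and CONCEPT_RELATIONSHIP tables. Everything in the toy data is made up for the example (the codes SRC-DX / SRC-SCREEN / SRC-FHX and all concept_ids), and the function name route_source_code and the DOMAIN_TO_TABLE lookup are hypothetical helpers, not part of any OHDSI tool. It also assumes that "a valid CONCEPT_RELATIONSHIP record" means a 'Maps to' row with a null invalid_reason pointing at a concept flagged standard_concept = 'S'; adjust that to your vocabulary release.

```python
import sqlite3

# Standard domain_id -> CDM table that receives the record (subset only).
DOMAIN_TO_TABLE = {
    "Condition": "CONDITION_OCCURRENCE",
    "Procedure": "PROCEDURE_OCCURRENCE",
    "Observation": "OBSERVATION",
    "Drug": "DRUG_EXPOSURE",
}


def build_toy_vocab():
    """Create a tiny in-memory vocabulary. All codes and ids are invented."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (
            concept_id INTEGER, concept_code TEXT, vocabulary_id TEXT,
            domain_id TEXT, standard_concept TEXT);
        CREATE TABLE concept_relationship (
            concept_id_1 INTEGER, concept_id_2 INTEGER,
            relationship_id TEXT, invalid_reason TEXT);
        INSERT INTO concept VALUES
            (1, 'SRC-DX',     'ICD9CM', 'Condition',   NULL),
            (2, 'SRC-SCREEN', 'ICD9CM', 'Procedure',   NULL),
            (3, 'SRC-FHX',    'ICD9CM', 'Observation', NULL),
            (9001, 'STD-DX',     'SNOMED', 'Condition',   'S'),
            (9002, 'STD-SCREEN', 'SNOMED', 'Procedure',   'S'),
            (9003, 'STD-FHX',    'SNOMED', 'Observation', 'S');
        INSERT INTO concept_relationship VALUES
            (1, 9001, 'Maps to', NULL),
            (2, 9002, 'Maps to', NULL),
            (3, 9003, 'Maps to', NULL);
    """)
    return conn


ROUTE_SQL = """
    SELECT src.concept_id, std.concept_id, std.domain_id
    FROM concept src                          -- step 1: source concept
    JOIN concept_relationship cr              -- step 2: map to standard
      ON cr.concept_id_1 = src.concept_id
     AND cr.relationship_id = 'Maps to'
     AND cr.invalid_reason IS NULL
    JOIN concept std                          -- step 3: read its domain
      ON std.concept_id = cr.concept_id_2
     AND std.standard_concept = 'S'
    WHERE src.concept_code = ? AND src.vocabulary_id = ?
"""


def route_source_code(conn, source_code, vocabulary_id):
    """Return one routing record per standard concept the code maps to."""
    rows = conn.execute(ROUTE_SQL, (source_code, vocabulary_id)).fetchall()
    return [
        {
            "source_value": source_code,
            "source_concept_id": src_id,
            "standard_concept_id": std_id,
            "domain_id": domain_id,
            # step 4: the standard concept's domain picks the target table
            "target_table": DOMAIN_TO_TABLE.get(domain_id),
        }
        for src_id, std_id, domain_id in rows
    ]


if __name__ == "__main__":
    conn = build_toy_vocab()
    for code in ("SRC-DX", "SRC-SCREEN", "SRC-FHX"):
        for record in route_source_code(conn, code, "ICD9CM"):
            print(record)
```

Note that the target table is never chosen from the source vocabulary: all three toy codes are 'ICD9CM', yet they land in three different tables. The function returns a list because one source code can map to more than one standard concept, and an empty list means no valid mapping was found.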