Meaning and expected value for source_concept_id in source_to_concept_map table

Hello,

I have been trying to use the source_to_concept_map table to associate health survey data with the CDM. Could someone please explain what we should store in the source_concept_id column of this table?

For instance, if my source_code column has values such as “Q01” and “Q02”, do I use the same values for the source_concept_id column as well? I cannot see any other source data field that I could use for this OMOP column.

Additionally, what are we expected to store in the source_vocabulary_id column?

Regards,
Vimala

Hi, @Vimala_Jacob,

My understanding:
source_concept_id is supposed to be a concept_id from the vocabulary. If you are mapping custom codes, set source_concept_id to 0 and create your own internal vocabulary_id for the source_vocabulary_id column. Try to give each source code a description in source_code_description; if you have none, you can always reuse the source_code value there.

When would you have a non-zero source_concept_id for your source codes? Imagine an EHR system that stores some of its medical data in an encoded format, such as ICD9:696.1. You could have your ETL parse the field to work out what kind of code it is and map it directly, or you could create a row in your source_to_concept_map table like this:
source_code: ICD9:696.1
source_concept_id: 44819938 – Other psoriasis
source_vocabulary_id: CUSTOM_EHR_DIAG
target_concept_id: 75614 – Acrodermatitis continua
target_vocabulary_id: SNOMED

Then, during your ETL, you resolve the column value above to a source concept (the ICD9 concept) and a target concept (the SNOMED concept).
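
To make the lookup step concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is only an illustration, not how any particular ETL does it: the table is trimmed to the four columns the lookup needs, the `resolve` helper and the `SURVEY_LOCAL` vocabulary name are made up for this example, and the target of 0 on the survey row is a placeholder for "not mapped yet".

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed-down version of source_to_concept_map (assumption: only the
# columns needed for the lookup are included here).
conn.execute("""
    CREATE TABLE source_to_concept_map (
        source_code          TEXT    NOT NULL,
        source_vocabulary_id TEXT    NOT NULL,
        source_concept_id    INTEGER NOT NULL,
        target_concept_id    INTEGER NOT NULL
    )
""")

conn.executemany(
    "INSERT INTO source_to_concept_map VALUES (?, ?, ?, ?)",
    [
        # Encoded EHR value from the example above: it has a real source concept.
        ("ICD9:696.1", "CUSTOM_EHR_DIAG", 44819938, 75614),
        # Custom survey code: source_concept_id is 0. SURVEY_LOCAL is a
        # hypothetical vocabulary name; target 0 is a placeholder.
        ("Q01", "SURVEY_LOCAL", 0, 0),
    ],
)


def resolve(conn, source_code, source_vocabulary_id):
    """Return (source_concept_id, target_concept_id); (0, 0) if no row matches."""
    row = conn.execute(
        "SELECT source_concept_id, target_concept_id "
        "FROM source_to_concept_map "
        "WHERE source_code = ? AND source_vocabulary_id = ?",
        (source_code, source_vocabulary_id),
    ).fetchone()
    return row if row is not None else (0, 0)


ehr_ids = resolve(conn, "ICD9:696.1", "CUSTOM_EHR_DIAG")
survey_ids = resolve(conn, "Q01", "SURVEY_LOCAL")
```

Falling back to (0, 0) for codes with no row is a design choice in this sketch, so unmapped values still load instead of failing the ETL.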

Note to @Christian_Reich: I'm not sure why we need a target_vocabulary_id in this table. Given a target_concept_id, you should be able to look up the target concept's vocabulary via a join to the concept table. Is there some other purpose for this column?
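
For what it's worth, the join I have in mind looks roughly like the sketch below (Python with in-memory SQLite, purely illustrative). Both tables are cut down to a few columns, the single concept row is just the SNOMED concept from my example, and `target_vocab` is a name I made up here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Minimal stand-ins for the two tables (assumption: most columns omitted).
conn.execute("""
    CREATE TABLE concept (
        concept_id    INTEGER PRIMARY KEY,
        concept_name  TEXT NOT NULL,
        vocabulary_id TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE source_to_concept_map (
        source_code       TEXT    NOT NULL,
        target_concept_id INTEGER NOT NULL
    )
""")

conn.execute(
    "INSERT INTO concept VALUES (?, ?, ?)",
    (75614, "Acrodermatitis continua", "SNOMED"),
)
conn.execute(
    "INSERT INTO source_to_concept_map VALUES (?, ?)",
    ("ICD9:696.1", 75614),
)

# Derive the target vocabulary from the concept table instead of
# storing it on the mapping row.
target_vocab = conn.execute(
    "SELECT c.vocabulary_id "
    "FROM source_to_concept_map AS m "
    "JOIN concept AS c ON c.concept_id = m.target_concept_id "
    "WHERE m.source_code = ?",
    ("ICD9:696.1",),
).fetchone()[0]
```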

@Chris_Knoll:

Correct. The target_vocabulary_id is redundant and a remnant from the past. We should drop it. However, since the SOURCE_TO_CONCEPT_MAP table is really a local vehicle for ETLers (actually, many ignore it and use their own tables), and has no purpose in any analytical use cases, nobody really cares.
