“If the DRUG_SOURCE_VALUE is coded in the source data using an OMOP-supported vocabulary, put the concept ID representing the source value here.”
So should DRUG_SOURCE_CONCEPT_ID hold the same concept ID as DRUG_CONCEPT_ID? (Or, if the source concept is not a standard concept, DRUG_CONCEPT_ID would hold the standard concept and DRUG_SOURCE_CONCEPT_ID a different concept ID.)
The field is not mandatory; what happens if I leave every DRUG_SOURCE_CONCEPT_ID null?
All fields with _source_concept_id in their name hold references to concepts from the classification system used by your source data (for example, NDC concepts), and the mapping target of those concepts goes into drug_concept_id.
These fields are not used in OMOP tooling and can be left empty, but they can also be used to record where the data comes from.
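As a rough illustration of the relationship described above, here is a minimal Python sketch. The tiny CONCEPT and MAPS_TO tables, every concept ID and code in them, and the function name drug_concept_fields are all invented for this example; they are not real Athena content or part of any OHDSI tool.

```python
from typing import Optional

# Hypothetical miniature vocabulary:
# concept_id -> (vocabulary_id, concept_code, standard_concept flag)
# Every ID and code here is made up for illustration.
CONCEPT = {
    9000001: ("NDC", "00000-0000-00", None),  # non-standard source concept
    9000002: ("RxNorm", "000000", "S"),       # standard concept
}

# Hypothetical 'Maps to' links: source concept_id -> standard concept_id
MAPS_TO = {9000001: 9000002}


def drug_concept_fields(vocabulary_id: str, source_code: str) -> dict:
    """Derive the three drug fields for one source code."""
    # Look for a concept that represents the source code itself.
    source_concept_id: Optional[int] = next(
        (cid for cid, (vocab, code, _) in CONCEPT.items()
         if vocab == vocabulary_id and code == source_code),
        None,
    )
    if source_concept_id is None:
        # Code not in any supported vocabulary: no standard concept found.
        # 0 is the OMOP "No matching concept" value.
        drug_concept_id = 0
    elif CONCEPT[source_concept_id][2] == "S":
        # Source code is already standard: both fields hold the same ID.
        drug_concept_id = source_concept_id
    else:
        # Non-standard source concept: follow its mapping to the standard one.
        drug_concept_id = MAPS_TO.get(source_concept_id, 0)
    return {
        "drug_source_value": source_code,
        "drug_source_concept_id": source_concept_id,  # left empty when unknown
        "drug_concept_id": drug_concept_id,
    }
```

In this sketch an NDC code yields two different IDs, while a code that is already standard yields the same ID in both fields.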
And if, say, your source value is not in an OMOP vocabulary but you are able to map it to a concept_id, what should the source_concept_id be set to?
Yes, we’ve looked at ‘manual’ mapping of the source to a target standard OMOP vocabulary where appropriate. Manual in the sense that we run the source and target through a mapping tool, and users can then manually check the validity of each map depending on how it was made. For example, a lexical match is unlikely to be worth checking manually, but if another algorithm suggested the map, the user may want to spend time checking it. We have internal mapping tools (not Usagi).
Can I ask for further clarification? This is what I understood:
For example, there is a diagnosis code list in my DB that does not correspond to any vocabulary in Athena.
I manually map each code, so the source_to_concept_map table holds the corresponding row with the related info:
source_code: my original code.
source_concept_id: 0, because there is no concept for the code (I have not created new IDs in the concept table; that extra step is optional and only needed if you want to work with the original codes).
source_vocabulary_id: the name of the diagnosis code list in my database; this vocabulary name does not exist in Athena.
source_code_description: the description of the code, if present.
target_concept_id: the manually selected concept ID.
target_vocabulary_id: the vocabulary ID of the chosen target concept.
valid_start_date: info relating to the selected target_concept_id.
valid_end_date: info relating to the selected target_concept_id.
invalid_reason: info relating to the selected target_concept_id.
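The field list above can be sketched as a small Python structure. This is only an illustration of the poster's description: the class name, the helper make_stcm_row, the local code "DX-001", the vocabulary name "MY_LOCAL_DX", the target concept ID and the dates are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SourceToConceptMapRow:
    source_code: str
    source_concept_id: int
    source_vocabulary_id: str
    source_code_description: Optional[str]
    target_concept_id: int
    target_vocabulary_id: str
    valid_start_date: date
    valid_end_date: date
    invalid_reason: Optional[str]


def make_stcm_row(source_code: str, description: Optional[str],
                  source_vocabulary_id: str, target: dict) -> SourceToConceptMapRow:
    """Build one mapping row for a local code and a hand-picked target concept."""
    return SourceToConceptMapRow(
        source_code=source_code,
        source_concept_id=0,  # no concept exists for the local code
        source_vocabulary_id=source_vocabulary_id,
        source_code_description=description,
        target_concept_id=target["concept_id"],
        target_vocabulary_id=target["vocabulary_id"],
        # Validity info is copied from the selected target concept.
        valid_start_date=target["valid_start_date"],
        valid_end_date=target["valid_end_date"],
        invalid_reason=target["invalid_reason"],
    )


# Hypothetical target concept; ID, vocabulary and dates are placeholders.
target = {
    "concept_id": 9000010,
    "vocabulary_id": "SNOMED",
    "valid_start_date": date(1970, 1, 1),
    "valid_end_date": date(2099, 12, 31),
    "invalid_reason": None,
}
row = make_stcm_row("DX-001", "Example local diagnosis", "MY_LOCAL_DX", target)
```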
Now I insert this concept into condition_occurrence (first I create the related visit_occurrence, then the related condition_occurrence), with the fields defined as follows:
condition_occurrence.condition_concept_id: target_concept_id
condition_occurrence.condition_source_value: source_code
condition_occurrence.condition_source_concept_id: null
condition_occurrence.condition_status_source_value: null
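The field assignment above could look like the following Python sketch. The function name condition_fields and the sample mapping row are hypothetical. The source_concept_id argument defaults to None to mirror the choice described in this post, and a custom concept ID could be passed instead.

```python
from typing import Optional


def condition_fields(source_code: str, source_vocabulary_id: str,
                     stcm_rows: list,
                     source_concept_id: Optional[int] = None) -> dict:
    """Fill the condition_occurrence concept fields from a mapping lookup."""
    target_concept_id = 0  # OMOP "No matching concept" when no mapping exists
    for row in stcm_rows:
        if (row["source_code"] == source_code
                and row["source_vocabulary_id"] == source_vocabulary_id):
            target_concept_id = row["target_concept_id"]
            break
    return {
        "condition_concept_id": target_concept_id,
        "condition_source_value": source_code,
        "condition_source_concept_id": source_concept_id,
        "condition_status_source_value": None,
    }


# Hypothetical mapping row; the code, vocabulary name and ID are placeholders.
stcm_rows = [{
    "source_code": "DX-001",
    "source_vocabulary_id": "MY_LOCAL_DX",
    "target_concept_id": 9000010,
}]
fields = condition_fields("DX-001", "MY_LOCAL_DX", stcm_rows)
```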
I think condition_source_concept_id has to be populated either with custom values above a billion, as you indicated, or with NULL, not zero.
Like you, I had previously understood that zero should be assigned, but then I ran the DQD checks and was failing tests such as this one:
The number and percent of records with a value of 0 in the source concept field CONDITION_SOURCE_CONCEPT_ID in the CONDITION_OCCURRENCE table. (Threshold=10%)
SELECT
'CONDITION_OCCURRENCE.CONDITION_SOURCE_CONCEPT_ID' AS violating_field, cdmTable.*
FROM cdm.CONDITION_OCCURRENCE cdmTable
WHERE cdmTable.CONDITION_SOURCE_CONCEPT_ID = 0
From this I deduced that setting condition_source_concept_id = 0 declares the source value non-mappable, which is incorrect here.
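To see how the check quoted above reacts to zeros versus NULLs, here is a minimal sketch using Python's built-in sqlite3 module. It is not the DQD implementation: the function name, the in-memory table (without the cdm schema prefix), the sample rows and the assumption that the check fails when the percentage exceeds the threshold are all illustrative.

```python
import sqlite3


def zero_source_concept_share(conn: sqlite3.Connection,
                              threshold_pct: float = 10.0):
    """Return (zero_count, percent_zero, fails) for condition_source_concept_id."""
    total, zeros = conn.execute(
        "SELECT COUNT(*), "
        "COALESCE(SUM(condition_source_concept_id = 0), 0) "
        "FROM condition_occurrence"
    ).fetchone()
    pct = 100.0 * zeros / total if total else 0.0
    # Assumption: the check fails when the share of zeros exceeds the threshold.
    return zeros, pct, pct > threshold_pct


# Toy table with only the column the check looks at.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE condition_occurrence ("
    "condition_occurrence_id INTEGER, condition_source_concept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?)",
    # Two zeros, one NULL, one custom ID above two billion.
    [(1, 0), (2, 0), (3, None), (4, 2000000001)],
)
result = zero_source_concept_share(conn)
```

In this sketch the NULL and the custom ID are not counted as violations; only the two rows set to 0 are.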
So my solution is one of the following:
- insert an identifier for the new concepts in the concept table and use it to populate condition_source_concept_id
- leave the field null
The dreaded DQD “failure”. If you leave a required field NULL (don’t do this), you will also receive a DQD failure for nulling a required field. Here’s the thing. Everyone’s source data are different. If your data are not coded in one of the OHDSI supported vocabularies, then the best you can do is accurately map your source values to standard concepts. Research is done using the standard concepts in the <table_name>_concept_id field. As long as these are populated with standard concepts, your CDM is in good shape. Check out my poster here for more information about customizing your Data Quality Dashboard to support your health system to produce reliable, real-world evidence.
This poster explains and compares the two methods for mapping non-OHDSI supported codes or source values to standard concept_ids.