[quote=“MPhilofsky, post:7, topic:9690”]
- MEASUREMENT.VALUE_AS_CONCEPT_ID
- MEASUREMENT.UNIT_CONCEPT_ID
- MEASUREMENT.OPERATOR_CONCEPT_ID
- OBSERVATION.VALUE_AS_CONCEPT_ID
- OBSERVATION.UNIT_CONCEPT_ID
concept_id = 0 identifies a source code/value/idea that doesn’t have a matching concept_id.
NULL generally means the data are absent from the source.[/quote]
While NULL may generally mean the data is missing from the source, it actually means UNKNOWN, that is, we don’t know whether or not there is data, let alone what value it might have. NULL cannot ever match anything. I’ve seen the “values of NULL” discussion, and I don’t buy it. You can never say anything positive about NULLs. Once you say something positive about a NULL, it’s no longer NULL. Therefore zero is appropriate because it speaks only to the negative: the field literally does not have a matching concept.
Conflicts
I agree that if you don’t have values for .concept_id or .source_concept_id, you don’t have useful data.
However, all of the “type_concept_id” fields are ARE Required even though the value may not be available in the source.
- Condition_occurrence.condition_type_concept_id
- Procedure_occurrence.procedure_type_concept_id
- Drug_exposure.drug_type_concept_id
- Device_exposure/device_type_concept_id
- Measurement.measurement_type_concept_id
- Note.note_type_concept_id
- Observation.observation_type_concept_id
- Specimen.specimen_type_concept_id
- Visit.visit_type_concept_id
It is perfectly possible to not have a type_concept_id in the Source system. I may not be able to determine whether a procedure is a primary or secondary type, but it is still useful data. But in this case, the field is not NULLable (ie. it is required) and so I have to substitute a zero, even when I don’t know what that value is.
Other concept_ids also ARE Required.
- Condition_occurrence.condition_status_concept_id
- Drug_exposure.route_concept_id
- Procedure_occurrence.modifier_concept_id
- Specimen.anatomic_site_concept_id
- Specimen.disease_status_concept_id
- Observation.obs_event_field_concept_id
- Note.note_class_concept_id
- Note.encoding_concept_id
- Note.language_concept_id
Again, these may not have values stored in the source, yet they are Required. In particular a procedure may not have a modifier at all, so why is the concept required? You may not have information about the anatomic site or disease status of a specimen. These all may have NULLs in the source, but are still required to map to zero.
Then there are the five above which are NOT Required
- Measurement.operator_concept_id
- Measurement.value_as_concept_id
- Measurement.unit_concept_id
- Observation.value_as_concept_id
- Observation.unit_concept_id
If the reason these five are NULLable is because some don’t have these attributes, that’s actually the opposite of NULL. BMI, for instance, does not have a unit.NULL means ‘unknown’, and we do know whether BMI has a unit. It does not. So putting a zero in the concept_id makes more sense than leaving it NULL. It does not have a matching concept and we know it does not. NULL implies we don’t know.
However, because the unit_concept_id field is NULLable, I can send a measurement value of 100 (for some other measure) without a unit. But is that milliliters or deciliters? From the perspective of someone trying to make sense of the data (ie. a researcher), that particular record is unusable whether there is a zero or a NULL.But if there can be both zeroes and NULLs in that field that means to them the same thing, then it takes extra evaluation.
But then there are some others which are also NOT Required
- Observation.qualifier_concept_id
- Specimen.unit_concept_id
if modifier_concept_id is required, why not qualifier_concept_id? Or vice versa.
So the decision to make some fields NULLable and some not seems to be exactly backwards to me.
I would argue that ALL concept_id fields should be Required. If NULLs are important, I also suggest that every Concept_id field be accompanied by a Source_Value field (as many of them do already), so if a researcher wants to know the NULL state, he or she can find it in the source value.