Deduplicating the measurement table (is provider_id clinically relevant?)

Hey,

I am currently trying to deduplicate our measurement table and want to make sure that I am not removing any clinically relevant information.

There are cases where two providers order the same measurement within a short timespan. When this happens, only one measurement is taken, but we end up with two rows for that measurement occurrence. The provider_id and measurement_order_datetime differ, but all the variables describing the actual measurement are the same (e.g. person_id, visit_occurrence, measurement_datetime, value_as_number).

Is the provider_id clinically relevant? Or am I able to just bring in one of these rows?

@Hannah_Morgan-Cooper:

Welcome to the family.

Yeah, this happens because we are overloading the Measurement record. We use it to record both the fact that a test was ordered and its result. Logically, you cannot dedup in this situation without losing one of the orders. However, you could make it “worse” by triplicating it:

  • Write two records with the providers but no results (i.e. the order records)
  • Write one record without providers but with the results

Want to do that? The use cases for these records are different. You could even put in the correct order and result dates, if they are not the same day.
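
For illustration, here is a minimal Python sketch of that split. It is not an official OHDSI routine, just one way it could look under some assumptions: rows are plain dicts, the columns in RESULT_KEYS are the ones that describe the actual measurement in your data, measurement_order_datetime is the source column from the question, and record_kind is a hypothetical marker added only to tell the two kinds of record apart.

```python
from collections import defaultdict

# Columns assumed to describe the measurement itself; adjust to your ETL.
RESULT_KEYS = ("person_id", "visit_occurrence_id", "measurement_concept_id",
               "measurement_datetime", "value_as_number")

def split_orders_and_results(rows):
    """Turn duplicated measurement rows into order records plus one result record."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in RESULT_KEYS)].append(row)

    out = []
    for key, members in groups.items():
        base = dict(zip(RESULT_KEYS, key))
        seen = set()
        for row in members:
            order = (row["provider_id"], row["measurement_order_datetime"])
            if order in seen:
                continue  # identical order already written
            seen.add(order)
            # Order record: keeps the provider, uses the order time, carries no result.
            out.append({**base,
                        "provider_id": row["provider_id"],
                        "measurement_datetime": row["measurement_order_datetime"],
                        "value_as_number": None,
                        "record_kind": "order"})  # hypothetical marker
        # Result record: keeps the value and result time, carries no provider.
        out.append({**base, "provider_id": None, "record_kind": "result"})
    return out

# Two providers ordered the same test; one measurement was taken (made-up values).
rows = [
    {"person_id": 1, "visit_occurrence_id": 10, "measurement_concept_id": 123,
     "measurement_datetime": "2023-01-05T09:30", "value_as_number": 7.2,
     "provider_id": 501, "measurement_order_datetime": "2023-01-05T08:00"},
    {"person_id": 1, "visit_occurrence_id": 10, "measurement_concept_id": 123,
     "measurement_datetime": "2023-01-05T09:30", "value_as_number": 7.2,
     "provider_id": 502, "measurement_order_datetime": "2023-01-05T08:10"},
]
records = split_orders_and_results(rows)
```

With the two example rows, records holds two order records (one per provider, no value) and one result record (value 7.2, no provider). In a real measurement table you would still need some agreed way to distinguish orders from results, which this sketch leaves open.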
