The person.person_source_value column makes the person table the only table that supports a kind of row-level provenance tracking. From the documentation:
"An (encrypted) key derived from the person identifier in the source data. This is necessary when a use case requires a link back to the person data at the source dataset."
I would like row-level provenance tracking to be available on all the standardized clinical tables. My idea is the following:
- Add cdm_source_id to cdm_source
- Add cdm_source_id to every standardized clinical table
- Add X_row_source_value to every standardized clinical table
This would let the CDM support row-level provenance tracking when it spans multiple source systems. It would also help anybody doing incremental loads of a CDM: the combination of X_row_source_value and cdm_source_id could serve as a stable identifier, so one would not have to resort to truncate and load.
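
To make the incremental-load idea concrete, here is a rough sketch using Python's built-in sqlite3 module. It is only an illustration under several assumptions: the measurement table is cut down to a few columns, X is taken to be the table name (so the new column is measurement_row_source_value), the UNIQUE constraint on the pair is my own choice, and upsert_measurement is a hypothetical helper, not part of any OHDSI tooling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cdm_source (
    cdm_source_id   INTEGER PRIMARY KEY,  -- proposed new column
    cdm_source_name TEXT NOT NULL
);
-- Heavily simplified stand-in for a standardized clinical table.
CREATE TABLE measurement (
    measurement_id               INTEGER PRIMARY KEY,
    person_id                    INTEGER NOT NULL,
    value_as_number              REAL,
    cdm_source_id                INTEGER NOT NULL
        REFERENCES cdm_source (cdm_source_id),   -- proposed new column
    measurement_row_source_value TEXT NOT NULL,  -- proposed new column
    -- Assumption: the pair is unique, so it can act as a stable key.
    UNIQUE (cdm_source_id, measurement_row_source_value)
);
""")


def upsert_measurement(conn, cdm_source_id, row_source_value, person_id, value):
    """Hypothetical helper: update the row if the source key exists, else insert."""
    cur = conn.execute(
        "UPDATE measurement SET person_id = ?, value_as_number = ? "
        "WHERE cdm_source_id = ? AND measurement_row_source_value = ?",
        (person_id, value, cdm_source_id, row_source_value),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO measurement (person_id, value_as_number, "
            "cdm_source_id, measurement_row_source_value) VALUES (?, ?, ?, ?)",
            (person_id, value, cdm_source_id, row_source_value),
        )


conn.execute("INSERT INTO cdm_source VALUES (1, 'Hospital A EHR')")
conn.execute("INSERT INTO cdm_source VALUES (2, 'Hospital B EHR')")

# Initial load: both sources happen to use the same local row key.
upsert_measurement(conn, 1, "lab-42", person_id=100, value=5.1)
upsert_measurement(conn, 2, "lab-42", person_id=200, value=7.3)

# Incremental load: source 1 sends a corrected value for the same row.
upsert_measurement(conn, 1, "lab-42", person_id=100, value=5.4)

rows = conn.execute(
    "SELECT cdm_source_id, measurement_row_source_value, value_as_number "
    "FROM measurement ORDER BY cdm_source_id"
).fetchall()
print(rows)
```

The corrected row is updated in place and keeps its measurement_id, and the identical row key from the second source does not collide because cdm_source_id is part of the key.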