
How to Record that ETL applied date shifting to obscure personal information

I entered this as a Themis issue, but I'm moving it to the forum for discussion.

Some ETLs apply date shifting to conform to PHI rules. How should this be documented? This was brought up at the May 2019 Themis face-to-face.

@croeder
Moved your comment from Themis issue to the forum.

I have a similar issue when importing data from an anonymized trial. I only have age at randomization, and can only make a poor guess of the year of randomization.

Hm. What is the problem here? Why does it need to be documented? The analytics work just the same. The patient is kept anonymous. It’s an ETL action that happens before the “birth” of the database.

Thanks, I wondered if I shouldn’t look here.

At the level of the year of care, someone might want to identify what the standard of care was at that time, if there is some detail that escaped the CRFs. I’ve been planning on using an arbitrary date as the randomization date and adding the day offsets of visits to it, so I can store the relative time as dates.
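To make that concrete, here is a rough sketch of what I mean. The anchor date, the function names, and the sample offsets are all made up for illustration; nothing here is an OMOP or Themis convention, and the year of birth derived this way is only as good as the guess behind the anchor.

```python
from datetime import date, timedelta

# Hypothetical anchor: an arbitrary stand-in for the unknown randomization date.
ANCHOR_RANDOMIZATION_DATE = date(2000, 1, 1)


def visit_date(day_offset: int, anchor: date = ANCHOR_RANDOMIZATION_DATE) -> date:
    """Convert a study-day offset (0 = randomization) into a surrogate calendar date."""
    return anchor + timedelta(days=day_offset)


def approx_year_of_birth(age_at_randomization: int,
                         anchor: date = ANCHOR_RANDOMIZATION_DATE) -> int:
    """Derive a surrogate year of birth from age at randomization and the anchor year."""
    return anchor.year - age_at_randomization


# Illustrative offsets only: days since randomization for three visits.
visit_offsets = [0, 14, 90]
visit_dates = [visit_date(d) for d in visit_offsets]
```

The intervals between visits stay exact, which is what most analyses need; only the absolute calendar position is invented.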

Wait wait wait. Year? I thought the shuffling was a couple of weeks up and down. Not years. Are people doing date shifting by years now?

For my own work, it’s possible. A clinical trial (CT) may last many years, and if the data is anonymized, I don’t have the date of randomization, only the date the trial started. I’m not anonymizing EHR data. I’m representing anonymized CT data.
