Non-Human Subjects Research OMOP IRB Protocol

callahantiff · May 7, 2015, 4:26pm

Good Morning-

We recently received non-human subjects research approval for creating a de-identified OMOP data repository at the University of Colorado Denver | Anschutz Medical Campus. This achievement would not have been possible without inspiration from the existing Stanford and Columbia IRB protocols.

We are excited that we were able to do this and want to make our protocol available to others who are hoping to do the same. You can access the protocol on the OHDSI Wiki’s Documentation page.

If you have any questions or want to discuss this further, please don’t hesitate to contact me.

MauraBeaton · May 7, 2015, 4:53pm

Thank you Tiffany!

For anyone else who would like to upload their protocols to the wiki but isn’t sure how, feel free to reach out to me (beaton@ohdsi.org) and I can walk you through it.

Vojtech_Huser · May 7, 2015, 6:18pm

Thank you for sharing the IRB for the other sites!
For those who want a quick review of the most important parts, I have pasted the main points below (e.g., dates are re-shifted during each refresh):

We will utilize the following techniques to minimize risk and protect patient confidentiality:

Identifier Generation: Anonymized, masked patient identifiers will have already been assigned to each record (i.e., MRNs will not be visible) as part of the initial transfer of data into the CHCO OMOP CDM limited data repository. When data is transferred into the OMOP CDM de-identified data repository, the masked identifiers will be re-randomized and a completely new set of identifiers will be created. Once this transformation and re-randomization is complete, the mapping tables will be destroyed.
Date Shifting: Obfuscation will be used to de-identify dates each time data is transferred from the CHCO OMOP CDM limited data repository to the OMOP CDM de-identified data repository. This obfuscation process involves adding a random number of days to all dates associated with a patient; each patient’s dates are shifted by a different random number of days. A uniform random distribution will be used to generate a random number of days to shift dates, between -60 and +60 days (a 120-day interval). Once the mapping information used to shift the dates is destroyed, the process cannot be reversed. Since date obfuscation may affect aggregate numbers (i.e., it can cause patients to shift into or out of time intervals), analyses and interpretation of findings will be known to be approximate, which adds an additional layer of patient privacy protection.
Mapping Procedures: The mappings used to re-randomize patient identifiers and shift patient dates each time data is refreshed and transferred from the CHCO OMOP CDM limited data repository to the OMOP CDM de-identified data repository will be destroyed after the transformations are complete. Thus there will be no viable link that could be used to identify patients included in the OMOP CDM de-identified repository. Further, since the mappings used to transform the data will be destroyed each time new data is incorporated into the repository, each transform will result in a new set of shifted dates and re-randomized identifiers unique to that particular data transfer.
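
As a rough illustration of the three techniques above, here is a minimal Python sketch. It is not the code used at Colorado; the function name `deidentify`, the record fields `patient_id` and `event_date`, and the use of random hex strings as new identifiers are all assumptions made for the example. It draws one uniform offset in [-60, +60] days per patient, assigns each patient a fresh random identifier, and keeps both mappings only in local variables so they are discarded when the function returns.

```python
import random
import uuid
from datetime import date, timedelta


def deidentify(records, max_shift_days=60, rng=None):
    """Return new records with re-randomized ids and per-patient shifted dates.

    `records` is a list of dicts with hypothetical keys 'patient_id'
    (the masked id from the limited repository) and 'event_date'.
    """
    if rng is None:
        # OS-backed randomness by default; pass a seeded Random only for testing.
        rng = random.SystemRandom()

    id_map = {}     # masked id -> new random id (temporary)
    shift_map = {}  # masked id -> day offset (temporary)
    out = []
    for rec in records:
        pid = rec["patient_id"]
        if pid not in id_map:
            # One new identifier and one offset per patient, per refresh.
            id_map[pid] = uuid.UUID(int=rng.getrandbits(128)).hex
            shift_map[pid] = rng.randint(-max_shift_days, max_shift_days)
        out.append({
            "patient_id": id_map[pid],
            "event_date": rec["event_date"] + timedelta(days=shift_map[pid]),
        })

    # id_map and shift_map are never written anywhere; they are dropped here,
    # so a later refresh cannot reproduce or reverse this transform.
    return out


if __name__ == "__main__":
    sample = [
        {"patient_id": "A", "event_date": date(2015, 1, 1)},
        {"patient_id": "A", "event_date": date(2015, 1, 10)},
        {"patient_id": "B", "event_date": date(2015, 3, 5)},
    ]
    for row in deidentify(sample):
        print(row)
```

Because every date for a given patient moves by the same offset, intervals within a patient (e.g., days between two visits) are preserved, while calendar-based aggregates across patients become approximate, as the protocol notes.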

In summary, our approach applies multiple layers of protection: a de-identified OMOP CDM data repository within the CHCO secure computing environment, from which only de-identified portions of data are exported. Moreover, all identifiers in structured data are masked at all times (i.e., patients are given randomly assigned identifiers and only time-shifted dates are used). The combination of these measures ensures several layers of protection for patient data.

jon_duke · May 7, 2015, 8:58pm

This is great! Thank you @callahantiff!

To avoid the IRB protocols getting lost amidst the software documentation, I have moved them to a dedicated page in the Research section of the wiki and have added them to the research navigation bar.

Jon