
Community Meeting 18Aug2015

Good morning all,

We’ve got a great line-up for Tuesday’s community call:

  • Introductions & Updates
  • Discussion - @Mary_Regina_Boland will give us an overview of her work on the birth-month study: Investigating Birth Month-Disease Risk Relationships Across the OHDSI Network
  • For more information about the study, check out the wiki page
  • You can also check out the code on GitHub and join in the forum discussion

All the details you’ll need to join the call can be found here: http://www.ohdsi.org/web/wiki/doku.php?id=projects:ohdsi_community

If there are any other topics of discussion you’d like to raise on Tuesday’s call, please let us know in this thread.

Cheers,
Maura

On the call there was a discussion about de-identification.
I mentioned a population threshold of 20,000 for ZIP codes as
“some convention for the value of k in k-anonymity”.

Here is an extract from an AMIA conference paper:

This masking phenomenon is well described by the concept of k-anonymity 
(each dataset record is indistinguishable from at least k-1 other 
records given a group of identifying attributes) and the concept of 
l-diversity (each group of identifying attributes is immune to 
probabilistic inference attack and has at least l well represented 
values).  There is no clear established boundary; however, the HIPAA ZIP code 
rules offer one potential precedent: the HIPAA ZIP code rule (45 CFR 
164.514) permits revealing 3-digit ZIP codes as long as the 3-digit ZIP 
code covers an area populated by more than 20,000 people, as this is 
considered to be sufficient “masking” of the individual. The masking 
principle is important in redacting or preserving a sentence, such as, 
“Patient has a 9-year-old daughter” in a document that otherwise 
contains unmodified dates and locations, but does not contain the 
primary patient name.
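
To make the two definitions in the extract concrete, here is a minimal sketch of how one might measure k and l for a small table. It is only an illustration: the function names, the column names (`zip3`, `birth_year`, `dx`) and the toy records are all made up, and "well represented" is read in its simplest form, as the number of distinct sensitive values per group.

```python
from collections import Counter, defaultdict


def k_anonymity(records, quasi_identifiers):
    """Return the largest k for which the records are k-anonymous.

    That is the size of the smallest group of records sharing the same
    values for all quasi-identifying attributes.
    """
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values in any group.

    Assumption: "well represented" is simplified to "distinct".
    """
    values = defaultdict(set)
    for r in records:
        values[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return min(len(v) for v in values.values()) if values else 0


# Hypothetical toy data, not taken from any real dataset.
records = [
    {"zip3": "100", "birth_year": 1980, "dx": "asthma"},
    {"zip3": "100", "birth_year": 1980, "dx": "diabetes"},
    {"zip3": "100", "birth_year": 1980, "dx": "asthma"},
    {"zip3": "940", "birth_year": 1975, "dx": "flu"},
    {"zip3": "940", "birth_year": 1975, "dx": "flu"},
]

k = k_anonymity(records, ["zip3", "birth_year"])
l = distinct_l_diversity(records, ["zip3", "birth_year"], "dx")
print(k, l)
```

In this toy table the smallest group has two records, so the table is 2-anonymous, but that same group carries a single diagnosis value, which is why k alone does not protect against inference.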

This website: http://www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-identification/guidance.html#zip
contains the ZIP code rule in detail:

Covered entities may include the first three digits of the ZIP code if,
according to the current publicly available data from the Bureau of the
Census: (1) The geographic unit formed by combining all ZIP codes with
the same three initial digits contains more than 20,000 people;
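
As a rough sketch of how clause (1) could be applied in code: sum the population over all ZIP codes that share the same first three digits, and keep the 3-digit prefix only when that total is more than 20,000. The population figures and function names below are hypothetical; real figures would come from Census data. Replacing small units with "000" follows the second clause of the same rule, which is not quoted above.

```python
from collections import defaultdict

THRESHOLD = 20_000  # the rule says "more than 20,000 people"


def zip3_populations(zip_populations):
    """Sum population over all ZIP codes sharing the same first three digits."""
    totals = defaultdict(int)
    for zip_code, population in zip_populations.items():
        totals[zip_code[:3]] += population
    return dict(totals)


def mask_zip(zip_code, zip3_totals, threshold=THRESHOLD):
    """Return the 3-digit prefix if its combined unit is large enough, else "000"."""
    prefix = zip_code[:3]
    if zip3_totals.get(prefix, 0) > threshold:
        return prefix
    return "000"


# Hypothetical populations per 5-digit ZIP code (illustration only).
populations = {
    "10001": 15_000,
    "10002": 9_000,
    "05901": 1_200,
    "05902": 800,
}

totals = zip3_populations(populations)
print(mask_zip("10001", totals))  # prefix "100" covers 24,000 people
print(mask_zip("05901", totals))  # prefix "059" covers only 2,000 people
```

Note that the comparison is strict, so a unit of exactly 20,000 people would still be masked.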
