Part 2: Basics

Marianne Winslett (1,4), Xiaokui Xiao (2), Gerome Miklau (3), Yin Yang (4), Zhenjie Zhang (4)
1 University of Illinois at Urbana Champaign, USA
2 Nanyang Technological University, Singapore
3 University of Massachusetts Amherst, USA
4 Advanced Digital Sciences Center, Singapore

Agenda
Basics: Xiaokui
Query Processing (Part 1): Gerome
Query Processing (Part 2): Yin
Data Mining and Machine Learning: Zhenjie

Formulation of Privacy
What information can be published?
  Average height of US people
  Height of an individual
Intuition: If something is insensitive to the change of any individual tuple, then it should not be considered private.
Example: Assume that we arbitrarily change the height of an individual in the US.
  The average height of US people would remain roughly the same,
  i.e., the average height reveals little information about the exact height of any particular individual.
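
The intuition above can be checked numerically. The following is a minimal sketch in Python, assuming a synthetic population of 100,000 heights; the data, the helper name `average`, and the 250 cm replacement value are all illustrative and do not come from the slides. It changes one individual's height arbitrarily and measures how far the average moves.

```python
import random


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Synthetic heights in cm (an assumption for illustration, not real data).
rng = random.Random(0)
heights = [rng.gauss(170.0, 10.0) for _ in range(100_000)]

before = average(heights)

# Arbitrarily change the height of one individual.
changed = list(heights)
changed[0] = 250.0
after = average(changed)

# The average moves by exactly |new - old| / n.
shift = abs(after - before)
bound = abs(250.0 - heights[0]) / len(heights)

print(f"average before: {before:.4f} cm")
print(f"average after:  {after:.4f} cm")
print(f"shift:          {shift:.6f} cm")
```

With n individuals, changing one height by d moves the average by d / n, so the effect of any single individual shrinks as the population grows.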

[Figures: plots over "# of diabetes patients", annotated "ratio", "100%", and "ratio bounded"]

Laplace Distribution

Differential Privacy via Laplace Noise
[Figures: plots over "# of diabetes patients", annotated "ratio bounded"]

Sensitivity of Queries
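
As a rough illustration of the slide titles above, the sketch below adds Laplace noise to a count such as the number of diabetes patients. It assumes the standard calibration, noise scale = sensitivity / epsilon, with sensitivity 1 for a count query; the function names, the true count of 100, and epsilon = 0.5 are hypothetical choices, not taken from the slides.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return the count plus Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(42)
true_count = 100   # hypothetical number of diabetes patients
epsilon = 0.5      # hypothetical privacy budget

# Draw many noisy answers to inspect the noise empirically.
samples = [noisy_count(true_count, epsilon, rng) for _ in range(20_000)]
mean = sum(samples) / len(samples)
mean_abs_dev = sum(abs(s - true_count) for s in samples) / len(samples)

print(f"one noisy answer:        {samples[0]:.2f}")
print(f"mean of noisy answers:   {mean:.2f}")
print(f"mean absolute deviation: {mean_abs_dev:.2f}")
```

A smaller epsilon gives a larger noise scale, so individual answers are less accurate but reveal less about any single tuple.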

Geometric Mechanism
[Figure: panels labelled "Laplace" and "Geometric"]

Exponential Mechanism

Composition of Differential Privacy

Variants of Differential Privacy

Limitations of Differential Privacy
Differential privacy tends to be less effective when there exist correlations among the tuples.
Example (from [Kifer and Machanavajjhala 2011]):
  Bob's family includes 10 people, and all of them are in a database.
  There is a highly contagious disease, such that if one family member contracts the disease, then the whole family will be infected.
  Differential privacy would underestimate the risk of disclosure.
Summary: The amount of noise needed depends on the correlations among the tuples, which differential privacy does not capture.
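
One way to read the example is through the likelihood ratio an observer faces. The sketch below assumes a count query (the number of infected people) answered with Laplace noise of scale 1/epsilon, that is, noise calibrated to a single-tuple change. This setup, the value epsilon = 0.5, and the function names are assumptions for illustration, not details from the slides or from the cited paper. It compares the worst-case ratio when the true count differs by 1 (one independent tuple) with the ratio when it differs by 10 (Bob's whole family).

```python
import math


def laplace_pdf(x, mu, scale):
    """Density of the Laplace distribution centred at mu with the given scale."""
    return math.exp(-abs(x - mu) / scale) / (2.0 * scale)


def worst_case_ratio(shift, epsilon):
    """Largest density ratio between two true counts that differ by `shift`.

    Noise is Laplace with scale 1/epsilon (sensitivity assumed to be 1).
    The ratio is largest for outputs at or beyond the larger count, so it
    is evaluated at x = shift.
    """
    scale = 1.0 / epsilon
    x = float(shift)
    return laplace_pdf(x, float(shift), scale) / laplace_pdf(x, 0.0, scale)


epsilon = 0.5  # hypothetical privacy budget

single = worst_case_ratio(1, epsilon)    # one tuple changes
family = worst_case_ratio(10, epsilon)   # all 10 family members change together

print(f"ratio for one tuple:     {single:.3f}")
print(f"ratio for family of ten: {family:.3f}")
```

Under these assumptions the ratio for a single tuple is e^epsilon, while for the correlated family it grows to e^(10 * epsilon), so noise calibrated to one tuple hides much less about whether Bob is infected.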