Foundations of Privacy, Informal Lecture: Impossibility of Disclosure Prevention, or The Case for Differential Privacy
Lecturer: Moni Naor. Joint work with Cynthia Dwork.

Dec 18, 2015

Transcript
Page 1:

Lecturer: Moni Naor
Joint work with Cynthia Dwork

Foundations of Privacy, Informal Lecture

Impossibility of Disclosure Prevention
or
The Case for Differential Privacy

Page 2: Let’s Talk About Sex

Better Privacy Means Better Data

Page 3: Private Data Analysis

• Simple Counts and Correlations

– Was there a significant rise in asthma emergency room cases this month?
– What is the correlation between new HIV infections and crystal meth usage?

• Holistic Statistics
– Are the data inherently low-dimensional?

• Collaborative filtering for movie recommendations

• Beyond Statistics: Private Data Analysis
– How far is the (proprietary) network from bipartite?

…while preserving privacy of individuals

Page 4: Different from SFE (Secure Function Evaluation)

• Participants collaboratively compute a function f of their private inputs
– E.g., f = sum(a, b, c, …)
– Each player learns only what can be deduced from the output of f and her own input

• Miracle of Modern Science!
• But SFE does not imply privacy!
– Privacy is ensured only “modulo f”
• If the output of f together with a yields b, so be it. (A toy example follows below.)
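A two-line numeric illustration of this “privacy modulo f” point (my example, not from the slides):

```python
# Toy example (not from the lecture): SFE for f = sum hides nothing
# beyond what the output itself reveals.
a, b = 7, 35           # two private inputs
f_out = a + b          # the jointly computed, published output
assert f_out - a == b  # the holder of a recovers b exactly: privacy "modulo f"
```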

Page 5: Cryptographic Rigor Applied to Privacy

• Define a Break of the System
– What is a “win” for the adversary?
– May settle for partial information

• Specify the Power of the Adversary
– Computational power? “Auxiliary” information?

• Conservative/Paranoid by Nature
– All breaks are forever
– Protect against all feasible attacks

Page 6: Dalenius, 1977

• Anything that can be learned about a respondent from the statistical database can be learned without access to the database
– Captures the possibility that “I” may be an extrovert
– The database doesn’t leak personal information
– The adversary is a user

• Analogous to Semantic Security for Crypto [Goldwasser-Micali 1982]
– Anything that can be learned from the ciphertext can be learned without the ciphertext
– The adversary is an eavesdropper

Page 7: Outline

• The Framework
• A General Impossibility Result
– Dalenius’ goal cannot be achieved
• The Proof
– Simplified
– General case

Page 8: Two Models

[Diagram: Database → San → Sanitized Database]

Non-Interactive: Data are sanitized and released

Page 9: Two Models (cont.)

[Diagram: the analyst sends queries “?” to San, which has access to the Database, and receives answers]

Interactive: Multiple Queries, Adaptively Chosen

Page 10: Auxiliary Information

Common theme in many privacy horror stories:
• Not taking into account side information
– Netflix challenge: not taking into account IMDb [Narayanan-Shmatikov]

Page 11: Not Learning from the DB

With access to the database vs. without access to the database:

[Diagram: adversary A, holding auxiliary information, interacts with San(DB); simulator A′, holding the same auxiliary information, runs without access to DB.]

There is some utility of the DB that a legitimate user should be able to learn.
• Possible breach of privacy
• Goal: users learn the utility without the breach

Page 12: Not Learning from the DB (cont.)

[Diagram as on Page 11.]

Want: anything that can be learned about an individual from the statistical database can be learned without access to the database.

• ∀ D, ∀ A, ∃ A′ such that, with high probability over DB ∈_R D and for all auxiliary information z:
|Pr[A(z) ↔ DB wins] − Pr[A′(z) wins]| is small

Page 13: Illustrative Example for Impossibility

Want: anything that can be learned about a respondent from the statistical database can be learned without access to the database.

• More formally: ∀ D, ∀ A, ∃ A′ such that, with high probability over DB ∈_R D and for all auxiliary information z:
|Pr[A(z) ↔ DB wins] − Pr[A′(z) wins]| is small

Example:
• Aux z = “Kobi Oz is 10 cm shorter than the average in DB”
– A learns the average height in DB, hence also Kobi’s height
– A′ does not

• Impossibility Requires Utility
– The mechanism must convey information about DB
• not predictable by someone without access
– The “hint generator” and A share a secret, unknown to A′

Page 14: Defining “Win”: The Compromise Function

Notion of privacy compromise:

[Diagram: the adversary Adv, given DB drawn from D, outputs a string y; the compromise decider C outputs 0/1, indicating whether y is a privacy breach.]

A privacy compromise should be non-trivial:
• It should not be possible to find a privacy breach from auxiliary information alone.

A privacy breach should exist:
• Given DB, there should be a y that is a privacy breach.
• It should be possible to find y.

Page 15: Additional Basic Concepts

• Distribution on (Finite) Databases D
– Something about the database must be unknown
– Captures knowledge about the domain
• E.g., rows of the database correspond to owners of 2 pets

• Privacy Mechanism San(D, DB)
– Can be interactive or non-interactive
– May have access to the distribution D

• Auxiliary Information Generator AuxGen(D, DB)
– Has access to the distribution and to DB
– Formalizes partial knowledge about DB

• Utility Vector w
– Answers to k questions about the DB
– (Most of) the utility vector can be learned by the user
– Utility: must inherit sufficient min-entropy from the source D

Page 16: Impossibility Theorem

Fix any useful* privacy mechanism San and any reasonable privacy compromise decider C. Then there is an auxiliary information generator AuxGen and an adversary A such that for “all” distributions D and all adversary simulators A′:

Pr[A(D, San(D, DB), AuxGen(D, DB)) wins] − Pr[A′(D, AuxGen(D, DB)) wins] ≥ Δ

for a suitable, large Δ. The probability spaces are over the choice of DB ∈_R D and the coin flips of San, AuxGen, A, and A′.

* “Useful”: tells us information we did not know. To completely specify “useful,” we need an assumption on the entropy of the utility vector.

Page 17: Strategy

• The auxiliary information generator will provide a hint that, together with the utility vector w, yields the privacy breach.
• Want AuxGen to work without knowing D, just DB
– Find a privacy breach y and encode it in z
– Make sure z alone does not give y; only together with w does it
• Complication: is the utility vector w
– completely learned by the user?
– or just an approximation?

Page 18: Entropy of Random Sources

• Source:
– a probability distribution X on {0,1}^n
– contains some “randomness”

• Measures of “randomness”:
– Shannon entropy: H(X) = − ∑_x P_X(x) log P_X(x)
Represents how much we can compress X on average.
But even a high-entropy source may have a point with probability 0.9.
– Min-entropy: H_min(X) = − log max_x P_X(x)
Determined by the probability of the most likely value of X.
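A small runnable sketch of the contrast (my numbers, chosen to echo the 0.9 point above): a source with one heavy point can have respectable Shannon entropy while its min-entropy is tiny.

```python
# Sketch: Shannon entropy vs. min-entropy for a source with a heavy point.
# Illustrative distribution (not from the slides): one point with
# probability 0.9, the remaining 0.1 spread evenly over 2^20 points.
from math import log2

heavy, n_rest = 0.9, 2 ** 20
p_rest = (1 - heavy) / n_rest

shannon = -(heavy * log2(heavy) + n_rest * p_rest * log2(p_rest))
min_entropy = -log2(heavy)  # set by the single most likely value

print(f"H(X)    = {shannon:.2f} bits")      # ~2.47 bits
print(f"Hmin(X) = {min_entropy:.2f} bits")  # ~0.15 bits
```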

Page 19: Min-entropy

• Definition: X is a k-source if H_min(X) ≥ k,
i.e., Pr[X = x] ≤ 2^−k for all x.

• Examples:
– Bit-fixing: some k coordinates of X are uniform, the rest fixed
• or even depending arbitrarily on the others
– Unpredictable source: ∀ i ∈ [n] and b_1, …, b_{i−1} ∈ {0,1}:
k/n ≤ Pr[X_i = 1 | X_1, …, X_{i−1} = b_1, …, b_{i−1}] ≤ 1 − k/n
– Flat k-source: uniform over some S ⊆ {0,1}^n with |S| = 2^k

• Fact: every k-source is a convex combination of flat k-sources.

Page 20: Extractors

Universal procedure for “purifying” an imperfect source.

Definition: a function Ext: {0,1}^n × {0,1}^d → {0,1}^m is a (k, ε)-extractor if for all k-sources X, Ext(X, U_d) is ε-close to U_m.

[Figure: a k-source x of length n (one of 2^k strings) and a d-bit random seed s enter Ext, which outputs m almost-uniform bits.]

Page 21: Strong Extractors

Output looks random even after seeing the seed.

Definition: Ext is a (k, ε) strong extractor if Ext′(x, s) = s ◦ Ext(x, s) is a (k, ε)-extractor,
• i.e., for all k-sources X and for a 1 − ε′ fraction of the seeds s ∈ {0,1}^d, Ext(X, s) is ε-close to U_m.

Page 22: Extractors from Hash Functions

• Leftover Hash Lemma [ILL89]: universal (pairwise independent) hash functions yield strong extractors
– output length: m = k − O(1)
– seed length: d = O(n)
– Example: Ext(x, (a, b)) = first m bits of a·x + b in GF[2^n]

• Almost pairwise independence [SZ94, GW94]:
– seed length: d = O(log n + k)
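A minimal runnable sketch of this hash-based extractor. The parameters are my own, for illustration: n = 8, with the degree-8 irreducible polynomial x^8 + x^4 + x^3 + x + 1; any irreducible polynomial of degree n would do.

```python
# Sketch of Ext(x, (a, b)) = first m bits of a*x + b in GF(2^n), per the
# Leftover Hash Lemma recipe above. Illustrative parameters: n = 8.
import secrets

N = 8
IRRED = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a: int, b: int) -> int:
    """Multiply a and b as polynomials over GF(2), reduced mod IRRED."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a >> N:       # degree reached n: reduce
            a ^= IRRED
    return res

def ext(x: int, seed: tuple[int, int], m: int) -> int:
    """Ext(x, (a, b)) = first m bits of a*x + b in GF(2^n)."""
    a, b = seed
    y = gf_mul(a, x) ^ b   # addition in GF(2^n) is XOR
    return y >> (N - m)    # keep the m high-order ("first") bits

# Usage: extract m = 4 nearly uniform bits from an imperfect 8-bit source.
seed = (secrets.randbits(N), secrets.randbits(N))
print(ext(0b10110011, seed, m=4))
```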

Page 23: Suppose w Is Learned Completely

AuxGen and A share a secret: w.

AuxGen(DB):
• Find a privacy breach y of DB
• Find w from DB
– simulate A
• Choose s ∈_R {0,1}^d and compute Ext(w, s)
• Set z = (s, Ext(w, s) ⊕ y) (a toy sketch follows after the diagram)

[Diagram: San(DB) sends w to A; AuxGen(DB) sends z to A; A’s output goes to the compromise decider C, which outputs 0/1.]
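A toy end-to-end sketch of the hint (my stand-in values; it reuses ext and secrets from the sketch on Page 22). The point: z is a one-time pad of the breach y under the extracted bits, so only someone who knows w can strip the pad.

```python
# Sketch: AuxGen pads the breach y with Ext(w, s); A, who learns w from
# San, unmasks it. Requires H_inf(W | y) >= |y| (see the next page).
def auxgen(w: int, y: int, m: int) -> tuple[tuple[int, int], int]:
    seed = (secrets.randbits(N), secrets.randbits(N))
    return seed, ext(w, seed, m) ^ y        # z = (s, Ext(w, s) XOR y)

def adversary(w: int, z: tuple, m: int) -> int:
    seed, c = z
    return ext(w, seed, m) ^ c              # y = Ext(w, s) XOR c

w, y = 0b10110011, 0b1010                   # utility vector, 4-bit breach
z = auxgen(w, y, m=4)
assert adversary(w, z, m=4) == y            # A wins; A' sees near-uniform bits
```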

Page 24: Suppose w Is Learned Completely (cont.)

AuxGen and A share a secret: w.

[Diagram: left, A′ receives z from AuxGen(DB) but has no access to San; right, A receives w from San(DB) and z from AuxGen(DB); each output goes to the decider C.]

z = (s, Ext(w, s) ⊕ y)

Technical conditions: H_∞(W | y) ≥ |y| and |y| “safe”.

Page 25: Why Is It a Compromise?

AuxGen and A share a secret: w.

Why doesn’t A′ learn y?
• For each possible value of y, (s, Ext(w, s)) is ε-close to uniform.
• Hence (s, Ext(w, s) ⊕ y) is ε-close to uniform. (Spelled out below.)

[Diagram as on Page 23.]

z = (s, Ext(w, s) ⊕ y)

Technical conditions: H_∞(W | y) ≥ |y| and |y| “safe”.
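In LaTeX form, the one step behind the two bullets above (my rendering; ε is the extractor’s closeness parameter): XORing by any fixed y is a bijection on {0,1}^m, so it preserves statistical distance to uniform.

```latex
% Extractor guarantee, then the pad step:
\Delta\bigl((S,\mathrm{Ext}(W,S)),\,(S,U_m)\bigr)\le\varepsilon
\;\Longrightarrow\;
\Delta\bigl((S,\mathrm{Ext}(W,S)\oplus y),\,(S,U_m)\bigr)\le\varepsilon
\quad\text{for every fixed } y .
% Hence the views of A' under any two breaches y, y' are within
% 2\varepsilon of each other, so z alone reveals essentially nothing about y.
```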

Page 26: w Need Not Be Learned Completely

Relaxed utility: something close to w is learned.
AuxGen(D, DB) does not know exactly what A will learn.
We need something close to w to produce the same extracted randomness as w; ordinary extractors offer no such guarantee.

Fuzzy Extractors (m, ℓ, t, ε): (Gen, Rec) [Dodis, Reyzin and Smith]

Gen(w) outputs an extracted string r ∈ {0,1}^ℓ and a public string p. For any distribution W of min-entropy at least m:
(R, P) ← Gen(W) ⇒ (R, P) and (U_ℓ, P) are within statistical distance ε.

Rec reconstructs r given p and any w* sufficiently close to w:
(r, p) ← Gen(w) and ||w − w*||_0 ≤ t ⇒ Rec(w*, p) = r.

Page 27: Construction Based on ECC

• Error-correcting code ECC:
– any two codewords differ in at least 2t bits

• Gen(w): p = w ⊕ ECC(r′)
– where r′ is random
– r is extracted from r′

• Given p and w′ close to w:
– compute w′ ⊕ p
– decode to get ECC(r′)
– r is extracted from r′
(A runnable toy version follows after the figure.)

[Figure: codewords at mutual distance ≥ 2t; p shifts w onto the codeword ECC(r′); any w′ within distance t of w still decodes to ECC(r′).]
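A toy code-offset construction in this spirit (my choices: a 5-fold repetition code as the ECC, correcting t = 2 flips per block; a real instantiation would also extract r from r′ rather than use r′ directly):

```python
# Sketch of the code-offset fuzzy extractor: p = w XOR ECC(r'), and any
# w* within distance t of w decodes back to r'. Toy ECC: repetition code.
import secrets

REP = 5  # each bit of r' becomes REP code bits; corrects (REP - 1) // 2 flips

def ecc_encode(bits: list[int]) -> list[int]:
    return [b for b in bits for _ in range(REP)]

def ecc_decode(code: list[int]) -> list[int]:
    # majority vote within each block of REP bits
    return [int(sum(code[i:i + REP]) > REP // 2)
            for i in range(0, len(code), REP)]

def gen(w: list[int]) -> tuple[list[int], list[int]]:
    """Gen(w): pick random r'; publish p = w XOR ECC(r'); keep r' secret."""
    rprime = [secrets.randbits(1) for _ in range(len(w) // REP)]
    p = [wi ^ ci for wi, ci in zip(w, ecc_encode(rprime))]
    return rprime, p

def rec(wstar: list[int], p: list[int]) -> list[int]:
    """Rec(w*, p): w* XOR p is a noisy codeword; decode to recover r'."""
    return ecc_decode([wi ^ pi for wi, pi in zip(wstar, p)])

w = [secrets.randbits(1) for _ in range(10)]
r, p = gen(w)
wstar = w[:]; wstar[0] ^= 1; wstar[7] ^= 1  # within Hamming distance t of w
assert rec(wstar, p) == r
```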

Page 28: w Need Not Be Learned Completely (cont.)

Recall Fuzzy Extractors (m, ℓ, t, ε): (Gen, Rec), as defined on Page 26.

Idea: (r, p) ← Gen(w); set z = (p, r ⊕ y).
A reconstructs r from any w* close to w; r looks almost uniform to A′ even given p.

Problem: p leaks information about w and might itself disclose a privacy breach y′.

Solution: AuxGen interacts with DB to learn a safe w′; (r, p) ← Gen(w′); set z = (p, r ⊕ y).
w′′ (learned by A) and w′ are both sufficiently close to w
⇒ w′ and w′′ are close to each other ⇒ A(w′′, p) can reconstruct r.
By assumption, w′ does not yield a breach!
Let γ be the bound on the probability of a breach.
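Putting the pieces together as a sketch (reusing gen, rec, and secrets from the Page 27 block; for brevity the recovered r′ itself serves as the pad, where the construction would extract r from it):

```python
# Sketch of the relaxed-utility attack: AuxGen pads the breach y with the
# fuzzy-extractor secret derived from a safe w'; A recovers y from any
# w'' close enough to w'.
def bits_to_int(bits: list[int]) -> int:
    return int("".join(map(str, bits)), 2)

def auxgen_fuzzy(w_safe: list[int], y: int):
    r, p = gen(w_safe)                      # (r, p) <- Gen(w')
    return p, bits_to_int(r) ^ y            # z = (p, r XOR y)

def adversary_fuzzy(w_learned: list[int], z) -> int:
    p, c = z
    return bits_to_int(rec(w_learned, p)) ^ c  # y = Rec(w'', p) XOR c

w_prime = [secrets.randbits(1) for _ in range(10)]  # safe view learned by AuxGen
z = auxgen_fuzzy(w_prime, y=0b10)
w_pp = w_prime[:]; w_pp[3] ^= 1                     # A's view w'', close to w'
assert adversary_fuzzy(w_pp, z) == 0b10
```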

Page 29: w Need Not Be Learned Completely (cont.)

AuxGen and A share a secret: r.

[Diagram: left, A′ receives z from AuxGen(DB); right, A receives w′′ from San(DB) and z from AuxGen(DB), where AuxGen has interacted with San to learn w′; each output goes to the decider C.]

(r, p) ← Gen(w′); z = (p, r ⊕ y); A recovers r = Rec(w′′, p)

r is almost uniform given p; p should not be disclosive.

Page 30: w Need Not Be Learned Completely (cont.)

Pr[A′(z) wins] ≤ Pr[A ↔ San(D, DB) wins] + ε ≤ γ + ε

[Diagram as on Page 29.]

(r, p) ← Gen(w′); z = (p, r ⊕ y); A recovers r = Rec(w′′, p)

r is almost uniform given p; p should not be disclosive.

Page 31: w Need Not Be Learned Completely (cont.)

Need extra min-entropy: H_∞(W | y) ≥ ℓ + |p|

Pr[A′(z) wins] ≤ Pr[A ↔ San(D, DB) wins] + ε ≤ γ + ε

[Diagram as on Page 29.]

(r, p) ← Gen(w′); z = (p, r ⊕ y); A recovers r = Rec(w′′, p)

r is almost uniform given p; p should not be disclosive.

Page 32: Two Remarkable Aspects

• Works even if Kobi Oz is not in the database!
– Motivates a definition based on the increased risk incurred by joining the database:
• risk to Kobi if he is in the database vs. risk to Kobi if he is not
• cf. what can be learned about Kobi with vs. without DB access

• Dalenius’ goal is impossible, but semantic security is possible
– Yet the definitions are similar.
– Resolved by utility: the adversary is a user.
– Without auxiliary information:
• the user must learn something from the mechanism; the simulator learns nothing
• the eavesdropper should learn nothing; the simulator learns nothing
– What about SFE? There the comparison is to an ideal party.

Differential Privacy

Page 33: Possible Answer: Differential Privacy

Possible answer: Differential Privacy.

Noticeable relative shift between K(DB − Me) and K(DB + Me)?
If not, then no perceptible risk is incurred by joining DB.
Anything the adversary can do, it could do without Me.

[Figure: two overlapping curves of Pr[response] over the possible responses, for K(DB − Me) and K(DB + Me); a few bad responses are marked ✗.]
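The slides end here; as a forward pointer, here is the “no noticeable relative shift” property made concrete (my example, using the Laplace mechanism, which is standard in the differential-privacy literature but not defined in this lecture):

```python
# Illustration (not from the slides): adding Laplace(1/eps) noise to a
# count keeps Pr[response] within a factor e^eps when "Me" joins the DB.
from math import exp

eps = 0.5

def laplace_density(x: float, mu: float, scale: float) -> float:
    return exp(-abs(x - mu) / scale) / (2 * scale)

without_me, with_me = 100, 101  # neighboring databases: counts differ by 1
for response in (99.0, 100.5, 103.0):
    ratio = (laplace_density(response, with_me, 1 / eps)
             / laplace_density(response, without_me, 1 / eps))
    assert exp(-eps) <= ratio <= exp(eps)  # bounded relative shift
```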