Privacy of profile-based ad targeting
Alexander Smal and Ilya Mironov
User-profile targeting
• Goal: increase the impact of your ads by targeting a group potentially interested in your product.
• Examples:
  • Social network: profile = user’s personal information + friends
  • Search engine: profile = search queries + webpages visited by user
Privacy of profile-based targeting
Facebook ad targeting
Characters
• Advertising company: My system is private!
• Privacy researcher
Simple attack [Korolova’10]
• Jon’s profile:
  • Public: 32 y.o. single man; Mountain View, CA; …; has cat
  • Private: likes fishing
• Eve’s targeted ad: “Amazing cat food for $0.99!”
  • Show to: 32 y.o. single man; Mountain View, CA; …; has cat; likes fishing
• Jon sees the ad: “Nice!”
• Eve observes the # of impressions (up to noise) and learns that Jon likes fishing.
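The attack above can be sketched with a toy ad platform. This is a minimal illustration only: the profiles, attribute names, and the `impressions` and `infer_private` helpers are hypothetical, and the sketch ignores the noise in reported counts.

```python
# Hypothetical toy ad platform; profiles and attribute names are illustrative only.
PROFILES = {
    "jon": {"age_32", "single", "mountain_view", "has_cat", "likes_fishing"},
    "ann": {"age_32", "single", "mountain_view", "has_dog"},
    "bob": {"age_45", "married", "palo_alto", "has_cat", "likes_fishing"},
}

def impressions(targeting):
    """Number of users whose profile contains every targeted attribute."""
    return sum(1 for attrs in PROFILES.values() if targeting <= attrs)

# Eve knows Jon's public attributes; in this toy data they single him out.
public = {"age_32", "single", "mountain_view", "has_cat"}

def infer_private(attribute):
    """Add one private attribute to the targeting and check whether the ad is shown."""
    return impressions(public | {attribute}) > 0

print(infer_private("likes_fishing"))  # True
print(infer_private("has_dog"))        # False
```

Each query leaks one private bit of the targeted user, because the public attributes already narrow the audience to a single person.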
Advertising company: My system is private!
Privacy researcher: Unless your targeting is private, it is not!
Advertising company: How can I target privately?
How to protect information?
• Basic idea: add some noise
  • Explicitly
  • Implicit in the data
    • noiseless privacy [BBGLT11]
    • natural privacy [BD11]
• Two types of explicit noise
  • Output perturbation
    • Dynamically add noise to answers
  • Input perturbation
    • Modify the database
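The two kinds of explicit noise can be contrasted in a short sketch. The function names, the Laplace noise for answers, and the independent bit flips for the database are assumptions chosen for illustration; the slides do not prescribe a particular mechanism.

```python
import random

def count_query(db, column):
    """Exact number of users whose bit in `column` is set."""
    return sum(row[column] for row in db)

def output_perturbation(db, column, scale, rng):
    """Answer a query on the original data, adding fresh noise to each answer."""
    # The difference of two exponentials is Laplace-distributed with this scale.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return count_query(db, column) + noise

def input_perturbation(db, flip_prob, rng):
    """Modify the database once by flipping each bit independently."""
    return [[bit ^ (rng.random() < flip_prob) for bit in row] for row in db]

rng = random.Random(0)
db = [[1, 0, 0], [0, 0, 1], [1, 1, 0], [1, 0, 0]]   # toy profiles (made up)
noisy_answer = output_perturbation(db, 0, scale=1.0, rng=rng)
noisy_db = input_perturbation(db, flip_prob=0.1, rng=rng)
answer_on_noisy_db = count_query(noisy_db, 0)       # no further noise is added
```

With output perturbation the original `db` must be kept and every answer is randomized; with input perturbation only `noisy_db` needs to be stored.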
Advertising company: I like input perturbation better…
Input perturbation
• Pro:
  • Pan-private (not storing initial data)
  • Do it once
  • Simpler architecture
Advertising company: I like input perturbation better…
Privacy researcher: Signal is sparse and non-random
Adding noise
• Two main difficulties in adding noise:
  • Sparse profiles
  • Dependent bits
[Figure: two 10×10 matrices of 0/1 profile bits, one per difficulty, and a sample profile row 1 1 1 1 1 0 0 1 0 1; labels: differential privacy, deniability, “smart noise”]
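One way to see why sparse profiles are hard to perturb is to ask what share of the 1s in a noisy profile were 1s in the original. The sketch below assumes each bit is flipped independently with the same probability; the densities and the 10% flip rate are illustrative numbers, not values from the talk.

```python
def genuine_fraction(density, flip_prob):
    """Expected share of 1s in the noisy profile that were 1s in the original."""
    kept_ones = density * (1 - flip_prob)    # true 1s that survive the noise
    fake_ones = (1 - density) * flip_prob    # 0s that were flipped to 1
    return kept_ones / (kept_ones + fake_ones)

for density in (0.5, 0.1, 0.01):
    print(density, round(genuine_fraction(density, 0.1), 3))
```

As the density of 1s drops, most of the 1s in the perturbed profile are noise, so the targeting signal is swamped well before the noise level becomes large.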
Advertising company: I like input perturbation better…
Privacy researcher: Signal is sparse and non-random
Advertising company: Let’s shoot for deniability, and add “smart noise”!
“Smart noise”
• Consider two extreme cases
  • All bits are independent → independent noise
  • All bits are correlated with correlation coefficient 1 → correlated noise
• “Smart noise” hypothesis: “If we know the exact model, we can add the right noise”
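The two extreme cases can be sketched as follows. The function names are hypothetical, and the example only illustrates why the noise has to match the dependency structure: when all bits are identical, independent flips can be undone by a majority vote.

```python
import random

def independent_noise(profile, flip_prob, rng):
    """Flip each bit on its own coin toss (matches independent bits)."""
    return [bit ^ (rng.random() < flip_prob) for bit in profile]

def correlated_noise(profile, flip_prob, rng):
    """Flip all bits together on a single coin toss (matches fully correlated bits)."""
    flip = rng.random() < flip_prob
    return [bit ^ flip for bit in profile]

rng = random.Random(1)
identical_bits = [1] * 1001                      # every bit carries the same value
noisy = independent_noise(identical_bits, 0.3, rng)
recovered = int(sum(noisy) > len(noisy) / 2)     # majority vote removes the noise
hidden_together = correlated_noise(identical_bits, 0.3, rng)
```

With `correlated_noise` the output is either the original profile or its complement, so a majority vote learns nothing beyond a single noisy bit.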
Aha!
Dependent bits in real data
• Netflix prize competition data
  • ~480k users, ~18k movies, ~100m ratings
• Estimate movie-to-movie correlation
  • Fact that a user rated a movie
• Visualize graph of correlations
  • Edge: correlation coefficient > 0.5
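The correlation graph described above could be computed along these lines. This is a sketch under assumptions: the data is a made-up toy example (not Netflix data), and `correlation` and `correlation_edges` are hypothetical helper names.

```python
from itertools import combinations
from math import sqrt

def correlation(a, b):
    """Pearson correlation of two equal-length 0/1 indicator vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a, var_b = mean_a * (1 - mean_a), mean_b * (1 - mean_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant column: correlation undefined, treated as no edge
    return cov / sqrt(var_a * var_b)

def correlation_edges(rated, threshold=0.5):
    """rated[movie] is a 0/1 list over users: 1 if the user rated the movie."""
    return [(m1, m2) for m1, m2 in combinations(sorted(rated), 2)
            if correlation(rated[m1], rated[m2]) > threshold]

# Made-up toy data: 3 movies, 6 users.
rated = {
    "movie_a": [1, 1, 1, 0, 0, 0],
    "movie_b": [1, 1, 1, 0, 0, 1],
    "movie_c": [0, 1, 0, 1, 0, 1],
}
print(correlation_edges(rated))  # [('movie_a', 'movie_b')]
```

Only the fact that a user rated a movie is used, not the rating value, matching the setup on the slide.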
Netflix movie correlations
Advertising company: Let’s shoot for deniability, and add “smart noise”!
Privacy researcher: Let’s construct models where “smart noise” fails
How can “smart noise” fail?
[Figure: user profiles separated by a large relative distance]
Models of user profiles
• hidden independent bits
• public bits
• Public bits are some functions of hidden bits
• Are users well separated?
[Figure: hidden bits 1 0 1 … 0 1 are mapped by $f$ to public bits 1 1 0 1 … 0 1 0 1]
Error-correcting codes
• Constant relative distance
• Unique decoding
• Explicit, efficient
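To make the link between error-correcting codes and privacy concrete, here is a small sketch in which the public bits are a codeword of the hidden bits. The Hadamard code is an assumption chosen for brevity (its length is exponential in the number of hidden bits, so it is not the efficient construction the slide has in mind). Its relative distance is 1/2, so nearest-codeword decoding tolerates corruption of fewer than a quarter of the public bits.

```python
from itertools import product

K = 4  # number of hidden bits (illustrative)

def encode(hidden):
    """Hadamard code: one public bit <hidden, y> mod 2 for every y in {0,1}^K."""
    return [sum(h & y for h, y in zip(hidden, ys)) % 2
            for ys in product((0, 1), repeat=K)]

def hamming(a, b):
    """Number of positions in which two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(public):
    """Brute-force nearest-codeword decoding over all 2^K hidden profiles."""
    return min(product((0, 1), repeat=K),
               key=lambda hidden: hamming(encode(hidden), public))

hidden = (1, 0, 1, 1)
public = encode(hidden)          # 16 public bits
noisy = list(public)
for i in (0, 5, 9):              # corrupt 3 of 16 bits (fewer than 25%)
    noisy[i] ^= 1
print(decode(noisy) == hidden)   # True
```

In this model, noise below the unique-decoding radius does not hide the profile: an observer recovers the hidden bits exactly.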
Privacy researcher: See, unless the noise is >25%, no privacy
Advertising company: But this model is unrealistic!
Privacy researcher: Let me see what I can do with monotone functions…
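Monotone functions
• Monotone function: for all $i$ and for all values of $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$,
  $f(x_1, \dots, x_{i-1}, 0, x_{i+1}, \dots, x_n) \le f(x_1, \dots, x_{i-1}, 1, x_{i+1}, \dots, x_n)$
• Monotonicity is a natural property
• Monotone functions are bad for constructing error-correcting codes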
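The monotonicity condition can be checked by brute force for small functions. The sketch below is illustrative; `is_monotone` and the two example functions are not from the talk.

```python
from itertools import product

def is_monotone(f, n):
    """Check that raising any single input bit from 0 to 1 never lowers f."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                raised = x[:i] + (1,) + x[i + 1:]
                if f(x) > f(raised):
                    return False
    return True

def majority(x):
    return int(sum(x) > len(x) / 2)

def parity(x):
    return sum(x) % 2

print(is_monotone(majority, 3), is_monotone(parity, 3))  # True False
```

Parity, the building block of many linear codes, fails the check, which gives some intuition for why monotone functions look like poor material for error-correcting codes.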
Approximate error-correcting codes
• $\delta$-approximate error-correcting code with distance $\varepsilon$: a function $f\colon \{0,1\}^n \to \{0,1\}^m$ such that
  • if less than an $\varepsilon$ fraction of $f(x)$ is corrupted, then we can reconstruct $x$ to within a $\delta$ fraction of bits (blatant non-privacy).
• We need an $o(1)$-approximate error-correcting code with constant distance.
Noise sensitivity
• Noise sensitivity of function $f$:
  $\mathrm{NS}_\varepsilon(f) = \Pr[f(x) \ne f(y)]$,
  where $x$ is chosen uniformly at random and $y$ is formed by flipping each bit of $x$ with probability $\varepsilon$.
• If $\mathrm{NS}_\varepsilon(f)$ is big: hidden bits 1 0 1 … 0 1 and 1 1 1 … 0 0 map under $f$ to public bits 1 1 0 1 … 0 1 0 1 and 1 0 0 0 … 1 1 1 1
• If $\mathrm{NS}_\varepsilon(f)$ is small: hidden bits 1 0 1 … 0 1 and 1 1 1 … 0 0 map under $f$ to public bits 1 1 0 1 … 0 1 0 1 and 1 1 1 1 … 0 0 0 1
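For small $n$ the noise sensitivity can be computed exactly by enumerating all pairs of inputs. The sketch below follows the standard definition (probability that $f$ changes when each input bit is flipped independently with probability `eps`); the two example functions are chosen only to show a highly sensitive and a barely sensitive case.

```python
from itertools import product

def noise_sensitivity(f, n, eps):
    """Exact Pr[f(x) != f(y)]: x uniform, y = x with each bit flipped w.p. eps."""
    points = list(product((0, 1), repeat=n))
    total = 0.0
    for x in points:
        for y in points:
            if f(x) != f(y):
                flips = sum(a != b for a, b in zip(x, y))
                total += eps ** flips * (1 - eps) ** (n - flips)
    return total / len(points)

def parity(x):
    return sum(x) % 2

def dictator(x):
    return x[0]

print(round(noise_sensitivity(parity, 5, 0.1), 5))    # 0.33616
print(round(noise_sensitivity(dictator, 5, 0.1), 5))  # 0.1
```

Parity on 5 bits reacts to any odd number of flips, so its sensitivity is much larger than that of a function that reads a single bit.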
Monotone functions
• There exist highly sensitive monotone functions [MO’03].
• Theorem: there exists a monotone $o(1)$-approximate error-correcting code with constant distance on average.
• Idea of proof: let $f_1, \dots, f_m$ be random independent monotone boolean functions, such that […] and each $f_i$ depends only on […] bits of $x$.
• Let $f = (f_1, \dots, f_m)$.
• With high probability for random $x$ there is no $y$ such that […] and […].
• For Talagrand […]: […]-approximate error-correcting code with constant distance on average.
Privacy researcher: If the model is monotone, blatant non-privacy is still possible
Advertising company: Hmmm. Does smart noise ever work?
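Linear threshold model
• Function $f$ is a linear threshold function if there exist real numbers $a_i$’s such that
  $f(x) = \mathrm{sgn}(a_0 + a_1 x_1 + \dots + a_n x_n)$
• Theorem [Peres’04]: Let $f$ be a linear threshold function, then
  $\mathrm{NS}_\varepsilon(f) \le O(\sqrt{\varepsilon})$
• No $o(1)$-approximate error-correcting code with $O(1)$ distance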
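The square-root behaviour can be observed numerically on a small linear threshold function. This is an illustrative sketch: the weights and threshold are hypothetical, the function is written over 0/1 inputs with a threshold `theta` instead of the sign form, and the code only prints the ratio to $\sqrt{\varepsilon}$ without asserting any particular constant.

```python
from itertools import product
from math import sqrt

def linear_threshold(weights, theta):
    """f(x) = 1 if sum_i weights[i] * x[i] >= theta, else 0."""
    return lambda x: int(sum(w * b for w, b in zip(weights, x)) >= theta)

def disagreements_by_distance(f, n):
    """counts[d] = number of ordered pairs (x, y) at Hamming distance d with f(x) != f(y)."""
    points = list(product((0, 1), repeat=n))
    values = [f(x) for x in points]
    counts = [0] * (n + 1)
    for x, fx in zip(points, values):
        for y, fy in zip(points, values):
            if fx != fy:
                counts[sum(a != b for a, b in zip(x, y))] += 1
    return counts

def noise_sensitivity(counts, n, eps):
    """Exact noise sensitivity at flip probability eps from the distance profile."""
    return sum(c * eps ** d * (1 - eps) ** (n - d)
               for d, c in enumerate(counts)) / 2 ** n

n = 7
f = linear_threshold([3, 1, 4, 1, 5, 2, 6], theta=11)   # hypothetical weights
counts = disagreements_by_distance(f, n)
for eps in (0.01, 0.04, 0.16):
    ns = noise_sensitivity(counts, n, eps)
    print(eps, round(ns, 4), round(ns / sqrt(eps), 3))
```

Low noise sensitivity means that nearby hidden profiles produce nearby public profiles, which is why such functions cannot serve as good approximate error-correcting codes.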
Conclusion
• Two separate issues with input perturbation:
  • Sparseness
  • Dependencies
• “Smart noise” hypothesis is a fallacy: even for a publicly known, relatively simple model, constant corruption of profiles may lead to blatant non-privacy.
• Connection between noise sensitivity of boolean functions and privacy
  • Models: arbitrary, monotone, linear threshold
• Open questions:
  • Linear threshold privacy-preserving mechanism?
  • Existence of interactive privacy-preserving solutions?
Thank you for your attention!
Special thanks to Cynthia Dwork, Moises Goldszmidt, Parikshit Gopalan, Frank McSherry, Moni Naor, Kunal Talwar, and Sergey Yekhanin.