Fake Co-visitation Injection Attacks to Recommender Systems
Guolei Yang, Neil Zhenqiang Gong, Ying Cai
Co-visitation Recommender Systems Are Popular
We show co-visitation recommender systems can be spoofed to recommend items as an attacker desires
Brief Intro to Co-visitation Recommender System
• Key idea: Items that were frequently visited together in the past are likely to be visited together in the future
[Figure: in the past, users viewed Video A and Video B together; when a user now views Video A, the system shows Video B among the recommended videos.]
Key Data Structure: Co-visitation Graph
[Figure: co-visitation graph over Item 1, Item 2, Item 3, Item 4, and Item 5]
• Each vertex represents an item
• Vertex weight: number of views of the item (popularity), e.g., the number of views of Item 1
• Edge weight: number of co-visitations between two items, e.g., between Item 1 and Item 2
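The graph described above can be sketched as two counters: one for vertex weights (views) and one for edge weights (co-visitations). This is a minimal illustration, not the data structure of any real system. It assumes that a co-visitation means two items were viewed in the same session; `build_covisitation_graph` and the sample `sessions` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_covisitation_graph(sessions):
    """Build vertex weights (views) and edge weights (co-visitations).

    Assumption: two items are co-visited when they occur in the same session,
    and each item is counted at most once per session.
    """
    views = Counter()      # vertex weight: item -> number of views
    covisits = Counter()   # edge weight: frozenset({i, j}) -> co-visitations
    for session in sessions:
        items = set(session)
        views.update(items)
        covisits.update(frozenset(pair) for pair in combinations(sorted(items), 2))
    return views, covisits


# Hypothetical browsing sessions, for illustration only.
sessions = [["A", "B"], ["A", "B", "C"], ["A"]]
views, covisits = build_covisitation_graph(sessions)
```

Using `frozenset` keys keeps the edges undirected, so the pair (A, B) and the pair (B, A) share one weight.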
Two Recommendation Tasks
Item-to-Item Recommendation
1. Compute item-item similarity, with normalization term $f(w_i, w_j) = w_i \cdot w_j$ (e.g., on YouTube)
2. Rank items by similarity
3. Generate the recommendation list, including items that 1) have high similarity and 2) satisfy the popularity threshold
[Figure: co-visitation graph over Item 1 through Item 5. For Item 1, the ranking is: 1. Item 2, 2. Item 4, 3. Item 3. A user who views Item 1 is shown the recommended items.]
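The three steps of item-to-item recommendation can be sketched as follows. The sketch assumes the similarity of items i and j is the edge weight divided by the normalization term, i.e. w_ij / (w_i * w_j); the slide only states the term f itself, so the full similarity form is an assumption. The function `recommend`, its parameters, and all numbers are hypothetical.

```python
def recommend(item, views, covisits, top_n, threshold):
    """Rank the neighbors of `item` by similarity and return the top-N list.

    Assumed similarity: w_ij / (w_i * w_j), where w_ij is the number of
    co-visitations and w_i, w_j are the view counts of the two items.
    """
    scored = []
    for pair, w_ij in covisits.items():
        if item not in pair:
            continue
        (other,) = pair - {item}
        if views[other] < threshold:   # popularity threshold
            continue
        similarity = w_ij / (views[item] * views[other])
        scored.append((similarity, other))
    # Highest similarity first; ties broken by item id for determinism.
    scored.sort(key=lambda entry: (-entry[0], entry[1]))
    return [other for _, other in scored[:top_n]]


# Hypothetical weights, chosen only to illustrate the ranking.
views = {1: 100, 2: 50, 3: 80, 4: 40, 5: 5}
covisits = {
    frozenset((1, 2)): 30,
    frozenset((1, 3)): 24,
    frozenset((1, 4)): 20,
    frozenset((1, 5)): 4,
    frozenset((2, 3)): 10,
}
recs = recommend(1, views, covisits, top_n=3, threshold=10)  # → [2, 4, 3]
```

Item 5 has the highest raw similarity to Item 1 in this toy data but is excluded because its popularity is below the threshold.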
Related Work
• Xing et al. (USENIX Security ’13) proposed pollution attacks on user-to-item recommendation
  ▪ Relies on Cross-Site Request Forgery (CSRF)
  ▪ Not applicable to item-to-item recommendation
• Profile injection (shilling) attacks on recommender systems via user-item rating matrices
  ▪ Not applicable to co-visitation recommender systems, which do not rely on a user-item rating matrix
• Relationship to adversarial machine learning
  ▪ Our attack is a data poisoning attack on recommender systems
Roadmap
• Threat model
• Proposed attacks
• Evaluations on synthetic data
• Evaluations on real-world recommender systems
• Countermeasures
Threat Model
• Attacker’s background knowledge

  Level            | Knowledge                                 | Scenario
  High knowledge   | Co-visitation graph, popularity threshold | Insider
  Medium knowledge | Item popularity, recommendation lists     | YouTube …
  Low knowledge    | Recommendation lists                      | Amazon, eBay …
• Attacker’s goal
  ▪ User Impression (UI): the probability that a random visitor will see the item
  ▪ Increase the UI of a target item
  ▪ Decrease the UI of a target item
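One way to make User Impression concrete is sketched below. It assumes a random visitor lands on an item with probability proportional to that item's popularity and then sees every item in its recommendation list; this visitor model, the function `user_impression`, and the numbers are assumptions for illustration, not the talk's exact definition.

```python
def user_impression(target, views, rec_lists):
    """Estimate the probability that a random visitor sees `target`.

    Assumption: the visitor views item j with probability proportional to
    views[j] and sees `target` only if it is in j's recommendation list.
    """
    total_views = sum(views.values())
    exposed = sum(views[item] for item, recs in rec_lists.items() if target in recs)
    return exposed / total_views


# Hypothetical popularity and recommendation lists.
views = {"A": 50, "B": 30, "C": 20}
rec_lists = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}

ui_before = user_impression("C", views, rec_lists)  # only B recommends C
rec_lists["A"].append("C")                          # C enters A's list
ui_after = user_impression("C", views, rec_lists)   # A and B recommend C
```

Under this model, getting the target into the list of a popular item raises UI more than getting it into the list of an unpopular one.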
Proposed Attacks
• Promotion attack
  ▪ Goal: Increase the UI of a target item
  ▪ Make the target item appear in the recommendation lists of as many items as possible
[Figure: the attacker adds fake co-visitations (+) between the target item and several anchor items, so that the target item appears among the recommended items shown to users who view an anchor item.]
Proposed Attacks
• Demotion attack
  ▪ Goal: Decrease the UI of a target item
  ▪ Remove the target item from the recommendation lists of as many items as possible
[Figure: the target item is removed (X) from the recommended items of several anchor items.]
Key Challenge
• Given a target item and a limited number of fake co-visitations
  ▪ How to select the anchor item(s) to attack?
  ▪ How many fake co-visitations to insert for each anchor item?
• Solution: Formulate the attack as an optimization problem
  ▪ Select the best anchor items to attack
  ▪ Determine how many fake co-visitations are needed to attack each anchor item
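To illustrate what such an optimization problem might look like, the sketch below treats anchor selection as a 0/1 knapsack: each anchor has a cost (fake co-visitations needed) and a gain (increase in UI), and the attacker has a fixed budget. The knapsack framing is an assumption made for illustration and is not necessarily the formulation used in the talk; `select_anchors` and all numbers are hypothetical.

```python
def select_anchors(costs, gains, budget):
    """Pick anchors that maximize total gain within a co-visitation budget.

    costs:  anchor -> fake co-visitations needed (non-negative integers)
    gains:  anchor -> UI gain if the attack on that anchor succeeds
    budget: total number of fake co-visitations available
    Returns (total_gain, tuple_of_selected_anchors).
    """
    # best[b] holds the best (gain, anchors) using at most b co-visitations.
    best = [(0, ())] * (budget + 1)
    for anchor, cost in costs.items():
        # Iterate budgets downward so each anchor is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate_gain = best[b - cost][0] + gains[anchor]
            if candidate_gain > best[b][0]:
                best[b] = (candidate_gain, best[b - cost][1] + (anchor,))
    return best[budget]


# Hypothetical costs and gains for three candidate anchors.
costs = {"a": 5, "b": 4, "c": 3}
gains = {"a": 10, "b": 7, "c": 5}
total_gain, chosen = select_anchors(costs, gains, budget=7)
```

With a budget of 7, attacking "b" and "c" together beats attacking the single most valuable anchor "a".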
Promotion Attack – High Knowledge Attacker
Attacker’s goal: Promote Item 3
1. Select anchor items in the original co-visitation graph (Items 1 to 5). Before the attack, the recommendation lists shown contain Item 2 and Item 4, but not Item 3.
2. Insert 10 fake co-visitations of Item 1 & Item 3.
3. Insert 10 fake co-visitations of Item 3 & Item 4; one displayed edge weight changes from 21 to 31.
4. In the attacked co-visitation graph (displayed edge weights: 14, 31, 32, 17, 45), Item 3 appears in the recommendation lists shown.
• A medium-knowledge attacker can be converted into a high-knowledge attacker by estimating edge weights
• A low-knowledge attacker can be converted into a medium-knowledge attacker by estimating vertex weights
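For a high-knowledge attacker, the number of fake co-visitations needed for one anchor can be computed directly from the graph. The sketch below is a simplified model under stated assumptions: similarity is w_ij / (w_i * w_j), and each fake co-visitation adds one view to both the anchor and the target as well as one to their edge. `min_fake_covisits` and all numbers are hypothetical.

```python
def min_fake_covisits(anchor, target, views, covisits, top_n, threshold, max_k=10_000):
    """Smallest k that puts `target` into `anchor`'s top-N list, or None.

    Assumptions: similarity is w_ij / (w_i * w_j); k fake co-visitations add
    k to the anchor's views, the target's views, and their edge weight.
    """
    w_t = views[target]
    w_at = covisits.get(frozenset((anchor, target)), 0)
    rivals = []  # (edge weight, views) of other eligible neighbors of the anchor
    for pair, w_aj in covisits.items():
        if anchor in pair and target not in pair:
            (j,) = pair - {anchor}
            if views[j] >= threshold:
                rivals.append((w_aj, views[j]))
    for k in range(max_k + 1):
        if w_t + k < threshold:
            continue
        # Cross-multiplied comparison; the shared 1 / (w_a + k) factor cancels.
        ahead = sum(1 for w_aj, w_j in rivals if w_aj * (w_t + k) >= (w_at + k) * w_j)
        if ahead < top_n:
            return k
    return None


# Hypothetical graph around anchor "A" and target "T".
views = {"A": 100, "T": 50, "B": 80, "C": 60}
covisits = {
    frozenset(("A", "B")): 40,
    frozenset(("A", "C")): 12,
    frozenset(("A", "T")): 5,
}
k_top2 = min_fake_covisits("A", "T", views, covisits, top_n=2, threshold=0)
k_top1 = min_fake_covisits("A", "T", views, covisits, top_n=1, threshold=0)
```

In this toy data, entering a two-item list requires outranking only the weaker rival, which is far cheaper than taking the top spot.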
Demotion Attack – High Knowledge Attacker
Attacker’s goal: Demote Item 4
[Figure: original co-visitation graph over Items 1 to 5. Before the attack, Item 4 appears in the recommended items of three anchor items.]
[Figure: attacked co-visitation graph with displayed edge weights 15, 35, 21, 9, 26, 15. After the attack, the three recommendation lists show Item 1, Item 5, and Item 2 in place of Item 4.]
Evaluation on Synthetic Data
• Questions we aim to answer
  ▪ How does the attacker’s background knowledge impact our attacks?
  ▪ How does the co-visitation graph structure impact our attacks?
  ▪ How does the number of inserted fake co-visitations impact our attacks?
Impact of Attacker’s Background Knowledge
Impact of Co-visitation Graph Structure
Impact of Number of Fake Co-visitations
Evaluation on Real-World Recommender Systems
[Figure: attack procedure: Initialization → Select anchor items → Insert fake co-visitations → Examine results. Timing annotations: 48 ~ 72 hours; repeated for approx. 21 days.]
[Figure: Results on YouTube]
Countermeasures
• Limiting background knowledge
  ▪ The website can discretize item popularities
[Figure: the same item, "Funny Video", displayed three ways: "3827 Views" (exact popularity), "3500+ Views" (discretization granularity = 500), and "2000+ Views" (discretization granularity = 2000).]
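The discretization shown above amounts to rounding the view count down to a multiple of the granularity. A minimal sketch follows; the function names are hypothetical, and the inputs are the numbers from the slide.

```python
def discretize_views(views, granularity):
    """Round a view count down to the nearest multiple of `granularity`."""
    return (views // granularity) * granularity


def display_views(views, granularity):
    """Format the discretized count the way the slide shows it."""
    return f"{discretize_views(views, granularity)}+ Views"


label_500 = display_views(3827, 500)    # → "3500+ Views"
label_2000 = display_views(3827, 2000)  # → "2000+ Views"
```

A coarser granularity reveals less about the exact vertex weight, which is the knowledge a medium-knowledge attacker relies on.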
Conclusion
• Recommender systems are vulnerable to fake co-visitation injection attacks
• An attacker can use our attacks to spoof a recommender system into making recommendations as the attacker desires
Parameter Estimation
• Convert medium/low-knowledge attackers into high-knowledge attackers
  ▪ The missing knowledge is estimated from publicly available information
• Insert a fake item as a probe
• Insert co-visitations until the probe appears in the recommendation list of an item
[Figure: a fake item is attached to a co-visitation graph whose edge weights are unknown (?). After the probe reaches an edge weight of 6, several previously unknown weights are labeled >= 6.]
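The probing step can be sketched against a simulated recommender. The code below only models the idea of adding co-visitations one at a time until the probe shows up; `appears_in_list` is a hypothetical stand-in for checking a recommendation list, and the hidden requirement of 6 simply mirrors the number on the slide.

```python
def probe_required_covisits(appears_in_list, max_k):
    """Return the smallest k for which the probe appears, or None.

    appears_in_list(k) is a hypothetical check that reports whether the probe
    item is in the recommendation list after k co-visitations in total.
    """
    for k in range(1, max_k + 1):
        if appears_in_list(k):
            return k
    return None


# Simulated system: the probe appears once it has at least 6 co-visitations.
hidden_requirement = 6
estimate = probe_required_covisits(lambda k: k >= hidden_requirement, max_k=100)
```

The returned count is what the attacker learns; how it translates into bounds on the unknown weights depends on the system's ranking rule.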
Proposed Attack Algorithm
• General steps
  1. Initialization: knowledge acquisition and parameter estimation
  2. Select items to attack: construct and solve the optimization problem
  3. Insert fake co-visitations: repeatedly view the selected items in the same browser session
  4. Examine results
  5. Goal achieved? If yes, terminate; if no, repeat
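The control flow of the general steps can be sketched as a loop over three pluggable steps. Everything here is hypothetical and simulation-only: the callables stand in for the selection, insertion, and examination steps, and the toy state models an edge weight that must reach 25 for the goal to count as achieved.

```python
def run_attack(select_items, insert_fake_covisits, goal_achieved, max_rounds):
    """Repeat select -> insert -> examine until the goal is met.

    Returns the number of rounds used, or None if max_rounds is exhausted.
    All three callables are hypothetical placeholders for the slide's steps.
    """
    for round_no in range(1, max_rounds + 1):
        plan = select_items()         # solve the optimization problem
        insert_fake_covisits(plan)    # apply the planned co-visitations
        if goal_achieved():           # examine the results
            return round_no
    return None


# Toy simulation: each round adds 10 co-visitations to a single edge.
state = {"edge": 0}
rounds = run_attack(
    select_items=lambda: {"anchor": 10},
    insert_fake_covisits=lambda plan: state.update(edge=state["edge"] + plan["anchor"]),
    goal_achieved=lambda: state["edge"] >= 25,
    max_rounds=5,
)
```

Passing the steps in as callables keeps the loop independent of how knowledge is acquired or how results are observed.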
Experiments on Real-world Recommender Systems
• Results on YouTube
• Results on eBay
• Results on Amazon
• Results on Yelp
• Results on LinkedIn
Countermeasures
• Limiting fake co-visitations
  ▪ Use CAPTCHA
  ▪ Fake co-visitation detection
  ▪ Use co-visitations from registered users only