RegionKNN: A Scalable Hybrid Collaborative Filtering Algorithm for Personalized Web Service Recommendation. Xi Chen, Xudong Liu, Zicheng Huang, and Hailong Sun. School of Computer Science and Engineering, Beihang University, Beijing, China


Dec 16, 2015


RegionKNN: A Scalable Hybrid Collaborative Filtering Algorithm for Personalized Web Service Recommendation

Xi Chen, Xudong Liu, Zicheng Huang, and Hailong Sun

School of Computer Science and Engineering, Beihang University

Beijing, China


Outline

• Introduction

• Motivation

• RegionKNN Algorithm

• Experiments

• Conclusion and Future Work


1. Introduction


Introduction

• Current situation
– More than 25,000 publicly available services (seekda.com)
– About 200,000 related documents

• Goal of service recommendation
– Optimal QoS
– User preference

• Current method: Collaborative Filtering (CF)
– Predicts and recommends potential favorite items for a particular user by using rating data collected from similar users.
– Example: if Alice and Bob both like X and Alice likes Y, then Bob is more likely to like Y.

• Problems
– Characteristics of QoS are neglected
– Online performance needs to be improved


2. Motivation


A Motivating Scenario

Some QoS properties (e.g. availability, response time) correlate strongly with users’ physical locations.

[Figure: motivating scenario; the label "EmailFiltering WS" appears twice in the diagram.]


3. RegionKNN Algorithm


What’s RegionKNN

• Hybrid CF algorithm
– Recommends web services with optimal QoS to the active user, taking the region factor into account

• Two phases of RegionKNN
– Region model building (offline)
  • Region-sensitive services identification
  • Region aggregation
– Service recommendation (online, modified KNN)
  • Neighbor selection
  • QoS prediction

We take response time/round-trip time (RTT) as an example to describe our algorithm.


3.1 Region model


Region Model

• Region
– A group of users who are located close to each other and have similar RTT values

[Figure: users u1, u3, u5, u19, u2, u22 and u8 grouped into regions, invoking Service A, Service B and Service X.]


Input Dataset

• User-Service RTT Matrix: m services, n users

      s1     s2     …    sm
u1    0      245    …    20078
u2    2023   342    …    539
…     …      …      …    …
un    0      3040   …    498

Callout on the matrix: some RTT values are much longer than others (for example, 20078 for u1 on sm).

• The set of non-zero RTTs of service s, {R1(s), R2(s), …, Rk(s)}, collected from all users is a sample from population R.


Region-sensitive Services Identification

• To estimate the mean μ and the standard deviation σ of R, we use:

$$\hat{\mu} = \mathrm{median}_i(R_i(s))$$

$$\hat{\sigma} = 1.4826 \cdot \mathrm{MAD}_i(R_i(s))$$

Median: the numeric value separating the higher half of a sample from the lower half.
e.g. {120, 128, 200, 250, 258, 2000, 3500}: median = 250

MAD: the median of the absolute deviations from the sample's median.
e.g. {120, 128, 200, 250, 258, 2000, 3500}: the absolute deviations from 250, sorted, are {0, 8, 50, 122, 130, 1750, 3250}, so MAD = 122
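The two estimators can be checked on the slide's sample with a few lines of Python. This is a minimal sketch, not the authors' code: the function name `robust_mean_std` is made up for illustration, zeros are assumed to mark missing records, and 1.4826 is the usual consistency constant that makes the MAD comparable to a standard deviation under a normal distribution.

```python
from statistics import median

def robust_mean_std(rtts):
    """Estimate (mu_hat, sigma_hat) from the RTTs recorded for one service."""
    observed = [r for r in rtts if r > 0]  # assumption: 0 marks a missing record
    mu_hat = median(observed)
    mad = median(abs(r - mu_hat) for r in observed)
    sigma_hat = 1.4826 * mad  # scale the MAD to a standard-deviation estimate
    return mu_hat, sigma_hat

sample = [120, 128, 200, 250, 258, 2000, 3500]
mu_hat, sigma_hat = robust_mean_std(sample)
print(mu_hat, round(sigma_hat, 2))  # 250 180.88
```

Both estimators ignore how far the largest values lie from the bulk of the sample, which is why the two very long RTTs do not inflate them.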


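Region-sensitive Services Identification (cont.)

• Region-Sensitive Service
– Let R = {R1(s), R2(s), …, Rk(s)} be the set of RTTs of service s provided by users from all regions. Service s is a sensitive service to region M iff

$$\exists R_j(s) \in R : \big(R_j(s) > \hat{\mu} + 3\hat{\sigma}\big) \wedge \big(\mathrm{region}(j) = M\big)$$

where region(j) denotes the region of user j.

Example for Service A:

user   u1    u3    u5    u19   u2    u22    u8
RTT    120   128   200   250   258   2000   3500

[Figure: users u1, u3, u5, u19, u2, u22 and u8 placed on a region map, invoking Service A.]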
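The definition can be sketched in Python as follows, using the Service A sample from the slide. The function name `sensitive_regions` and the user-to-region assignment are hypothetical (the transcript does not preserve the region layout), zeros are assumed to mean "no record", and 1.4826 is the usual consistency constant for the MAD.

```python
from statistics import median

def sensitive_regions(rtt_by_user, region_of):
    """Return the regions to which one service is sensitive.

    rtt_by_user: {user: RTT} for a single service (0 = no record, assumed)
    region_of:   {user: region label} (hypothetical assignment)
    """
    observed = {u: r for u, r in rtt_by_user.items() if r > 0}
    mu_hat = median(observed.values())
    sigma_hat = 1.4826 * median(abs(r - mu_hat) for r in observed.values())
    threshold = mu_hat + 3 * sigma_hat
    # A region is flagged as soon as one of its users reports an outlying RTT.
    return {region_of[u] for u, r in observed.items() if r > threshold}

rtts = {"u1": 120, "u3": 128, "u5": 200, "u19": 250,
        "u2": 258, "u22": 2000, "u8": 3500}
regions = {"u1": "A", "u3": "A", "u5": "B", "u19": "C",
           "u2": "C", "u22": "D", "u8": "D"}
print(sensitive_regions(rtts, regions))  # {'D'}
```

With this sample the threshold is roughly 793 ms, so only the RTTs of u22 and u8 exceed it.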


Definition

• Region Sensitivity
– The fraction of all services that are sensitive to region M:

$$\mathrm{regSen}(M) = \frac{|\text{sensitive services}|}{|\text{services}|}$$

• Sensitive Region
– Region M is a sensitive region iff regSen(M) > λ.

• Region center
– The median vector of all the RTT vectors provided by users in a region
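Both definitions translate directly into code. The sketch below is illustrative only: the function names, the threshold value `LAMBDA` and the sample RTT vectors are invented, and it assumes that a 0 entry means "no record" and is left out of the median.

```python
from statistics import median

def region_sensitivity(sensitive_services, all_services):
    """regSen: share of all services that are sensitive to the region."""
    return len(sensitive_services) / len(all_services)

def region_center(rtt_vectors):
    """Per-service median of the RTT vectors of a region's users.

    Assumption: 0 means "no record"; a service nobody in the region
    invoked stays 0 in the center.
    """
    center = []
    for column in zip(*rtt_vectors):
        observed = [r for r in column if r > 0]
        center.append(median(observed) if observed else 0)
    return center

services = ["s1", "s2", "s3", "s4"]
LAMBDA = 0.4  # hypothetical threshold value
sen = region_sensitivity({"s2", "s4"}, services)
print(sen, sen > LAMBDA)  # 0.5 True

users = [[100, 0, 300, 0], [120, 50, 0, 0], [110, 70, 0, 0]]
print(region_center(users))  # [110, 60.0, 300, 0]
```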


Region Aggregation

• Why?
– Users provide only a limited number of QoS values, and such a sparse dataset leads to poor recommendations.

• How?
– At the outset, users with similar IP addresses are treated as a region.
– In each iteration, the two most similar non-sensitive regions are selected and aggregated if their similarity exceeds the threshold μ (a similarity threshold, not the mean μ estimated earlier).
– It executes at most N-1 steps (N is the number of regions at the outset); the maximum is reached when all regions are non-sensitive and highly correlated with each other, so that they finally aggregate into one region.
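The aggregation loop can be sketched as below. This is a simplified illustration rather than the authors' implementation: all names and sample vectors are hypothetical, plain Pearson correlation on dense vectors stands in for the region similarity, sensitivity is not recomputed after a merge, and the naive rescan of all pairs in every round makes no attempt to match the complexity bound stated for the real algorithm.

```python
from itertools import combinations
from math import sqrt
from statistics import median

def center(user_vectors):
    """Region center: per-service median of the users' RTT vectors."""
    return [median(col) for col in zip(*user_vectors)]

def pearson(x, y):
    """Plain Pearson correlation of two dense vectors (stand-in similarity)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def aggregate(regions, sensitive, threshold, similarity=pearson):
    """Greedily merge the two most similar non-sensitive regions.

    regions:   {region id: list of user RTT vectors}
    sensitive: ids of regions already classified as sensitive
    """
    regions = dict(regions)  # leave the caller's mapping untouched
    while True:
        candidates = [r for r in regions if r not in sensitive]
        best = None
        for a, b in combinations(candidates, 2):
            sim = similarity(center(regions[a]), center(regions[b]))
            if best is None or sim > best[0]:
                best = (sim, a, b)
        if best is None or best[0] <= threshold:
            return regions  # nothing left that exceeds the threshold
        _, a, b = best
        regions[a] = regions[a] + regions.pop(b)  # b is absorbed into a

initial = {
    "r1": [[100, 200, 300, 400]],
    "r2": [[110, 210, 290, 410]],
    "r3": [[400, 300, 200, 100]],
    "r4": [[105, 195, 310, 390]],  # treated as sensitive, so never merged
}
merged = aggregate(initial, sensitive={"r4"}, threshold=0.8)
print({rid: len(users) for rid, users in merged.items()})
# {'r1': 2, 'r3': 1, 'r4': 1}
```

Each successful round removes one region, so the loop runs at most N-1 times for N initial regions.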


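Region Similarity

• The similarity between regions M and N is measured by the similarity of their two centers.

• Similarity by Pearson Correlation Coefficient (PCC):

$$Sim(m,n) = \frac{\sum_{s \in S(n) \cap S(m)} \big(R_m(s) - \bar{R}_m\big)\big(R_n(s) - \bar{R}_n\big)}{\sqrt{\sum_{s \in S(n) \cap S(m)} \big(R_m(s) - \bar{R}_m\big)^2}\,\sqrt{\sum_{s \in S(n) \cap S(m)} \big(R_n(s) - \bar{R}_n\big)^2}}$$

where S(m) is the set of services that have an RTT value in the center of region m, R_m(s) is that value for service s, and R̄_m is the average RTT over S(m).

      s1  s2  s3  s4  s5
cm    1   2   5   0   0
cn    0   0   5   1   3

By PCC, the similarity of the two regions is 1: s3 is the only co-invoked service, and both centers lie above their own average on it.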
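The PCC between two centers can be reproduced with the sketch below. The function name `pcc` is illustrative, and two conventions are assumed because they are the reading under which the slide's example yields 1: a 0 entry means the service was not invoked, and each center's average is taken over all services it has a value for (not only the co-invoked ones).

```python
from math import sqrt

def pcc(center_m, center_n):
    """PCC of two region centers; 0 entries are assumed to mean "not invoked"."""
    s_m = {i for i, r in enumerate(center_m) if r > 0}
    s_n = {i for i, r in enumerate(center_n) if r > 0}
    common = s_m & s_n
    if not common:
        return 0.0  # no co-invoked service: treated as no correlation
    mean_m = sum(center_m[i] for i in s_m) / len(s_m)
    mean_n = sum(center_n[i] for i in s_n) / len(s_n)
    num = sum((center_m[i] - mean_m) * (center_n[i] - mean_n) for i in common)
    den = sqrt(sum((center_m[i] - mean_m) ** 2 for i in common)) * sqrt(
        sum((center_n[i] - mean_n) ** 2 for i in common)
    )
    return num / den if den else 0.0

c_m = [1, 2, 5, 0, 0]
c_n = [0, 0, 5, 1, 3]
print(round(pcc(c_m, c_n), 6))  # 1.0
```

With a single co-invoked service the numerator and denominator have the same magnitude, so the result can only be 1 or -1, which illustrates how little evidence such a score rests on.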


Region Similarity

• PCC often overestimates the similarity when the two regions have few co-invoked services. To adjust it, we use:

$$Sim'(m,n) = \frac{|S(m) \cap S(n)|}{|S(m) \cup S(n)|}\, Sim(m,n)$$

      s1  s2  s3  s4  s5
cm    1   2   5   0   0
cn    0   0   5   1   3

By adjustment, the similarity of the two regions is 1/5 × 1 = 0.2.
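The adjustment is a simple overlap weight applied to the PCC value. In this sketch the function name `adjusted_similarity` is illustrative, the PCC is passed in as a precomputed number, and 0 entries are assumed to mean "not invoked".

```python
def adjusted_similarity(center_m, center_n, pcc_value):
    """Scale a PCC value by the share of co-invoked services."""
    s_m = {i for i, r in enumerate(center_m) if r > 0}
    s_n = {i for i, r in enumerate(center_n) if r > 0}
    union = s_m | s_n
    if not union:
        return 0.0  # neither center has any value
    return len(s_m & s_n) / len(union) * pcc_value

# Slide example: one co-invoked service (s3) out of five, PCC = 1.
print(adjusted_similarity([1, 2, 5, 0, 0], [0, 0, 5, 1, 3], 1.0))  # 0.2
```

When both centers cover exactly the same services the weight is 1 and the PCC is left unchanged.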


3.2 Service Recommendation


Neighbor Selection

• Neighbors: users with similar QoS experiences

• Advantages of region-based neighbor selection
– No need to search the entire dataset: thousands of users are clustered into a limited number of regions
– The features of the group of users in a region are represented by the region center


QoS Prediction

• To calculate the RTT prediction $\hat{R}_u(s_i)$ for the active user u and service si:

• Get the active user’s IP address and find the region the user belongs to. If no appropriate region is found, the active user is treated as a member of a new region.

• Identify whether service si is sensitive to that region. If it is region-sensitive, the prediction is generated from the region center:

$$\hat{R}_u(s_i) = R_{center}(s_i)$$


QoS Prediction (cont.)

• Otherwise, use the adjusted PCC to compute the similarity between the active user and each region center that has evaluated service si, and find up to k most similar centers {c1, c2, …, ck}.

• If the active user’s region center has an RTT value for si, i.e. $R_{center}(s_i) \neq 0$, the prediction is computed using the equation:

$$\hat{R}_u(s_i) = R_{center}(s_i) + \frac{\sum_{j=1}^{k} Sim'(u, c_j)\,\big(R_{c_j}(s_i) - \bar{R}_{c_j}\big)}{\sum_{j=1}^{k} Sim'(u, c_j)}$$


QoS Prediction (cont.)

• Otherwise, the prediction is computed as:

$$\hat{R}_u(s_i) = \frac{\sum_{j=1}^{k} Sim'(u, c_j)\, R_{c_j}(s_i)}{\sum_{j=1}^{k} Sim'(u, c_j)}$$

• Previous CF-based web service recommendation algorithms use the following equation to predict the missing QoS value:

$$\hat{R}_u(s_i) = \bar{R}_u + \frac{\sum_{j=1}^{k} Sim'(u, c_j)\,\big(R_{c_j}(s_i) - \bar{R}_{c_j}\big)}{\sum_{j=1}^{k} Sim'(u, c_j)}$$

• This equation is based on the assumption that each user’s rating range is subjective and comparatively fixed, which is not applicable in our context.
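The three prediction cases (region-sensitive service, region center with a value, region center without a value) can be put together in one small function. This is a sketch under stated assumptions, not the authors' code: the names, the sample vectors and the similarity weights are invented, 0 is assumed to mean "no value", each neighbor's average is taken over its non-zero entries, and the fallback for a zero total weight is an added assumption.

```python
def mean_nonzero(vector):
    """Average of the non-zero entries of a center vector."""
    values = [r for r in vector if r > 0]
    return sum(values) / len(values) if values else 0.0

def predict_rtt(i, own_center, sensitive, neighbors):
    """Predict the active user's RTT for service index i.

    own_center: center vector of the active user's region (0 = no value)
    sensitive:  True if service i is sensitive to that region
    neighbors:  [(adjusted similarity, center vector)] for the up-to-k most
                similar centers, each with a non-zero value for service i
    """
    if sensitive:
        return own_center[i]  # case 1: take the region center's value
    weight = sum(sim for sim, _ in neighbors)
    if weight == 0:
        return own_center[i]  # assumption: no usable neighbors, fall back
    if own_center[i] > 0:
        # case 2: region center value plus weighted neighbor deviations
        offset = sum(sim * (c[i] - mean_nonzero(c)) for sim, c in neighbors)
        return own_center[i] + offset / weight
    # case 3: weighted average of the neighbor centers' values
    return sum(sim * c[i] for sim, c in neighbors) / weight

own = [300, 0]
neighbors = [(0.8, [320, 500]), (0.2, [280, 400])]
print(round(predict_rtt(0, own, False, neighbors), 1))  # 216.0
print(round(predict_rtt(1, own, False, neighbors), 1))  # 480.0
print(predict_rtt(0, own, True, neighbors))             # 300
```

The second case anchors the prediction on the active user's own region center instead of the user's average rating, which is the point where it departs from the earlier CF equation.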


Time complexity

• Model building (offline)
– The time complexity of the region aggregation algorithm is O(N^2 log N), where N is the number of regions at the outset.

• QoS prediction (online)
– Let l be the number of regions, m the number of web services, and n the number of users. In the online part, O(l) similarity weight calculations are needed, each of which takes O(m) time. Therefore, the online time complexity is O(lm) ≈ O(m), since the number of regions l is small compared with n. Previous user-based CF algorithms have O(mn) online time complexity.


4. Experiments


Experiments

• Dataset
– A subset of WSRec with 300,000 RTT records
– 3000 users
– 100 services
– Dataset: http://www.wsdream.net

• Evaluation Metric: Mean Absolute Error (MAE)

$$MAE = \frac{\sum_{u,s} \big|R_u(s) - \hat{R}_u(s)\big|}{L}$$

– $R_u(s)$ denotes the actual RTT of web service s given by user u
– $\hat{R}_u(s)$ denotes the predicted one
– L denotes the number of tested services
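MAE is the average absolute gap between actual and predicted RTTs over the tested entries. A minimal sketch follows, with made-up RTT values and an illustrative function name; it is not tied to the WSRec data format.

```python
def mean_absolute_error(actual, predicted):
    """MAE over the (user, service) pairs present in `actual`."""
    errors = [abs(actual[key] - predicted[key]) for key in actual]
    return sum(errors) / len(errors)

# Hypothetical held-out RTTs and the corresponding predictions.
actual = {("u1", "s2"): 245, ("u2", "s2"): 342, ("u2", "sm"): 539}
predicted = {("u1", "s2"): 251, ("u2", "s2"): 330, ("u2", "sm"): 560}
print(mean_absolute_error(actual, predicted))  # 13.0
```

A smaller MAE means the predicted RTTs are closer to the measured ones.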


MAE Performance


Impact of λ and μ


Impact of neighborhood size K


Impact of Data Sparsity


5. Conclusion and Future Work


Conclusion and Future Work

• Conclusion
– A new region model for clustering users and identifying region-sensitive web services
– A hybrid model-based and memory-based CF algorithm for web service recommendation, which significantly improves the recommendation accuracy
– A time-complexity analysis demonstrating RegionKNN’s scalability advantage over traditional CF algorithms

• Future Work
– Investigation of more QoS properties and their variation with time
– Internal relations between QoS properties
