the effect of correlation coefficients on communities of recommenders neal lathia, stephen hailes, licia capra department of computer science university college london [email protected] ACM SAC TRECK, Fortaleza, Brazil: March 2008 Trust, Recommendations, Evidence and other Collaboration Know-how
Transcript
Page 1: SAC TRECK 2008

the effect of correlation coefficients on

communities of recommenders

neal lathia, stephen hailes, licia capra

department of computer science

university college london

[email protected]

ACM SAC TRECK, Fortaleza, Brazil: March 2008

Trust, Recommendations, Evidence and other Collaboration Know-how

Page 2: SAC TRECK 2008

recommender systems:

built on collaboration between users

Page 3: SAC TRECK 2008

collaborative filtering research:

design methods to solve problems

1. accuracy, coverage

2. data sparsity, cold-start

3. incorporating tag knowledge

for example,

Page 4: SAC TRECK 2008

… a method to classify content correctly

data → intelligent process → predicted ratings

our focus: k-nearest neighbours (kNN)

Page 5: SAC TRECK 2008

how do we model kNN collaborative filtering?

Page 6: SAC TRECK 2008

a graph of cooperating users

me

nodes = users

links = weighted according to similarity
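A minimal sketch of this graph view, assuming each user keeps a map of weighted links and that a kNN neighbourhood is simply the k highest-weighted links. The class `NeighbourGraph`, its methods, and the user names and weights are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: users are nodes, links carry a similarity weight.
class NeighbourGraph {
    // weights.get(u).get(v) = similarity on the link from user u to user v
    private final Map<String, Map<String, Double>> weights = new LinkedHashMap<>();

    void link(String from, String to, double similarity) {
        weights.computeIfAbsent(from, u -> new LinkedHashMap<>()).put(to, similarity);
    }

    // Returns up to k neighbours of the given user, highest similarity first.
    List<String> topK(String user, int k) {
        Map<String, Double> links = weights.getOrDefault(user, Map.of());
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(links.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, sorted.size()); i++) {
            result.add(sorted.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        NeighbourGraph g = new NeighbourGraph();
        g.link("me", "alice", 0.87);   // illustrative weights only
        g.link("me", "bob", -0.50);
        g.link("me", "carol", 0.12);
        System.out.println(g.topK("me", 2)); // prints [alice, carol]
    }
}
```

Changing the similarity measure changes the weights, and so changes which users end up in the top k.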

Page 7: SAC TRECK 2008

accuracy, coverage

to answer this question, we need to find the optimal weighting:

the best similarity measure for the dataset, from the many available:

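$$w_{a,b} = \frac{|R_a \cap R_b|}{|R_a \cup R_b|}$$

$$w_{a,b} = \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2 \sum_i (r_{b,i} - \bar{r}_b)^2}}$$

$$w_{a,b} = \frac{1}{N} \cdot \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2 \sum_i (r_{b,i} - \bar{r}_b)^2}}$$

and there are more still…

$$w_{a,b} = \frac{\sum_i (r_{a,i} - 2.5)(r_{b,i} - 2.5)}{\sqrt{\sum_i (r_{a,i} - 2.5)^2 \sum_i (r_{b,i} - 2.5)^2}}$$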
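One of the measures compared later in the talk is the co-rated proportion. A minimal sketch, assuming it is the number of items both users have rated divided by the number of items either has rated; the class `CoRated`, its method, and the item ids are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a co-rated proportion between two users.
class CoRated {
    // Size of the overlap of the two rated-item sets divided by the size of their union.
    static double proportion(Set<Integer> ratedByA, Set<Integer> ratedByB) {
        Set<Integer> union = new HashSet<>(ratedByA);
        union.addAll(ratedByB);
        if (union.isEmpty()) {
            return 0.0; // assumption: two empty profiles share nothing
        }
        Set<Integer> common = new HashSet<>(ratedByA);
        common.retainAll(ratedByB);
        return (double) common.size() / union.size();
    }

    public static void main(String[] args) {
        // Items 2 and 3 are shared; items 1 to 4 are rated by at least one user.
        System.out.println(proportion(Set.of(1, 2, 3), Set.of(2, 3, 4))); // prints 0.5
    }
}
```

This measure looks only at which items were rated, not at the rating values.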

Page 8: SAC TRECK 2008

concordance: proportion of agreement

$$w_{a,b} = \frac{C - D}{N - T}$$

(+0.5, +3.0): concordant

(-1.5, +1.5): discordant

(+1.5, +/-?): tied

(concordant, discordant, tied: Somers’ d)
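A minimal sketch of the concordance measure, under two assumptions that the slide does not spell out: the signed numbers are each user's deviation from their own mean rating on a co-rated item, and a pair counts as tied when either deviation is zero. The class and method names are hypothetical.

```java
// Hypothetical sketch of a concordance (Somers' d style) similarity.
class Concordance {
    // devA[i] and devB[i] are the two users' signed deviations on co-rated item i.
    static double somersD(double[] devA, double[] devB) {
        if (devA.length != devB.length) {
            throw new IllegalArgumentException("profiles must cover the same items");
        }
        int concordant = 0, discordant = 0, tied = 0;
        for (int i = 0; i < devA.length; i++) {
            double product = devA[i] * devB[i];
            if (product > 0) {
                concordant++;      // both above or both below
            } else if (product < 0) {
                discordant++;      // one above, one below
            } else {
                tied++;            // assumption: a zero deviation is a tie
            }
        }
        int n = devA.length;
        if (n == tied) {
            return 0.0;            // assumption: no untied pairs means no evidence
        }
        return (double) (concordant - discordant) / (n - tied);
    }

    public static void main(String[] args) {
        // The slide's three pairs, with the unknown deviation taken as 0.0 (assumed).
        double[] a = {0.5, -1.5, 1.5};
        double[] b = {3.0, 1.5, 0.0};
        System.out.println(somersD(a, b)); // prints 0.0
    }
}
```

With one concordant, one discordant, and one tied pair, the agreements and disagreements cancel.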

Page 9: SAC TRECK 2008

community view of the graph:

(a very small example)

[figure: graph centred on "me"; each link carries a similarity weight, ranging from -0.99 to 0.99]

Page 10: SAC TRECK 2008

or, put another way:

(a very small example)

[figure: graph centred on "me"; each link is labelled good, bad, or none]

Page 11: SAC TRECK 2008

what is the best way of generating the graph?

Page 12: SAC TRECK 2008

like this?

(a very small example)

[figure: graph centred on "me"; one possible labelling of the links as good, bad, or none]

Page 13: SAC TRECK 2008

or like this?

(a very small example)

[figure: graph centred on "me"; an alternative labelling of the links as good, bad, or none]

Page 14: SAC TRECK 2008

similarity values depend on the method used:

there is no agreement between measures

my profile:        [2][3][1][5][3]

neighbour profile: [4][1][3][2][3]

measure              similarity  reading
pearson              -0.50       bad
weighted-pearson     -0.05       near zero
cosine angle          0.76       good
co-rated proportion   1.00       very good
concordance          -0.06       near zero
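The disagreement between measures can be checked on the two profiles from this slide. A minimal sketch of Pearson correlation and cosine similarity in their standard forms; the class `ProfileSimilarity` and its methods are hypothetical names, and returning 0.0 for a zero denominator is an assumption.

```java
import java.util.Locale;

// Hypothetical sketch: two standard similarity measures on co-rated profiles.
class ProfileSimilarity {
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    // Pearson correlation over the items both users rated.
    static double pearson(double[] a, double[] b) {
        double meanA = mean(a), meanB = mean(b);
        double num = 0.0, sqA = 0.0, sqB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA, db = b[i] - meanB;
            num += da * db;
            sqA += da * da;
            sqB += db * db;
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0.0 ? 0.0 : num / den; // assumption: flat profile gives 0
    }

    // Cosine of the angle between the two rating vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, sqA = 0.0, sqB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            sqA += a[i] * a[i];
            sqB += b[i] * b[i];
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0.0 ? 0.0 : dot / den;
    }

    public static void main(String[] args) {
        double[] mine = {2, 3, 1, 5, 3};
        double[] neighbour = {4, 1, 3, 2, 3};
        System.out.println(String.format(Locale.ROOT, "pearson = %.2f", pearson(mine, neighbour))); // pearson = -0.50
        System.out.println(String.format(Locale.ROOT, "cosine = %.2f", cosine(mine, neighbour)));   // cosine = 0.76
    }
}
```

The same pair of users looks like a bad match under Pearson and a good one under cosine, which is the point of the slide.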

Page 15: SAC TRECK 2008

nodes = users

links = weighted according to similarity

each method will change the distribution of similarity across the graph

Page 16: SAC TRECK 2008

… the pearson distribution

intelligent process

[figure: Pearson Distribution; x-axis: Range (similarity bins between -1.0 and 1.0), y-axis: Proportion (0 to 0.09)]

Page 17: SAC TRECK 2008

… the modified pearson distributions

weighted-PCC, constrained-PCC

[figure: Modified Pearson Distributions; series: Weighted-PCC, Constrained-PCC; x-axis: Range (similarity bins between -1.0 and 1.0), y-axis: Proportion (0 to 0.35)]

Page 18: SAC TRECK 2008

… and other measures

intelligent process

[figure: Other Distributions; series: Co-Rated, Somers, VSS; x-axis: Range (similarity bins between -1.0 and 1.0), y-axis: Proportion (0 to 0.7)]

somers’ d, co-rated, cosine angle

Page 19: SAC TRECK 2008

an experiment with random numbers

Page 20: SAC TRECK 2008

what happens if we do this?

me

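java.util.Random r = new java.util.Random();

// for all neighbours i: draw a similarity between -1.0 and 1.0
// (similarity is a double[] with one slot per neighbour)
for (int i = 0; i < similarity.length; i++) {
    similarity[i] = (r.nextDouble() * 2.0) - 1.0;
}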
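The results table in this talk has columns R(-1.0, 1.0), R(0.5, 1.0) and Constant(1.0). A minimal sketch of how such similarity assignments could be generated, assuming each is a uniform draw from the named range or a fixed value; the class `RandomSimilarity` and its methods are hypothetical, not the authors' code.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch: similarity values that ignore the rating data entirely.
class RandomSimilarity {
    // One uniformly drawn similarity per neighbour, between low and high.
    static double[] uniform(int neighbours, double low, double high, Random r) {
        double[] similarity = new double[neighbours];
        for (int i = 0; i < neighbours; i++) {
            similarity[i] = low + r.nextDouble() * (high - low);
        }
        return similarity;
    }

    // The same similarity value for every neighbour.
    static double[] constant(int neighbours, double value) {
        double[] similarity = new double[neighbours];
        Arrays.fill(similarity, value);
        return similarity;
    }

    public static void main(String[] args) {
        Random r = new Random();
        double[] wide = uniform(5, -1.0, 1.0, r);   // R(-1.0, 1.0)
        double[] narrow = uniform(5, 0.5, 1.0, r);  // R(0.5, 1.0)
        double[] flat = constant(5, 1.0);           // Constant(1.0)
        System.out.println(Arrays.toString(wide));
        System.out.println(Arrays.toString(narrow));
        System.out.println(Arrays.toString(flat));
    }
}
```

With `low = -1.0` and `high = 1.0`, `uniform` reduces to the expression on the slide, `(r.nextDouble() * 2.0) - 1.0`.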

Page 21: SAC TRECK 2008

Neighborhood  Co Rated  Somers’ d  PCC     wPCC    R(0.5, 1.0)  Constant(1.0)  R(-1.0, 1.0)
1             0.9449    0.9492     1.1150  0.9596  1.0665       1.0406         1.0341
10            0.8498    0.8355     1.0455  0.8277  0.9595       0.9495         0.9689
30            0.7979    0.7931     0.9464  0.7847  0.8903       0.9108         0.8848
50            0.7852    0.7817     0.9007  0.7733  0.8584       0.8922         0.8498
100           0.7759    0.7728     0.8136  0.7647  0.8222       0.8511         0.8153
153           0.7726    0.7727     0.7817  0.7638  0.8053       0.8243         0.8024
229           0.7717    0.7771     0.7716  0.7679  0.7919       0.7992         0.8058
459           0.7718    0.7992     0.8073  0.8025  0.7773       0.7769         0.7811

accuracy:

$$\mathrm{MAE} = \frac{\sum |p_{a,i} - r_{a,i}|}{N}$$

…cross-validation results in paper

movielens u1 subset…

Page 22: SAC TRECK 2008

coverage:

$$\mathrm{Coverage} = \frac{\#\,\text{uncovered predictions}}{\#\,\text{predictions}}$$

…cross-validation results in paper

movielens u1 subset…

Neighborhood  Co Rated  Somers’ d  PCC      wPCC     Oracle
1             0.67795   0.57165    0.96725  0.61375  0.00495
10            0.15455   0.0999     0.80515  0.1114   0.00495
30            0.0512    0.0407     0.57225  0.04135  0.00495
50            0.03065   0.0266     0.3641   0.0251   0.00495
100           0.01515   0.01645    0.08345  0.01485  0.00495
153           0.00945   0.0122     0.0273   0.01135  0.00495
229           0.00715   0.00965    0.01165  0.00915  0.00495
459           0.00495   0.0054     0.00495  0.00495  0.00495

(best coverage when all of community used)

Page 23: SAC TRECK 2008

why do we get these results?

Page 24: SAC TRECK 2008

a) our error measures are not good enough?

$$\mathrm{MAE} = \frac{\sum |p_{a,i} - r_{a,i}|}{N}$$

$$\mathrm{Coverage} = \frac{\#\,\text{uncovered predictions}}{\#\,\text{predictions}}$$

J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. In ACM Transactions on Information Systems, volume 22, pages 5–53. ACM Press, 2004.

S.M. McNee, J. Riedl, and J.A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems. ACM Press, 2006.

$$\mathrm{RMSE} = \sqrt{\frac{\sum (p_{a,i} - r_{a,i})^2}{N}}$$
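The error measures on this slide can be sketched as follows, assuming MAE and RMSE are taken over N (prediction, rating) pairs and that coverage is reported as the fraction of predictions that could not be made. The class `ErrorMeasures`, its methods, and the use of `NaN` to mark an uncovered prediction are hypothetical choices for illustration.

```java
// Hypothetical sketch of the evaluation measures.
class ErrorMeasures {
    // Mean absolute error between predictions p and actual ratings r.
    static double mae(double[] p, double[] r) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            sum += Math.abs(p[i] - r[i]);
        }
        return sum / p.length;
    }

    // Root mean squared error: penalises large errors more heavily than MAE.
    static double rmse(double[] p, double[] r) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double e = p[i] - r[i];
            sum += e * e;
        }
        return Math.sqrt(sum / p.length);
    }

    // Fraction of predictions that could not be made (marked as NaN here).
    static double uncovered(double[] p) {
        int missing = 0;
        for (double v : p) {
            if (Double.isNaN(v)) missing++;
        }
        return (double) missing / p.length;
    }

    public static void main(String[] args) {
        double[] predicted = {4, 3, 5};
        double[] actual = {5, 3, 3};
        System.out.println(mae(predicted, actual));   // prints 1.0
        System.out.println(rmse(predicted, actual));  // about 1.29
        System.out.println(uncovered(new double[]{4.0, Double.NaN, 3.5, Double.NaN})); // prints 0.5
    }
}
```

In the example the single error of 2 lifts RMSE above MAE, which is the kind of difference between measures that question a) is asking about.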

Page 25: SAC TRECK 2008

b) is there something wrong with the dataset?

Page 26: SAC TRECK 2008

c) is user-similarity not strong enough to capture the best recommender relationships in the graph?

Page 27: SAC TRECK 2008

one proposal…

N. Lathia, S. Hailes, L. Capra. Trust-Based Collaborative Filtering. To appear in IFIPTM 2008: Joint iTrust and PST Conferences on Privacy, Trust Management and Security. Trondheim, Norway. June 2008.

is modelling filtering as a trust-management problem a potential solution?

once we do that, more questions arise…

Page 28: SAC TRECK 2008

what other graph properties emerge from kNN collaborative filtering?

how does the graph evolve over time?

current work

N. Lathia, S. Hailes, L. Capra. Evolving Communities of Recommenders: A Temporal Evaluation. Research Note RN/08/01, Department of Computer Science, University College London. Under Submission.

N. Lathia, S. Hailes, L. Capra. kNN User Filtering: A Temporal Implicit Social Network. Current Work.

Page 29: SAC TRECK 2008

read more: http://mobblog.cs.ucl.ac.uk

trust, recommendations, …

neal lathia, stephen hailes, licia capra

department of computer science

university college london

[email protected]

ACM SAC TRECK, Fortaleza, Brazil: March 2008

Trust, Recommendations, Evidence and other Collaboration Know-how

questions?