Evaluation of Collaborative Filtering Algorithms for Recommending Articles on CiteULike

Page 1

Evaluation of Collaborative Filtering Algorithms for Recommending Articles on CiteULike

June 29th, 2009

HT 2009, Workshop “Web 3.0: Merging Semantic Web and Social Web”

Dr. Peter Brusilovsky, Associate Professor
Denis Parra, PhD Student
School of Information Sciences
University of Pittsburgh

Page 2

Outline

• Motivation
• Methods
  – CCF
  – NwCF
  – BM25
• The Study
• Description of the Data
• Results
• Conclusions

Page 3

Motivation

Based on information available on CiteULike:
• Develop user-centered recommendations of scientific articles.
• Investigate the potential of users’ tags in collaborative tagging systems to provide recommendations.
• Compare the accuracy of user-based collaborative filtering methods.

Why CiteULike?
• A popular collaborative tagging system that is more topic-oriented than delicious: its items are article references.
• Familiarity with the system.

Page 4

CiteULike

Page 5

Methods: CCF (1 / 2)

• Classic Collaborative Filtering (CCF): user-based recommendations, using Pearson correlation (user similarity) and adjusted ratings to rank the items to recommend [1].

$$\mathrm{userSim}(u,n) = \frac{\sum_{i \in CR_{u,n}} (r_{u,i} - \bar{r}_u)\,(r_{n,i} - \bar{r}_n)}{\sqrt{\sum_{i \in CR_{u,n}} (r_{u,i} - \bar{r}_u)^2}\;\sqrt{\sum_{i \in CR_{u,n}} (r_{n,i} - \bar{r}_n)^2}}$$

$$\mathrm{pred}(u,i) = \bar{r}_u + \frac{\sum_{n \in \mathrm{neighbors}(u)} \mathrm{userSim}(u,n)\,(r_{n,i} - \bar{r}_n)}{\sum_{n \in \mathrm{neighbors}(u)} \mathrm{userSim}(u,n)}$$

where $CR_{u,n}$ is the set of items co-rated by users $u$ and $n$, and $\bar{r}_u$ denotes user $u$'s mean rating.
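Below is a minimal Python sketch of these two formulas, assuming ratings are stored as a dict-of-dicts {user: {item: rating}}; all names (user_sim, pred, ratings) are illustrative, not taken from the authors' implementation.

from math import sqrt

def user_sim(ratings, u, n):
    # Pearson correlation over the items co-rated by u and n (CR_{u,n}).
    common = set(ratings[u]) & set(ratings[n])
    if not common:
        return 0.0
    mean_u = sum(ratings[u].values()) / len(ratings[u])
    mean_n = sum(ratings[n].values()) / len(ratings[n])
    num = sum((ratings[u][i] - mean_u) * (ratings[n][i] - mean_n) for i in common)
    den = (sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
           * sqrt(sum((ratings[n][i] - mean_n) ** 2 for i in common)))
    return num / den if den else 0.0

def pred(ratings, u, i, neighbors):
    # Mean-adjusted, similarity-weighted average of the neighbors' ratings.
    mean_u = sum(ratings[u].values()) / len(ratings[u])
    num = den = 0.0
    for n in neighbors:
        if i not in ratings[n]:
            continue  # only neighbors who rated item i contribute
        sim = user_sim(ratings, u, n)
        mean_n = sum(ratings[n].values()) / len(ratings[n])
        num += sim * (ratings[n][i] - mean_n)
        den += sim
    return mean_u + num / den if den else mean_u

# Example: predict u1's rating for item "d" from two neighbors.
ratings = {"u1": {"a": 4, "b": 3, "c": 5},
           "u2": {"a": 5, "b": 2, "c": 4, "d": 5},
           "u3": {"a": 3, "b": 2, "d": 2}}
print(round(pred(ratings, "u1", "d", ["u2", "u3"]), 2))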

Page 6

Methods: CCF (2 / 2)

[Figure: worked example of CCF predictions on a small user-item rating matrix]

Page 7

Methods: NwCF (1 / 2)

• Neighbor-weighted Collaborative Filtering (NwCF): similar to CCF, but incorporates the number of neighbors rating an item into the ranking formula for recommended items.

The user similarity $\mathrm{userSim}(u,n)$ is the same Pearson correlation used in CCF; the prediction is then re-weighted by the number of neighbors $\mathrm{nbr}(i)$ who rated item $i$:

$$\mathrm{pred}'(u,i) = \log_{10}\left(1 + \mathrm{nbr}(i)\right) \cdot \mathrm{pred}(u,i)$$
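A one-function sketch of the NwCF adjustment, reusing pred() and the ratings structure from the CCF sketch above; names remain illustrative.

from math import log10

def nwcf_pred(ratings, u, i, neighbors):
    # Number of neighbors who rated item i; more raters -> higher confidence.
    nbr = sum(1 for n in neighbors if i in ratings[n])
    # Scale the CCF prediction by log10(1 + nbr(i)).
    return log10(1 + nbr) * pred(ratings, u, i, neighbors)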

Page 8

Methods: NwCF (2 / 2)

[Figure: worked example of NwCF re-ranking on the same rating matrix]

Page 9

Methods: BM25 (1 / 2)

• BM25: we obtain the similarity between users (neighbors) by treating each user's set of tags as a “document” and computing the Okapi BM25 (a probabilistic IR model) Retrieval Status Value [2].

$$\mathrm{pred}'(u,i) = \log_{10}\left(1 + \mathrm{nbr}(i)\right) \cdot \mathrm{pred}(u,i)$$

$$RSV_d = \sum_{t \in q} \mathrm{IDF}_t \cdot \frac{(k_1 + 1)\,\mathrm{tf}_{t,d}}{k_1\left((1 - b) + b\,(L_d / L_{ave})\right) + \mathrm{tf}_{t,d}} \cdot \frac{(k_3 + 1)\,\mathrm{tf}_{t,q}}{k_3 + \mathrm{tf}_{t,q}}$$

where $\mathrm{tf}_{t,d}$ and $\mathrm{tf}_{t,q}$ are the frequencies of term $t$ in document $d$ and query $q$, $L_d$ and $L_{ave}$ are the document length and average document length, and $k_1$, $b$, $k_3$ are tuning parameters.
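A minimal sketch of this similarity computation, with the active user's tags as the query and every other user's tag multiset as a "document". The parameter values k1 = 1.2, b = 0.75, k3 = 8 are common defaults assumed here (the slides do not state them), and IDF_t = log(N / df_t) is one standard choice.

from math import log
from collections import Counter

def bm25_rsv(query_tags, doc, docs, k1=1.2, b=0.75, k3=8.0):
    # query_tags: the active user's tags; doc: a candidate neighbor's tag
    # multiset (Counter); docs: tag multisets of all candidate neighbors.
    N = len(docs)
    L_ave = sum(sum(d.values()) for d in docs) / N  # average "document" length
    L_d = sum(doc.values())                         # this neighbor's length
    rsv = 0.0
    for t, tf_tq in Counter(query_tags).items():
        df = sum(1 for d in docs if t in d)         # document frequency of tag t
        tf_td = doc.get(t, 0)
        if df == 0 or tf_td == 0:
            continue
        idf = log(N / df)
        rsv += (idf
                * ((k1 + 1) * tf_td) / (k1 * ((1 - b) + b * L_d / L_ave) + tf_td)
                * ((k3 + 1) * tf_tq) / (k3 + tf_tq))
    return rsv

# Example: score one candidate neighbor against a user's tags.
docs = [Counter({"recommender": 3, "tagging": 2}), Counter({"semantics": 4})]
print(bm25_rsv(["recommender", "tagging"], docs[0], docs))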

Page 10

Methods: BM25 (2 / 2)

[Table: worked BM25 example scoring query terms against Doc_1, Doc_2, and Doc_3]

Page 11

The Study

• 7 subjects.
• For each subject, four lists of 10 recommendations each were generated (CCF, NwCF, BM25_10, BM25_20).
• The four lists were combined and sorted randomly; because recommendations overlapped, each subject evaluated fewer than 40 items.
• Subjects were asked to rate each item's relevance (relevant / somewhat relevant / not relevant) and novelty (novel / somewhat novel / not novel).

Page 12

Description of the Data

Crawled CiteULike (CUL) for 20 “center users” (only 7 were used in the study).

Annotation: tuple {user, article, tag}

Item          # of unique instances
users         358
articles      186,122
tags          51,903
annotations   902,711

Page 13

Results

[Figure: six result panels comparing the four methods: (a) nDCG, (b) Average Novelty, (c) Precision_2@5, (d) Precision_2@10, (e) Precision_2_1@5, (f) Precision_2_1@10]
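The slide does not define the precision variants; a plausible reading, assumed here, is that relevance is coded 2 = relevant, 1 = somewhat relevant, 0 = not relevant, with Precision_2@k counting only 2-ratings in the top k as hits and Precision_2_1@k counting both 2s and 1s. A sketch under that assumption:

def precision_at_k(ranked_judgments, k, hit_levels):
    # ranked_judgments: relevance ratings of recommended items, in rank order.
    return sum(1 for r in ranked_judgments[:k] if r in hit_levels) / k

ranked = [2, 1, 0, 2, 1, 0, 0, 2, 1, 0]       # hypothetical judgments
p2_at_5 = precision_at_k(ranked, 5, {2})      # Precision_2@5   -> 0.4
p21_at_5 = precision_at_k(ranked, 5, {2, 1})  # Precision_2_1@5 -> 0.8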

Page 14

Conclusions

• The rating scale must be considered carefully in a CF approach.

• NwCF, which incorporates the number of raters, decreases the uncertainty produced by items with too few ratings.

• The tag-based user similarity approach shows interesting results, which lead us to consider it a valid alternative to Pearson correlation in CF algorithms.

• We will incorporate more users in our future studies to make the results more conclusive.

Page 15

Questions?

Page 16

Bibliography

• [1] Schafer, J. B., Frankowski, D., Herlocker, J., and Sen, S. 2007. Collaborative Filtering Recommender Systems. In The Adaptive Web (May 2007), 291-324.

• [2] Manning, C. D., Raghavan, P., and Schütze, H. 2008. Introduction to Information Retrieval. Cambridge University Press.