A Probabilistic Approach to Personalized Tag Recommendation Meiqun Hu, Ee-Peng Lim and Jing Jiang School of Information Systems Singapore Management University

Mar 28, 2015

A Probabilistic Approach to Personalized Tag Recommendation

Meiqun Hu, Ee-Peng Lim and Jing Jiang

School of Information Systems
Singapore Management University

Social Tagging

• Social tagging allows users to annotate resources with tags.
  – organize
    • tags are keywords, serving as (personalized) index terms that group relevant resources
  – store
    • online storage gives mobility and convenience of access
  – share
    • published bookmarks can be viewed by other users
  – explore
    • leverage collective wisdom to find interesting resources

Image credit @ logorunner.com

Personalized Tag Recommendation

• Personalized tag recommendation aims to recommend tags to the query user for annotating the query resource.

• Recommendation eases the tagging process.

Why Personalize Recommendations?

• Tag recommendation should be personalized.
  – users exhibit individualized choice of tag terms
    • e.g., language preference
  – personalized index for personal consumption and consistency

Problem Formulation and A Basic Method

• Problem Formulation: estimate p(t|rq,uq), the probability of tag t given the query resource rq and the query user uq

• A Basic Method: freq-r, to recommend the most frequent tags of the resource
  – assuming that the more people have used a tag, the more likely it will be used again
  – current state of the art in many social tagging sites
  – fails to personalize the recommendations for the query user
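
The freq-r baseline can be sketched in a few lines of Python. This is only an illustration: the (user, resource, tag) triple layout, the function name `freq_r`, and the alphabetical tie-breaking are assumptions, not details given on the slide.

```python
from collections import Counter

def freq_r(assignments, query_resource, top_n=5):
    """Recommend the tags most often assigned to the query resource.

    `assignments` is an iterable of (user, resource, tag) triples; this
    data layout is an assumption made for the sketch.
    """
    counts = Counter(tag for _, resource, tag in assignments
                     if resource == query_resource)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]

# Hypothetical toy data: 'foto' is used once on r1, 'photo' twice.
toy_assignments = [
    ("u1", "r1", "photo"), ("u2", "r1", "photo"),
    ("u3", "r1", "image"), ("u4", "r1", "foto"),
    ("u4", "r2", "web"),
]
top_two = freq_r(toy_assignments, "r1", top_n=2)
```

The ranking ignores the query user entirely, which is why every user sees the same recommendations for a given resource.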

Three Scenarios

Scenario 1: ‘foto’ is an infrequent tag for the resource.

Scenario 2: ‘foto’ has not been used for the resource, but has been used by the user for annotating other resources in the past.

Scenario 3: ‘foto’ has not been used for the resource but has been used by others when annotating other resources.

Collaborative Filtering Method

• A method based on collaborative filtering: knn, to select the top k nearest neighbors and recommend tags used by these neighbors for annotating the resource
  – assuming that there are like-minded users who have annotated the same resource
  – classic collaborative filtering, without ratings
  – addresses scenario 1, but
  – fails scenarios 2 and 3
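
A minimal sketch of the knn idea follows. The slides do not say how neighbors are compared at this point, so the Jaccard similarity over users' tag vocabularies, the similarity-weighted voting, and the name `knn_recommend` are all assumptions chosen for illustration.

```python
from collections import defaultdict

def knn_recommend(assignments, query_user, query_resource, k=2, top_n=5):
    """Recommend tags that the k most similar users put on the query resource."""
    tags_of = defaultdict(set)   # user -> every tag the user has used
    tags_on = defaultdict(set)   # user -> tags the user put on query_resource
    for user, resource, tag in assignments:
        tags_of[user].add(tag)
        if resource == query_resource:
            tags_on[user].add(tag)

    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    q_tags = tags_of.get(query_user, set())
    # Only users who annotated the query resource can act as neighbors.
    candidates = [(jaccard(q_tags, tags_of[u]), u)
                  for u in tags_on if u != query_user]
    neighbours = sorted(candidates, key=lambda su: (-su[0], su[1]))[:k]

    scores = defaultdict(float)
    for sim, u in neighbours:
        for tag in tags_on[u]:
            scores[tag] += sim   # similarity-weighted vote (assumed scheme)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top_n]]

# Hypothetical toy data: u1 shares 'foto' with uq, u2 shares nothing.
knn_toy = [
    ("uq", "r0", "foto"), ("uq", "r0", "netz"),
    ("u1", "r0", "foto"), ("u1", "r1", "foto"), ("u1", "r1", "bild"),
    ("u2", "r0", "web"), ("u2", "r1", "photo"), ("u2", "r1", "image"),
]
knn_result = knn_recommend(knn_toy, "uq", "r1", k=1)
```

With k=1 only u1 votes, so the infrequent tag ‘foto’ surfaces for uq (scenario 1). A tag that no neighbor has put on the resource can never be recommended, which is the limitation noted for scenarios 2 and 3.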

Personomy Translation Method

• To translate the resource’s tags to the user’s personal tags (trans-u)
  – to learn p(‘foto’|uq, ‘photo’)
  – addresses scenario 2, but
  – fails scenario 3, if uq has never used ‘foto’
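
One simple way to estimate such translation probabilities is to count how often the user's own tag co-occurs with a tag that other users gave the same resource. The sketch below does exactly that; the co-occurrence estimator, the triple layout, and the name `learn_translations` are assumptions for illustration and may differ from the estimator actually used in the work.

```python
from collections import defaultdict

def learn_translations(assignments, user):
    """Estimate p(t | user, tr) from co-occurrence on shared resources.

    tr ranges over tags other users gave a resource; t ranges over the
    tags `user` gave the same resource.
    """
    by_resource = defaultdict(lambda: defaultdict(set))  # resource -> user -> tags
    for u, r, t in assignments:
        by_resource[r][u].add(t)

    pair_counts = defaultdict(lambda: defaultdict(float))  # tr -> t -> count
    for posts in by_resource.values():
        if user not in posts:
            continue
        own = posts[user]
        others = set().union(*(tags for u, tags in posts.items() if u != user))
        for tr in others:
            for t in own:
                pair_counts[tr][t] += 1.0

    # Normalize each row so it sums to one.
    return {tr: {t: c / sum(row.values()) for t, c in row.items()}
            for tr, row in pair_counts.items()}

# Hypothetical toy data: uq mostly says 'foto' where others say 'photo'.
trans_toy = [
    ("uq", "r1", "foto"), ("u1", "r1", "photo"),
    ("uq", "r2", "foto"), ("u2", "r2", "photo"), ("u2", "r2", "image"),
    ("uq", "r3", "bild"), ("u3", "r3", "photo"),
]
p_uq = learn_translations(trans_toy, "uq")
```

Because the target tags are drawn only from the user's own vocabulary, a tag the user has never used gets probability zero, matching the scenario 3 limitation above.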

To Address Scenario 3

borrow translation

A PROBABILISTIC FRAMEWORK

1. Personomy Translation
2. A Framework
3. Measuring User Similarity

Borrowing Translations

• To learn p(‘foto’|u,‘photo’) and sim(u,uq)

borrow translation

Personomy Translation

• To learn p(‘foto’|uq,‘photo’)

[Wetzker et al. 2009]

Measuring Similarity between Users

• sim(u,uq)
  – assuming that users are similar if they perform similar translations
  – users are profiled by sets of translation probabilities, e.g.,
      p(‘foto’|u,‘photo’), …, p(‘image’|u,‘photo’)
      p(‘netz’|u,‘web’), …, p(‘internet’|u,‘web’)
  – we adopt distributional divergence to measure (dis)similarity between users
    • JS-divergence, L1-norm, such as in [Lee 1997]

Distributional Divergence between Users

sim(‘photo’)(u,uq)

sim(‘web’)(u,uq)

Σ → sim(u,uq)
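
The per-tag comparison and its aggregation can be sketched as follows. JS-divergence and the L1-norm are computed in their standard form; the mapping from divergence to similarity (10 to the power of minus b times the divergence) and the plain sum over shared source tags are assumptions for illustration, since the transcript does not preserve the exact formulas.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two dict distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(a[k] * math.log2(a[k] / m[k])
                   for k in keys if a.get(k, 0.0) > 0.0)

    return 0.5 * kl(p) + 0.5 * kl(q)

def l1_norm(p, q):
    """L1 distance between two dict distributions (between 0 and 2)."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

def user_similarity(profile_u, profile_q, b=1.0):
    """Sum per-source-tag similarities over tags both users translate.

    profile[tr][t] = p(t | user, tr). The 10 ** (-b * divergence) mapping
    is an assumed way to turn a divergence into a similarity.
    """
    total = 0.0
    for source_tag in set(profile_u) & set(profile_q):
        js = js_divergence(profile_u[source_tag], profile_q[source_tag])
        total += 10.0 ** (-b * js)
    return total

# Hypothetical profiles: identical for 'photo', partly different for 'web'.
prof_u = {"photo": {"foto": 1.0}, "web": {"netz": 0.5, "internet": 0.5}}
prof_q = {"photo": {"foto": 1.0}, "web": {"internet": 1.0}}
sim_uq = user_similarity(prof_u, prof_q)
```

Identical translation distributions contribute 1 each, and the contribution shrinks as the distributions diverge, so a larger b makes the measure more selective.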

Remark on the 3 Scenarios

• This framework is able to address all three scenarios

– addresses scenario 1 by allowing self-translation, e.g., p(‘photo’|u,‘photo’)

– addresses scenario 2 by allowing self-similarity, e.g., sim(uq,uq)

– addresses scenario 3 by enabling borrowed translations
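
To make the three remarks concrete, here is a hedged sketch of how borrowed translations might be combined. The scoring rule (a similarity-weighted sum over users of translation probabilities, weighted by how prominent each source tag is on the resource), the normalization of similarities, and the name `recommend` are assumptions; the slides' exact formula is not preserved in this transcript.

```python
from collections import defaultdict

def recommend(translations, sims, resource_tags, top_n=5):
    """Rank tags by sum_u sim(u, uq) * sum_tr p(t | u, tr) * p(tr | rq).

    translations[u][tr][t] = p(t | u, tr)
    sims[u]                = sim(u, uq), with uq included as its own neighbor
    resource_tags[tr]      = p(tr | rq)
    """
    total_sim = sum(sims.values())
    scores = defaultdict(float)
    for u, sim in sims.items():
        weight = sim / total_sim if total_sim else 0.0
        for tr, p_tr in resource_tags.items():
            for t, p_t in translations.get(u, {}).get(tr, {}).items():
                scores[t] += weight * p_t * p_tr
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top_n]]

# Hypothetical scenario 3: uq has never used 'foto', neighbor u1 has.
toy_translations = {
    "uq": {"photo": {"photo": 1.0}},                # self-translation
    "u1": {"photo": {"foto": 0.8, "photo": 0.2}},   # borrowed translation
}
toy_resource = {"photo": 1.0}
with_neighbours = recommend(toy_translations, {"uq": 1.0, "u1": 0.5}, toy_resource)
only_self = recommend(toy_translations, {"uq": 1.0}, toy_resource)
```

With the neighbor included, ‘foto’ enters the ranking even though uq has never used it; with uq alone, only uq's own translations remain.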

EXPERIMENTS

1. Data Collection
2. Experimental Setup
3. Recommendation Performance

Dataset from BibSonomy

                                  train             validation         test
time frame                        start ~ DEC 08    JAN 09 ~ JUL 09    JUL 09 ~ DEC 09
|R|                               22,389            667                258
|U|                               1,185             136                57
|T|                               13,276            862                525
|A|                               253,615           2,604              1,262
|P|                               64,120            775                279
average posts per user            53.695            5.699              4.895
average tag tokens per post       3.955             3.360              4.523
average distinct tags per user    61.833            13.191             14.667

Note: the time order is train, then validation, then test; users in the test set must have appeared in the validation set.

Experimental Setup

• Methods to compare
  – trans-n1, trans-n2
    • k: {5,10,20,50,100,200,300,400,500}
    • js-divergence, l1-norm
    • b: {1,2,4,8} for js-divergence; b: {1,2,4,8,12,16} for l1-norm
  – trans-u1, trans-u2
  – knn-ur, knn-ut
    • k: {5,10,20,50,100,200,300,400,500}
  – interpolating with freq-r

• Evaluation metric
  – pr-curve at top 5
  – macro-average for users

• Parameter optimization
  – macro-average f1@5
  – global vs. individual settings
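
The evaluation metric can be illustrated with a short sketch of precision, recall, and F1 at the top 5, macro-averaged over users. The data layout, the function names, and the choice to divide precision by the number of tags actually recommended (rather than always by k) are assumptions; each user is assumed to have at least one test post.

```python
def prf_at_k(recommended, relevant, k=5):
    """Precision, recall and F1 of the top-k recommendations for one post."""
    top = recommended[:k]
    hits = len(set(top) & set(relevant))
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def macro_average(per_user_posts, k=5):
    """Average per post within each user, then average across users.

    per_user_posts maps user -> non-empty list of (recommended, relevant).
    """
    user_means = []
    for posts in per_user_posts.values():
        scores = [prf_at_k(rec, rel, k) for rec, rel in posts]
        user_means.append([sum(col) / len(scores) for col in zip(*scores)])
    n = len(user_means)
    return tuple(sum(col) / n for col in zip(*user_means))

# Hypothetical evaluation data for two users with one test post each.
eval_toy = {
    "a": [(["t1", "t2", "t3", "t4", "t5"], {"t1", "t2", "t3", "t4", "x"})],
    "b": [(["t1", "t2", "t3", "t4", "t5"], {"t1", "t2", "t3", "x", "y", "z"})],
}
macro_p, macro_r, macro_f1 = macro_average(eval_toy)
```

Macro-averaging gives every user equal weight regardless of how many posts they have, so heavy taggers do not dominate the score.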

Recommendation Performance: Global Setting

Recommendation Performance: Individual Setting

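Recommendation Case Study

user 920, resource a45…57f
  – tags assigned: 2008, bookmarking, folksonomy, social, spam, folksonomies, tagorapub, web20, 20, integpub, systems, tagger, web
  – top 5 recommendations, trans-u1: diplomathesiscaptchafolksonomybackgroundcloselyrelatedfolksonomy
  – top 5 recommendations, trans-n1: folksonomy, tagging, social, web20, web

user 1119, resource d16…b50
  – tags assigned: it, news, technology, blog, feed, technologie
  – top 5 recommendations, trans-u1: kultur, online, radio, kunst, cd
  – top 5 recommendations, trans-n1: news, web20, blog, software, technology

user 3217, resource 467…655
  – tags assigned: annotation, ontology, knowledge, semantic
  – top 5 recommendations, trans-u1: sql, erd, eclipse
  – top 5 recommendations, trans-n1: tagging, folksonomy, ontology, web20, semantic

Legend: scenario 3 tags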

Conclusion

• We propose a probabilistic framework for the personalized tag recommendation task, which incorporates personomy translation and translations borrowed from neighbors.

• We use distributional divergence to measure similarity between users. Users are similar if they exhibit similar translation behavior.

• We find that the proposed methods outperform both translation by the query user only and classic collaborative filtering.