Dr. Guandong Xu Intelligent Web and Information system Department of Computer Science Aalborg University The Research Progress of Recommender Systems in.
Transcript
Slide 1
Dr. Guandong Xu Intelligent Web and Information system
Department of Computer Science Aalborg University The Research
Progress of Recommender Systems in Social Tagging Systems
Slide 2
Outline Why recommender systems The state-of-the-art of
recommender systems Social tagging systems Tag-based recommender
system Personalized recommendation Tag recommendation User
profiling Open research questions Conclusion Appendix: Our recent
work on group approaches
Slide 3
Why recommender systems The Internet computing era Information
overload Low precision: retrieved info is not what you need Low
recall: the correctly relevant info is not exhaustively returned
Example:
Slide 4
Why recommender systems No personalization: different users are
returned the same search results (same results for everyone). Hence
personalization or recommendation.
Slide 5
Why Recommender System "GroupLens: An open architecture for
collaborative filtering of netnews", Resnick, P.; Iacovou, N.;
Suchak, M.; Bergstrom, P.; Riedl, J., 1994 ACM Conference on
Computer Supported
Slide 6
Outline Why recommender systems The state-of-the-art of
recommender systems Social tagging systems Tag-based recommender
system Personalized recommendation Tag recommendation User
profiling Open research questions Conclusion Appendix: Our recent
work on group approaches
Slide 7
Why Recommender Systems Recommender systems (also called
recommendation systems or recommendation engines) are an information
filtering technique that recommends information items (films,
television, video on demand, music, books, news, images, web pages,
etc.) that a user is likely to be interested in. Main approaches:
content-based approach, collaborative filtering approach.
Slide 8
Traditional Recommendation Methodology (Categories: Methodology)
Content-based: TF-IDF; Bayesian classifiers; machine learning
(clustering, decision trees, artificial neural networks)
Collaborative recommendation: memory-based; model-based (K-means
clustering, Bayesian model, probabilistic relational model, linear
regression)
Hybrid: adding content-based characteristics to collaborative
models; adding collaborative characteristics to content-based
models; a single unifying recommendation model
Slide 9
Recommender System Categories Content-based recommendations
Collaborative recommendations Hybrid approaches [Figure:
content-based recommends items similar to those the user preferred;
collaborative recommends items preferred by users with similar
taste; hybrid combines content-based and collaborative.]
Slide 10
Example for Content-based approach Considering a recommendation
scenario Page 1: Department of Computer Science at Aalborg
University. Page 2: Department of Health Science and Technology.
Search query: Computer Science R1 = {"Department", "Computer
Science", "Aalborg University"} R2 = {"Department", "Health
Science", "Technology"} Use TF-IDF (term frequency-inverse document
frequency) Result: R1
Slide 11
Example for Content-based approach Department of Computer
Science at Aalborg University; Department of Health Science and
Technology R1 = {"Department", "Computer Science", "Aalborg
University"} R2 = {"Department", "Health Science", "Technology"}
Query: Computer Science TF-IDF (term frequency-inverse document
frequency) Result: R1
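The TF-IDF ranking in this example can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slide: whitespace tokenization, term frequency normalized by page length, and idf = log(N / df). The names `pages`, `tfidf_vectors` and `score` are hypothetical.

```python
import math
from collections import Counter

# Toy corpus taken from the two pages named on the slide.
pages = {
    "R1": "Department of Computer Science at Aalborg University",
    "R2": "Department of Health Science and Technology",
}

def tokenize(text):
    """Lower-case and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Return {doc_id: {term: tf-idf weight}} with idf = log(N / df)."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: in how many pages each term occurs.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        vectors[doc_id] = {
            term: (count / len(toks)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return vectors

def score(query, vector):
    """Sum the tf-idf weights of the query terms that occur in the page."""
    return sum(vector.get(term, 0.0) for term in tokenize(query))

vectors = tfidf_vectors(pages)
scores = {doc_id: score("Computer Science", vec) for doc_id, vec in vectors.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

"Science" occurs in both pages, so its idf is zero and it does not discriminate; "Computer" occurs only in Page 1, which is why R1 is ranked first.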
Slide 12
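Principle of Collaborative filtering Two kinds of neighborhood
approaches: User-based: select the K most similar users (KNN) and
recommend what they preferred. Item-based: select the item set
closest to what the user already preferred. Both are usually classed
as memory-based; model-based methods instead learn a model (e.g.
clustering, Bayesian models) from the ratings.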
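A minimal sketch of the user-based (KNN) variant follows. The rating data, the choice of cosine similarity and the similarity-weighted average are assumptions made for illustration; the slide does not prescribe them.

```python
import math

# Hypothetical explicit ratings: user -> {item: rating}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 5, "B": 3, "C": 4, "D": 5},
    "carol": {"A": 1, "B": 5, "D": 1},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def predict(user, item, k=2):
    """Predict a rating as the similarity-weighted mean over the k nearest
    users who have rated the item."""
    candidates = [
        (cosine(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbours = sorted(candidates, reverse=True)[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None  # no usable neighbour: cannot predict
    return sum(sim * rating for sim, rating in neighbours) / total

print(predict("alice", "D"))
```

Alice's ratings match Bob's much more closely than Carol's, so the prediction for item D is pulled towards Bob's rating.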
Slide 13
Example for Collaborative Filtering Example 2 in Amazon.com:
The algorithm generates recommendations of the form "customers who
bought this book also bought other books" (customers with similar
preferences to the user).
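The "also bought" idea can be illustrated by counting co-purchases. This is a simplified sketch with made-up baskets, not Amazon's actual algorithm; `baskets`, `co_purchase_counts` and `also_bought` are hypothetical names.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase baskets, one set of books per customer.
baskets = [
    {"book_a", "book_b"},
    {"book_a", "book_b", "book_c"},
    {"book_b", "book_d"},
    {"book_a", "book_b"},
]

def co_purchase_counts(baskets):
    """For every book, count how often each other book shares a basket with it."""
    counts = {}
    for basket in baskets:
        for x, y in permutations(basket, 2):
            counts.setdefault(x, Counter())[y] += 1
    return counts

def also_bought(item, counts, n=2):
    """Return the n books most often bought together with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(n)]

counts = co_purchase_counts(baskets)
print(also_bought("book_a", counts, n=1))
```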
Slide 14
Recommendations
Slide 15
Similarities and Differences (Content-based recommendations vs.
Collaborative Filtering recommendations) Similarity is computed on:
vectors of TF-IDF weights (content-based) vs. vectors of the actual
user-specified ratings (collaborative filtering)
Slide 16
Limitations (Categories: Limitations)
Content-based (by keywords): limited content analysis;
over-specialization; new user problem
Collaborative recommendation: new user problem (cold start); new
item problem (cold start); sparsity problem
Hybrid: N/A
Slide 17
Some Extending Capabilities (1/2) Comprehensive Understanding
of Users and Items Extensions for Model-Based Recommendation
Techniques Multidimensionality of Recommendations: extend the
2-dimensional space (User x Item) to a multi-dimensional space
(User x Item x other1 x other2), e.g. User, Movie, Time, Place
Slide 18
Some Extending Capabilities (2/2) Multi-criteria Ratings, e.g.
Restaurant (food, décor, service, price) Non-intrusiveness
Flexibility: users' flexibility Effectiveness of Recommendations:
metrics related
Slide 19
Insights of recommender systems A close look at recommender
systems from different perspectives
Slide 20
What to do with data - implementation Two kinds of problems with
data: Information retrieval (IR): static content, dynamic query
-> modeling content (organized with index) Information Filtering
(IF): dynamic content, static query -> modeling query (organized
as filters) Recommendation is between IR and IF since the content
varies slowly and the queries depend on few parameters. Methods of
both IR and IF are then used to reduce computation at query
time.
Slide 21
General purpose Top-k filtering: list of "best" items (main
usage) or anti-spam Items correlation: find similar items
Prediction of rating: predict the rating for any pair of a user
and an item (more general)
Slide 22
Degree of personalization Generic: everyone receives same
recommendations Demographic: everyone in the same category receives
same recommendations Contextual: recommendation depends only on
current activity Persistent: recommendation depends on long-term
interests
Slide 23
What the data can be Context of the current page (current request,
item currently explored and structured content about this context)
History of the current user on the system (explicit or implicit
ratings) History of all users on the system History of the current
user on multiple systems, the whole web or even on their computer
History of all users on multiple systems, the whole web or even
their computers
Slide 24
How to design a Recommender System Explicit data: rating data
(rate a film on Netflix, Like or Dislike on YouTube) Implicit data:
logs (users' activities, i.e. the implicit feedback) Recommender
system based on users' data
Slide 25
Emergence of New Recommendation Approaches Collaborative
Filtering (Social Recommender) Compare with traditional content
based approach Recommendation from friends Daily recommendation
from friends: news feeds, Facebook, re-tweets Recommendation over
social media (blog, YouTube) Recommendation by using social data
Social network Social tagging
Slide 26
Multi-Relational Social Data
http://www.dasfa.net/wiki/index.php?title=Image:Metafac.png Node:
facet Hyper-edge: relationship We are in a big social network.
Slide 27
Recommendation from friends- Facebook
Slide 28
Social recommendation by social media
Slide 29
Social relationship is powerful G. Groh et al., Recommendations
in taste related domains, GROUP'07, November 4-7, 2007, Sanibel
Island, Florida, USA The Social Filtering (SF) approach outperforms
the CF approach in the experiments (SF vs. CF)
Slide 30
Recommender System Overview
Input: user-item ratings; social relations; social tagging; query;
time; location
Algorithms: user-item KNN; clustering-based; graph-based; matrix
factorization; information diffusion; probabilistic model
Output: information items; tags; merchandise/ads; persons;
communities
Slide 31
Outline Why recommender systems The state-of-the-art of
recommender systems Social tagging systems Tag-based recommender
system Personalized recommendation Tag recommendation User
profiling Open research questions Conclusion Appendix: Our recent
work on group approaches
Slide 32
Tags are personal annotations [Figure: User1 to User5 assign
tags to resources] A tag is: > Metadata > Index > A user's personal
opinion expression > Implicit rating or voting on the tagged
information resources or items.
Slide 33
Tagging Types Self-tagging Users can only tag their own
contributions Permission-based Users decide who can tag their
resources Free-for-all Any user can tag any resource
Slide 34
Tagging support Blind tagging: users cannot see the other tags
assigned to the resource they're tagging Viewable tagging: users can
see the other tags assigned to the resource they're tagging
Suggestive tagging: users see suggested tags for the resource
they're tagging
Slide 35
Aggregation of Tag Bag-model Same tag can be assigned to a
resource multiple times. (Delicious) Set-model A tag can be applied
only once to a resource. (Flickr)
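The difference between the two aggregation models can be shown with a multiset and a set. The tag assignments below are invented for illustration.

```python
from collections import Counter

# Hypothetical (user, tag) assignments for one resource.
assignments = [
    ("u1", "python"),
    ("u2", "python"),
    ("u3", "tutorial"),
    ("u1", "tutorial"),
    ("u4", "python"),
]

# Bag model: the same tag may be assigned many times, so multiplicity is kept.
bag_model = Counter(tag for _, tag in assignments)

# Set model: each tag is recorded at most once per resource.
set_model = {tag for _, tag in assignments}

print(bag_model)
print(set_model)
```

The bag model keeps the counts that later allow tag popularity to be used as an implicit vote; the set model only records whether a tag is present.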
Slide 36
Tag Temporal Behavior over time Tags convergence: the tags
assigned to a certain Web resource tend to stabilize and to become
the majority. Tags divergence: tag sets don't converge to a smaller
group of more stable tags, and the tag distribution continually
changes. Tags periodicity: tags evolve and decay with
time.
Slide 37
Outline Why recommender systems The state-of-the-art of
recommender systems Social tagging systems Tag-based recommender
system Personalized recommendation Tag recommendation User
profiling Open research questions Conclusion Appendix: Our recent
work on group approaches
Slide 38
Tag based RS Tag-based Recommender System [Figure: users annotate
resources with tag sets, e.g. {t1, t2, t3}, {t7, t2, t5},
{t1, t8, t7}, {t1, t8, t9}]
Slide 39
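Extension of User-Item (Tso-Sutter et al. 2008) User tags are
treated as additional items and item tags as additional users; this
reduces the three-dimensional user-tag-item relation to
two-dimensional user-item matrices.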
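The extension can be sketched as follows: each user's profile is extended with that user's tags as pseudo-items, and each item's profile with its tags as pseudo-users. This illustrates only the matrix extension, not the full fusion method of the paper; the data, the Jaccard similarity and all function names are assumptions.

```python
# Hypothetical tag assignments (user, tag, item).
assignments = [
    ("u1", "jazz", "i1"),
    ("u2", "jazz", "i2"),
    ("u3", "rock", "i3"),
]

def extended_user_profiles(assignments):
    """User rows of the user-item matrix, extended with user tags as pseudo-items."""
    profiles = {}
    for user, tag, item in assignments:
        profiles.setdefault(user, set()).update({("item", item), ("tag", tag)})
    return profiles

def extended_item_profiles(assignments):
    """Item columns of the user-item matrix, extended with item tags as pseudo-users."""
    profiles = {}
    for user, tag, item in assignments:
        profiles.setdefault(item, set()).update({("user", user), ("tag", tag)})
    return profiles

def jaccard(a, b):
    """Jaccard similarity of two sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

users = extended_user_profiles(assignments)
items = extended_item_profiles(assignments)
sim_u1_u2 = jaccard(users["u1"], users["u2"])
sim_u1_u3 = jaccard(users["u1"], users["u3"])
print(sim_u1_u2, sim_u1_u3)
```

Users u1 and u2 share no item, so a plain user-item matrix would give them zero similarity; the shared tag makes them neighbours in the extended matrix.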
Slide 40
Folksonomy model Definition: A folksonomy is a quadruple
$F := (U, T, R, Y)$, where $U$, $T$, $R$ are finite sets of
instances of users, tags, and resources, and $Y$ defines a relation,
the tag assignment, between these sets, that is,
$Y \subseteq U \times T \times R$. Converting the Folksonomy into an
Undirected Graph. First we convert the folksonomy $F = (U, T, R, Y)$
into an undirected tripartite graph $G_F = (V, E)$ as follows.
$V = U \cup T \cup R$
$E = \{\{u, t\}, \{t, r\}, \{u, r\} \mid (u, t, r) \in Y\}$,
with each edge $\{u, t\}$ being weighted with
$|\{r \in R : (u, t, r) \in Y\}|$, each edge $\{t, r\}$ with
$|\{u \in U : (u, t, r) \in Y\}|$, and each edge $\{u, r\}$ with
$|\{t \in T : (u, t, r) \in Y\}|$. Employ: Adapted PageRank Algorithm
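The conversion can be written down directly from the definition. In this sketch the triples in `Y` are invented, and nodes are stored as (kind, name) pairs so that the three node sets stay disjoint; both choices are assumptions for illustration.

```python
from collections import Counter

# Hypothetical tag assignments Y as (user, tag, resource) triples.
Y = {
    ("u1", "t1", "r1"),
    ("u1", "t1", "r2"),
    ("u2", "t1", "r1"),
    ("u2", "t2", "r1"),
}

def edge(a, b):
    """An undirected edge is an unordered pair of nodes."""
    return frozenset((a, b))

def folksonomy_graph(Y):
    """Return (V, weights) of the undirected tripartite graph G_F.

    Because Y is a set of distinct triples, adding 1 per triple gives
    exactly the weights of the definition, e.g. the weight of {u, t}
    is the number of resources r with (u, t, r) in Y.
    """
    weights = Counter()
    nodes = set()
    for u, t, r in Y:
        nu, nt, nr = ("user", u), ("tag", t), ("resource", r)
        nodes.update((nu, nt, nr))
        weights[edge(nu, nt)] += 1
        weights[edge(nt, nr)] += 1
        weights[edge(nu, nr)] += 1
    return nodes, weights

V, weights = folksonomy_graph(Y)
print(len(V), weights[edge(("user", "u1"), ("tag", "t1"))])
```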
Slide 41
FolkRank (Hotho et al., ESWC 2006) PageRank: a page is important if
there are many pages linking to it, and if those pages are important
themselves. FolkRank: a resource which is tagged with important tags
by important users becomes important itself. (The same holds,
symmetrically, for tags and users.) Difference: PageRank works on
directed graphs, whereas the folksonomy graph used by FolkRank has
no direction. Use: recommend a set of related users and resources
for a given tag.
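A simplified sketch of the weight-spreading idea is given below. It runs a PageRank-style iteration with a preference vector on the undirected folksonomy graph and scores each node by the difference between a run biased towards one tag and an unbiased run. The damping factor, the 50/50 preference split, the use of the same damping factor for the baseline run and the toy data are all assumptions, so this is not the exact procedure of the paper.

```python
from collections import defaultdict

# Hypothetical tag assignments (user, tag, resource).
Y = {
    ("u1", "python", "r1"),
    ("u2", "python", "r2"),
    ("u3", "java", "r3"),
}

def build_adjacency(Y):
    """Weighted, symmetric adjacency of the tripartite folksonomy graph."""
    adj = defaultdict(lambda: defaultdict(float))
    for u, t, r in Y:
        nodes = (("user", u), ("tag", t), ("resource", r))
        for a in nodes:
            for b in nodes:
                if a != b:
                    adj[a][b] += 1.0
    return adj

def adapted_pagerank(adj, preference, d=0.7, iterations=50):
    """Iterate w <- d * (spread along weighted edges) + (1 - d) * preference."""
    nodes = list(adj)
    w = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_w = {n: (1.0 - d) * preference.get(n, 0.0) for n in nodes}
        for n in nodes:
            out = sum(adj[n].values())
            for m, weight in adj[n].items():
                new_w[m] += d * w[n] * weight / out
        w = new_w
    return w

def folkrank(adj, preferred, d=0.7):
    """Score = biased run minus unbiased run (simplified differential ranking)."""
    nodes = list(adj)
    uniform = {n: 1.0 / len(nodes) for n in nodes}
    biased = {n: 0.5 / len(nodes) for n in nodes}
    biased[preferred] += 0.5  # half of the preference mass goes to the query node
    w0 = adapted_pagerank(adj, uniform, d)
    w1 = adapted_pagerank(adj, biased, d)
    return {n: w1[n] - w0[n] for n in nodes}

adj = build_adjacency(Y)
scores = folkrank(adj, ("tag", "python"))
resources = sorted(
    (n for n in scores if n[0] == "resource"), key=scores.get, reverse=True
)
print(resources)
```

Resources tagged "python" gain weight in the biased run, while the resource that is only connected to "java" loses weight, so it ends up last.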
Slide 42
Difference highlights Documents that are of potential interest
to a user can be suggested to him. When using a certain tag, other
related tags can be suggested. Folk-Rank additionally considers the
tagging behavior of other users. Other users that work on related
topics can be made explicit, improving thus the knowledge transfer
within organizations and fostering the formation of
communities.
Slide 43
Tensor Factorization (Symeonidis et al. 2008; Rendle et al. 2009)
Slide 44
Tensor factorization HOSVD (Symeonidis et al., TKDE 2010) Basic
idea: factorize the user-tag-resource tensor by optimizing the
square loss. Other optimization measures can be used, e.g. AUC
(Area Under Curve) (Rendle et al., SIGKDD 2009)
Slide 45
Others The GroupMe! system (Abel et al. 2007). PLSA
(Probabilistic Latent Semantic Analysis) (Wetzker et al. 2009).
Tag-based profile construction: naive (Szomszor et al. 2007),
co-occurrence (Michlmayr and Cayzer 2007) and adaptation approach
(Dorigo and Caro 1999). WebDCC (Web Document Conceptual Clustering)
(Godoy and Amandi 2006). Music recommendation system (Uitdenbogerd
and van Schyndel 2002).
Slide 46
The limitations Tags have little semantics and many variations
The correlation between sets of tags Uncontrolled vocabulary: users
tag in their own ways Redundancy and ambiguity in the tag database
Tags do not describe the document, but a judgment.
Tags in different languages, e.g. Vienna, Wien.
Slide 47
Data Quality How to manage the cold start problem (new user,
new item) or more generally data sparsity? The system must have a
special behavior for users with few ratings (e.g. non-personalized
recommendation) The system may use bot-users to rate new items
according to the content
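The "special behavior for users with few ratings" can be sketched as a fallback to non-personalized, popularity-based recommendation. The threshold `MIN_RATINGS`, the data and the placeholder `personalized` function are assumptions for illustration.

```python
from collections import Counter

MIN_RATINGS = 3  # assumed threshold below which a user counts as "cold"

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "old": {"a": 5, "b": 4, "c": 3},
    "u2": {"a": 4, "b": 2},
    "u3": {"a": 5},
    "new": {},
}

def most_popular(ratings, n):
    """Non-personalized ranking: the items rated by the most users."""
    counts = Counter(item for user_ratings in ratings.values() for item in user_ratings)
    return [item for item, _ in counts.most_common(n)]

def recommend(user, ratings, personalized, n=2):
    """Use the personalized recommender only when the user has enough ratings."""
    history = ratings.get(user, {})
    if len(history) < MIN_RATINGS:
        seen = set(history)
        popular = most_popular(ratings, n + len(seen))
        return [item for item in popular if item not in seen][:n]
    return personalized(user, n)

def personalized(user, n):
    """Placeholder standing in for any personalized recommender."""
    return ["personalized-item"][:n]

print(recommend("new", ratings, personalized))
print(recommend("old", ratings, personalized))
```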
Slide 48
Confidence and display How to improve the confidence in the
recommender system? By providing good recommendations! By providing
information about each recommendation (e.g. ratings, explanation)
How to display recommendations? The recommended item must be easy
for the user to identify and evaluate Ratings must be easy to
understand and meaningful Explanations must provide a quick way for
the user to evaluate the recommendation
Slide 49
Interaction (1/2) How to interact with the user? You may ask
the user to correct a prediction You must update your rating matrix
with this correction and update your recommendation accordingly You
may want to learn the key parameters of your algorithm using the
feedback You may ask the user to provide feedback on the
explanation You may ask the user to provide more context for the
current task (e.g. by using categories)
Slide 50
Interaction (2/2) How to manage scalability? Applications
usually need real-time prediction computation The computation time
has to scale with the number of users and items How to manage
temporal changes? You cannot run your algorithms each time a
modification occurs The off-line computation must be robust to small
modifications and scheduled accordingly The on-line computation must
benefit from modifications The computation must be done
incrementally when possible The system may "forget" older
information
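One possible way to let a system "forget" older information is to weight each rating by an exponential time decay. The half-life parameter and the event format below are assumptions, not something the slides specify.

```python
def decayed_score(events, now, half_life_days=30.0):
    """Time-decayed average rating.

    events: list of (timestamp_in_days, rating) pairs.
    A rating loses half of its weight every `half_life_days` days.
    """
    numerator = 0.0
    denominator = 0.0
    for timestamp, rating in events:
        weight = 0.5 ** ((now - timestamp) / half_life_days)
        numerator += weight * rating
        denominator += weight
    return numerator / denominator if denominator else None

# Hypothetical history: a low rating 60 days ago, a high rating today.
history = [(0.0, 1.0), (60.0, 5.0)]
print(decayed_score(history, now=60.0))
```

With a 30-day half-life the 60-day-old rating keeps a quarter of its weight, so the score stays close to the recent rating instead of the plain average of 3.0.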
Slide 51
Data and security (1/2) How to ensure privacy? If the profile
is public, there are no privacy issues. If the profile is private,
the system should avoid giving too much information, using
anonymity techniques. This problem is even worse across
systems
Slide 52
Data and security (2/2) How to design algorithms that are
robust against manipulation? Attacks are characterized by a number
of false users and knowledge of the system. The attacker wants to
modify the distribution of the ratings without being easy to
detect. There are many known attacks such as sampling attack,
random attack, average attack, bandwagon attack... There are many
techniques to detect attacks: find profiles which are unlikely
according to the global distribution of profiles, find profile
updates which are unlikely according to the global distribution of
updates...
Slide 53
Improvement (1/4) How to manage diversity? Recommending very
close items could be counter-productive (since they may be
substitutes) -> Systems can use correlation between items (e.g.
based on content) to filter items Recommending what everybody likes
and what the user already knows is not really interesting -> Systems
can try more risky predictions (e.g. high score with low
confidence)
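Filtering near-duplicate items from a ranked list can be sketched greedily: walk down the ranking and skip any item that is too similar to one already selected. The keyword sets, the Jaccard similarity and the threshold are assumptions for illustration.

```python
# Hypothetical content descriptions (keyword sets) of candidate items.
keywords = {
    "phone_a": {"phone", "android"},
    "phone_b": {"phone", "android"},
    "case": {"phone", "accessory"},
}

def jaccard(a, b):
    """Content-based similarity between two keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def diversify(ranked, threshold=0.8, k=2):
    """Keep the best-ranked items, skipping those too similar to a kept item."""
    selected = []
    for item in ranked:
        if all(jaccard(keywords[item], keywords[s]) < threshold for s in selected):
            selected.append(item)
        if len(selected) == k:
            break
    return selected

# Items are assumed to be already sorted by predicted score.
print(diversify(["phone_a", "phone_b", "case"]))
```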
Slide 54
Improvement (2/4) How to use social networks to improve
recommendations? Users are likely to like what their friends like.
Exploring the social graph is a direct way to do recommendation
Correlation between users could be biased by the social graph
Potential friends could be suggested using recommendation
techniques.
Slide 55
Improvement (3/4) How to recommend for a group? The
recommendation for the group could be an aggregation of the
recommendations for the members. The group could be seen as a user
(with aggregation functions to reconcile ratings)
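Two simple aggregation functions can be compared on a toy example: the average of the members' predicted ratings, and the minimum (often called "least misery", a name not used on the slide). The predicted ratings are invented for illustration.

```python
from statistics import mean

# Hypothetical predicted ratings per group member.
predictions = {
    "ann": {"film_x": 4.5, "film_y": 3.0},
    "ben": {"film_x": 1.0, "film_y": 3.5},
    "cat": {"film_x": 5.0, "film_y": 3.0},
}

def aggregate(predictions, strategy):
    """Combine member predictions per item with the given aggregation function."""
    common_items = set.intersection(*(set(p) for p in predictions.values()))
    return {
        item: strategy([member[item] for member in predictions.values()])
        for item in common_items
    }

average = aggregate(predictions, mean)
least_misery = aggregate(predictions, min)

print(max(average, key=average.get))
print(max(least_misery, key=least_misery.get))
```

The two strategies disagree here: the average favours film_x, while the minimum favours film_y because one member strongly dislikes film_x. Which aggregation is appropriate depends on the group.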
Slide 56
Improvement (4/4) How to evaluate recommendation? There is a
lot of noise in the data, which could be the main source of errors
It is more difficult to evaluate when there are no ratings. It is
even more difficult if you want to improve recommendation by adding
constraints like diversity
Slide 57
Possible further research Incorporate relevance feedback into
recommendation Using tag clouds to improve user experience
Examining quantitative aspect of folksonomy and the use of tags
Examining user behavior based on implicit feedback Tag ambiguity
and sparsity Tag uniformity
Slide 58
Conclusion The rationale of recommender systems The
state-of-the-art progress of recommender systems Social tagging
systems demand new recommender systems Techniques in tag-based
recommender systems Some open research questions & possible
further directions