Dr. Guandong Xu, Intelligent Web and Information system, Department of Computer Science, Aalborg University. The Research Progress of Recommender Systems in Social Tagging Systems.

Dec 19, 2015

Transcript
  • Slide 1
  • Dr. Guandong Xu, Intelligent Web and Information system, Department of Computer Science, Aalborg University. The Research Progress of Recommender Systems in Social Tagging Systems
  • Slide 2
  • Outline: Why recommender systems; The state of the art of recommender systems; Social tagging systems; Tag-based recommender systems; Personalized recommendation; Tag recommendation; User profiling; Open research questions; Conclusion; Appendix: our recent work on group approaches
  • Slide 3
  • Why recommender systems: The Internet computing era brings information overload. Low precision: the retrieved information is not what you need. Low recall: the relevant information is not exhaustively returned. Example: (figure not captured in the transcript)
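
The two quality measures named on this slide can be made concrete with a minimal sketch. The item identifiers and the helper name `precision_recall` are illustrative assumptions, not part of the original deck.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved list against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of retrieved items that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 4 items returned, 2 of them relevant, 5 relevant items exist.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "x", "y", "z"])
print(p, r)
```

Low precision means many of the returned items are unwanted; low recall means many wanted items were never returned.
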
  • Slide 4
  • Why recommender systems: No personalization. Different users are returned the same search results, so personalization or recommendation is needed.
  • Slide 5
  • Why Recommender Systems: GroupLens: An open architecture for collaborative filtering of netnews. Resnick, P.; Iacovou, N.; Suchak, M.; Bergstrom, P.; Riedl, J., 1994 ACM Conference on Computer Supported
  • Slide 7
  • Why Recommender Systems: Recommender systems (also called recommendation systems or recommendation engines) are an information filtering technique that recommends information items (films, television, video on demand, music, books, news, images, web pages, etc.) that users are likely to be interested in. Approaches: content-based approach, collaborative filtering approach.
  • Slide 8
  • Traditional Recommendation Methodology (Categories: Methodology). Content-based: TF-IDF; Bayesian classifiers; ML (clustering, decision trees, artificial neural networks). Collaborative recommendation: memory-based; model-based (K-means clustering, Bayesian model, probabilistic relational model, linear regression). Hybrid: adding content-based characteristics to collaborative models; adding collaborative characteristics to content-based models; a single unifying recommendation model.
  • Slide 9
  • Recommender System Categories: Content-based recommendations (recommend items similar to those the user preferred); Collaborative recommendations (recommend items preferred by users with similar taste); Hybrid approaches (content-based and collaborative combined).
  • Slide 10
  • Example for Content-based approach: Consider a recommendation scenario. Page 1: Department of Computer Science at Aalborg University. Page 2: Department of Health Science and Technology. Search query: Computer Science. R1 = {Department, Computer Science, Aalborg University}; R2 = {Department, Health Science, Technology}. Use TF-IDF (term frequency, inverse document frequency). Result: R1
  • Slide 11
  • Example for Content-based approach: Page 1: Department of Computer Science at Aalborg University. Page 2: Department of Health Science and Technology. R1 = {Department, Computer Science, Aalborg University}; R2 = {Department, Health Science, Technology}. Query: Computer Science. TF-IDF (term frequency, inverse document frequency). Result: R1
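
The TF-IDF ranking in this example can be sketched as follows. The term sets R1 and R2 and the query come from the slide; the specific weighting (relative term frequency times log(N/df)) and the function names are assumptions, since the slide does not say which TF-IDF variant is meant.

```python
import math

# Term sets taken from the slide; phrases are treated as single terms.
docs = {
    "R1": ["Department", "Computer Science", "Aalborg University"],
    "R2": ["Department", "Health Science", "Technology"],
}


def tf_idf(term, doc_terms, all_docs):
    """Relative term frequency times log inverse document frequency (assumed variant)."""
    tf = doc_terms.count(term) / len(doc_terms)
    df = sum(1 for terms in all_docs.values() if term in terms)
    idf = math.log(len(all_docs) / df) if df else 0.0
    return tf * idf


def score(query_terms, doc_terms, all_docs):
    """Score a document as the sum of TF-IDF weights of the query terms."""
    return sum(tf_idf(t, doc_terms, all_docs) for t in query_terms)


query = ["Computer Science"]
scores = {name: score(query, terms, docs) for name, terms in docs.items()}
best = max(scores, key=scores.get)
print(best)  # R1
```

"Department" occurs in both pages, so its inverse document frequency is zero and it cannot separate them; "Computer Science" occurs only in R1, which is why R1 wins.
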
  • Slide 12
  • Principle of Collaborative filtering: Two kinds of approaches. User-based (also called memory-based): select the K most similar users (KNN). Item-based (also called model-based): select the closest item set.
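
A minimal sketch of the user-based (KNN) variant, assuming cosine similarity and a similarity-weighted average of the neighbours' ratings. The rating data, user names, and the choice of similarity measure are illustrative assumptions; the slide does not fix them.

```python
import math

# Hypothetical user -> {item: rating} data.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 5, "m2": 3, "m3": 4, "m4": 5},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(user, item, k=2):
    """Predict a rating from the k most similar users who rated the item."""
    neighbours = sorted(
        (
            (cosine(ratings[user], other), other[item])
            for name, other in ratings.items()
            if name != user and item in other
        ),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbours)
    if not total:
        return None  # no usable neighbour: a cold-start case
    return sum(sim * rating for sim, rating in neighbours) / total


print(predict("alice", "m4"))
```

Here bob's ratings match alice's closely, so his rating of m4 dominates the prediction.
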
  • Slide 13
  • Example for Collaborative Filtering (Example 2, Amazon.com): The algorithm generates recommendations of the form "customers who bought this book also bought other books" (that is, from customers with preferences similar to the user's).
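
The "also bought" idea can be illustrated with a simple co-occurrence count over purchase baskets. This is only a sketch with hypothetical data; it is not a description of how Amazon.com actually computes its recommendations.

```python
from collections import Counter

# Hypothetical purchase baskets, one set of items per customer.
baskets = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B", "D"}]


def also_bought(item, baskets, n=2):
    """Count items that co-occur with `item` and return the n most frequent."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            counts.update(basket - {item})
    return counts.most_common(n)


print(also_bought("D", baskets))  # [('B', 1)]
```
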
  • Slide 14
  • Recommendations
  • Slide 15
  • Similarities and Differences: Similarity in content-based recommendations is computed over vectors of TF-IDF weights; similarity in collaborative filtering recommendations is computed over vectors of the actual user-specified ratings.
  • Slide 16
  • Limitations (Categories: Limitations). Content-based (by keywords): limited content analysis; over-specialization; new user problem. Collaborative recommendation: new user problem (cold start); new item problem (cold start); sparsity problem. Hybrid: N/A
  • Slide 17
  • Some Extending Capabilities (1/2): Comprehensive understanding of users and items. Extensions for model-based recommendation techniques. Multidimensionality of recommendations: extend the 2-dimensional space (User, Item) to a multi-dimensional one (User, Item, other1, other2), e.g., User, Movie, Time, Place.
  • Slide 18
  • Some Extending Capabilities (2/2): Multi-criteria ratings, e.g., restaurant (food, décor, service, price). Non-intrusiveness. Flexibility (users' flexibility). Effectiveness of recommendations (related metrics).
  • Slide 19
  • Insights into recommender systems: a close look at recommender systems from different perspectives
  • Slide 20
  • What to do with data (implementation): Two kinds of problems with data. Information retrieval (IR): static content, dynamic query -> modeling content (organized with an index). Information filtering (IF): dynamic content, static query -> modeling the query (organized as filters). Recommendation lies between IR and IF, since the content varies slowly and the queries depend on few parameters. Methods from both IR and IF are therefore used to reduce computation at query time.
  • Slide 21
  • General purpose: Top-k filtering: list of "best" items (main usage) or anti-spam. Item correlation: find similar items. Prediction of rating: predict the rating for any pair of a user and an item (more general).
  • Slide 22
  • Degree of personalization: Generic: everyone receives the same recommendations. Demographic: everyone in the same category receives the same recommendations. Contextual: the recommendation depends only on the current activity. Persistent: the recommendation depends on long-term interests.
  • Slide 23
  • What the data can be: Context of the current page (current request, item currently explored, and structured content about this context). History of the current user on the system (explicit or implicit ratings). History of all users on the system. History of the current user on multiple systems, the whole web, or even his or her own computer. History of all users on multiple systems, the whole web, or even their computers.
  • Slide 24
  • How to design a recommender system: Explicit data: rating data (rate a film on Netflix, Like or Dislike on YouTube). Implicit data: logs (users' activities, i.e., implicit feedback). The recommender system is built on users' data.
  • Slide 25
  • Emergence of New Recommendation Approaches: Collaborative filtering (social recommender), compared with the traditional content-based approach. Recommendation from friends: daily recommendations from friends; news feeds, Facebook, re-tweets. Recommendation over social media (blogs, YouTube). Recommendation using social data: social networks, social tagging.
  • Slide 26
  • Multi-Relational Social Data http://www.dasfa.net/wiki/index.php?title=Image:Metafac.png Node: facet Hyper-edge: relationship We are in a big social network.
  • Slide 27
  • Recommendation from friends: Facebook
  • Slide 28
  • Social recommendation by social media
  • Slide 29
  • Social relationship is powerful: G. Groh et al., Recommendations in taste related domains, GROUP'07, November 4-7, 2007, Sanibel Island, Florida, USA. The social filtering (SF) approach outperforms the collaborative filtering (CF) approach in the experiments (SF vs. CF).
  • Slide 30
  • Recommender System Overview. Input: user-item ratings; social relations; social tagging; query; time; location. Algorithms: User-Item KNN; clustering-based; graph-based; matrix factorization; information diffusion; probabilistic model. Output: information items; tags; merchandise/ads; persons; communities.
  • Slide 32
  • Tags are personal annotations that users (User1 to User5) attach to resources. A tag serves as: metadata; an index; a user's personal expression of opinion; an implicit rating or vote on the tagged information resources or items.
  • Slide 33
  • Tagging Types: Self-tagging: users can only tag their own contributions. Permission-based: users decide who can tag their resources. Free-for-all: any user can tag any resource.
  • Slide 34
  • Tagging support: Blind tagging: users cannot see the other tags assigned to the resource they're tagging. Viewable tagging: users can see the other tags assigned to the resource they're tagging. Suggestive tagging: users see suggested tags for the resource they're tagging.
  • Slide 35
  • Aggregation of Tags: Bag-model: the same tag can be assigned to a resource multiple times (Delicious). Set-model: a tag can be applied only once to a resource (Flickr).
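
The difference between the two aggregation models can be shown with standard Python containers. The tag assignments below are hypothetical and describe a single resource.

```python
from collections import Counter

# Hypothetical (user, tag) assignments for one resource.
assignments = [
    ("u1", "python"),
    ("u2", "python"),
    ("u3", "web"),
    ("u1", "web"),
    ("u4", "python"),
]

# Bag-model: keep how often each tag was assigned to the resource.
tag_bag = Counter(tag for _, tag in assignments)

# Set-model: each tag is recorded at most once for the resource.
tag_set = set(tag for _, tag in assignments)

print(tag_bag["python"], tag_bag["web"])  # 3 2
print(sorted(tag_set))  # ['python', 'web']
```

The bag-model keeps tag frequencies, which can later serve as weights; the set-model discards them.
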
  • Slide 36
  • Tag Temporal Behavior over time: Tag convergence: the tags assigned to a certain Web resource tend to stabilize and become the majority. Tag divergence: tag sets don't converge to a smaller group of more stable tags, and the tag distribution continually changes. Tag periodicity: tags evolve and decay with time.
  • Slide 38
  • Tag-based RS (tag-based recommender system): users annotate resources with tags. Example tag sets from the figure: {t1, t2, t3}, {t7, t2, t5}, {t1, t2, t3}, {t1, t8, t7}, {t1, t8, t9}, {t1, t8, t7}.
  • Slide 39
  • Extension of User-Item (Tso-Sutter et al. 2008): user tags are viewed as items and item tags are viewed as users, reducing the three-way user-item-tag relation to two-way ones.
  • Slide 40
  • Folksonomy model. Definition: A folksonomy is a quadruple $F := (U, T, R, Y)$, where $U$, $T$, $R$ are finite sets of users, tags, and resources, and $Y$ defines a relation, the tag assignment, between these sets, that is, $Y \subseteq U \times T \times R$. Converting the folksonomy into an undirected graph: first we convert the folksonomy $F = (U, T, R, Y)$ into an undirected tripartite graph $G_F = (V, E)$ as follows. $V = U \,\dot\cup\, T \,\dot\cup\, R$ and $E = \{\{u, t\}, \{t, r\}, \{u, r\} \mid (u, t, r) \in Y\}$, with each edge $\{u, t\}$ weighted with $|\{r \in R : (u, t, r) \in Y\}|$, each edge $\{t, r\}$ with $|\{u \in U : (u, t, r) \in Y\}|$, and each edge $\{u, r\}$ with $|\{t \in T : (u, t, r) \in Y\}|$. Employ: adapted PageRank algorithm.
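
The conversion defined on this slide can be sketched directly from the definition. The tag assignments in `Y` are hypothetical, and tagging each vertex with its kind ("user", "tag", "resource") is an implementation assumption that keeps the three vertex sets disjoint.

```python
from collections import Counter

# Hypothetical tag assignments Y: a set of distinct (user, tag, resource) triples.
Y = {
    ("u1", "t1", "r1"),
    ("u1", "t1", "r2"),
    ("u2", "t1", "r1"),
    ("u2", "t2", "r1"),
}


def folksonomy_graph(assignments):
    """Build the undirected tripartite graph (V, E) with the edge weights of the definition.

    Because the triples are distinct, counting the triples behind an edge {u, t}
    equals counting the distinct resources r with (u, t, r) in Y, and likewise
    for the other two edge kinds.
    """
    weights = Counter()
    for u, t, r in assignments:
        user, tag, res = ("user", u), ("tag", t), ("resource", r)
        weights[frozenset((user, tag))] += 1
        weights[frozenset((tag, res))] += 1
        weights[frozenset((user, res))] += 1
    vertices = {node for edge in weights for node in edge}
    return vertices, weights


V, E = folksonomy_graph(Y)
print(len(V), len(E))  # 6 9
```
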
  • Slide 41
  • FolkRank (Hotho et al., ESWC 2006). PageRank: a page is important if there are many pages linking to it and if those pages are important themselves. FolkRank: a resource which is tagged with important tags by important users becomes important itself (the same holds, symmetrically, for tags and users). Difference: the folksonomy graph used by FolkRank is undirected, whereas PageRank operates on directed graphs. Use: recommend a set of related users and resources for a given tag.
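
A minimal sketch of the weight-spreading idea, under stated assumptions: ranks are spread over an undirected weighted graph with a preference vector, and the FolkRank score is taken as the difference between a run with a boosted preference and a run with a uniform one. The damping value, the size of the boost, the iteration count, and the toy graph are illustrative choices, not parameters from the cited paper.

```python
def adapted_pagerank(edges, preference, d=0.7, iterations=50):
    """Spread rank over an undirected weighted graph with a preference vector."""
    neighbours = {}
    for (a, b), w in edges.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    nodes = list(neighbours)
    total = sum(preference.get(n, 0.0) for n in nodes)
    pref = {n: preference.get(n, 0.0) / total for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - d) * pref[n] for n in nodes}
        for a in nodes:
            out = sum(neighbours[a].values())
            for b, w in neighbours[a].items():
                # Each node passes its rank to neighbours in proportion to edge weight.
                new[b] += d * rank[a] * w / out
        rank = new
    return rank


def folkrank(edges, boosted, d=0.7):
    """Difference between a preference-boosted ranking and the uniform baseline."""
    nodes = {n for edge in edges for n in edge}
    uniform = {n: 1.0 for n in nodes}
    biased = dict(uniform)
    biased[boosted] += len(nodes)  # assumed boost for the query node
    base = adapted_pagerank(edges, uniform, d)
    with_pref = adapted_pagerank(edges, biased, d)
    return {n: with_pref[n] - base[n] for n in nodes}


# Hypothetical folksonomy graph: two user-tag-resource triangles joined by one edge.
edges = {
    ("u1", "t1"): 1, ("t1", "r1"): 1, ("u1", "r1"): 1,
    ("u2", "t2"): 1, ("t2", "r2"): 1, ("u2", "r2"): 1,
    ("r1", "u2"): 1,
}
scores = folkrank(edges, "t1")
print(sorted(scores, key=scores.get, reverse=True))
```

Subtracting the baseline removes the advantage of nodes that are merely well connected, so users and resources close to the boosted tag stand out.
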
  • Slide 42
  • Difference highlights: Documents that are of potential interest to a user can be suggested to him or her. When a certain tag is used, other related tags can be suggested. FolkRank additionally considers the tagging behavior of other users. Other users who work on related topics can be made explicit, thus improving knowledge transfer within organizations and fostering the formation of communities.
  • Slide 43
  • Tensor Factorization (Symeonidis et al. 2008; Rendle et al. 2009)
  • Slide 44
  • Tensor factorization: HOSVD (Symeonidis et al., TKDE 2010). Basic idea: optimize the square loss. Other optimization measures, e.g., AUC (area under the curve) (Rendle et al., SIGKDD 2009).
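
The formula that followed "square loss" on the original slide did not survive in the transcript. As a hedged illustration only, a generic squared-loss objective for a Tucker-style (HOSVD-like) factorization of the user-tag-resource tensor can be written as below. All symbols are assumptions introduced here and need not match the slide or the cited papers.

```latex
% Assumed notation (not from the slide):
%   a_{u,t,r} = 1 if user u assigned tag t to resource r, and 0 otherwise;
%   \mathcal{C} is a core tensor; \mathbf{U}, \mathbf{T}, \mathbf{R} are factor
%   matrices for users, tags, and resources.
\hat{a}_{u,t,r}
  = \sum_{\tilde{u}} \sum_{\tilde{t}} \sum_{\tilde{r}}
    c_{\tilde{u},\tilde{t},\tilde{r}} \,
    \mathbf{U}_{u,\tilde{u}} \, \mathbf{T}_{t,\tilde{t}} \, \mathbf{R}_{r,\tilde{r}},
\qquad
\min_{\mathcal{C},\, \mathbf{U},\, \mathbf{T},\, \mathbf{R}}
  \sum_{u,t,r} \bigl( a_{u,t,r} - \hat{a}_{u,t,r} \bigr)^{2}
```

A ranking-oriented criterion such as AUC replaces this element-wise error with a comparison of the scores of observed and unobserved tags.
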
  • Slide 45
  • Others: The GroupMe! system (Abel et al. 2007). PLSA (Probabilistic Latent Semantic Analysis) (Wetzker et al. 2009). Tag-based profile construction: naive (Szomszor et al. 2007), co-occurrence (Michlmayr and Cayzer 2007), and adaptation approach (Dorigo and Caro 1999). WebDCC (Web Document Conceptual Clustering) (Godoy and Amandi 2006). Music recommendation system (Uitdenbogerd and van Schyndel 2002).
  • Slide 46
  • The limitations: Tags have little semantics and many variations. The correlation between sets of tags. Uncontrolled vocabulary: users tag in their own ways. Redundancy and ambiguity in the tag database. Tags do not describe the document, but a judgment. Tags in languages other than English, e.g., Vienna vs. Wien.
  • Slide 47
  • Data Quality: How to manage the cold start problem (new user, new item) or, more generally, data sparsity? The system must have a special behavior for users with few ratings (e.g., non-personalized recommendation). The system may use bot users to rate new items according to their content.
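
The "special behavior for users with few ratings" can be sketched as a popularity fallback. The threshold `min_ratings`, the `personalised` callback, and the data are hypothetical placeholders for whatever model and policy a real system uses.

```python
from collections import Counter


def recommend(user, ratings, personalised, min_ratings=3, n=2):
    """Use the personalised model only when the user has enough ratings."""
    seen = ratings.get(user, {})
    if len(seen) >= min_ratings:
        return personalised(user)[:n]
    # Cold start: fall back to the most frequently rated items not yet seen.
    popularity = Counter(item for rated in ratings.values() for item in rated)
    return [item for item, _ in popularity.most_common() if item not in seen][:n]


# Hypothetical data: "new" has no ratings yet.
ratings = {
    "a": {"x": 5, "y": 4, "z": 3},
    "b": {"x": 4, "y": 2},
    "c": {"x": 1},
    "new": {},
}


def toy_model(user):
    """Stand-in for a real personalised recommender."""
    return ["p", "q"]


print(recommend("new", ratings, toy_model))  # ['x', 'y']
print(recommend("a", ratings, toy_model))  # ['p', 'q']
```
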
  • Slide 48
  • Confidence and display: How to improve confidence in the recommender system? By providing good recommendations! By providing information about each recommendation (e.g., ratings, explanations). How to display recommendations? The recommended item must be easy for the user to identify and evaluate. Ratings must be easy to understand and meaningful. Explanations must provide a quick way for the user to evaluate the recommendation.
  • Slide 49
  • Interaction (1/2): How to interact with the user? You may ask the user to correct a prediction; you must then update your rating matrix with this correction and update your recommendations accordingly. You may want to learn the key parameters of your algorithm from the feedback. You may ask the user to provide feedback on the explanation. You may ask the user to provide more context for the current task (e.g., by using categories).
  • Slide 50
  • Interaction (2/2): How to manage scalability? Applications usually need real-time prediction computation. The computation time has to scale with the number of users and items. How to manage temporal changes? You cannot rerun your algorithms each time a modification occurs. The off-line computation must be robust to small modifications and scheduled accordingly. The on-line computation must benefit from modifications. The computation must be done incrementally when possible. The system may "forget" older information.
  • Slide 51
  • Data and security (1/2): How to ensure privacy? If the profile is public, there are no privacy issues. If the profile is private, the system should avoid revealing too much information, using anonymity techniques. This problem is even worse across systems.
  • Slide 52
  • Data and security (2/2): How to design algorithms that are robust against manipulation? Attacks are characterized by the number of false users and the attacker's knowledge of the system. The attacker wants to modify the distribution of the ratings without being easy to detect. There are many known attacks, such as the sampling attack, random attack, average attack, bandwagon attack... There are many techniques to detect attacks: find profiles that are unlikely according to the global distribution of profiles, find profile updates that are unlikely according to the global distribution of updates...
  • Slide 53
  • Improvement (1/4): How to manage diversity? Recommending very similar items could be counter-productive (since they may be substitutes) -> systems can use correlation between items (e.g., based on content) to filter items. Recommending what everybody likes and what the user already knows is not really interesting -> systems can try riskier predictions (e.g., high score with low confidence).
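
One way to filter near-substitutes by content correlation is a greedy pass over the ranked list that skips any item too similar to one already chosen. The Jaccard measure, the threshold `max_sim`, and the item features below are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def diversify(ranked, features, max_sim=0.5, n=3):
    """Walk the ranked list and keep items that are not too similar to kept ones."""
    chosen = []
    for item in ranked:
        if all(jaccard(features[item], features[kept]) <= max_sim for kept in chosen):
            chosen.append(item)
        if len(chosen) == n:
            break
    return chosen


# Hypothetical items with content features, already sorted by predicted score.
features = {
    "phone_a": {"phone", "android", "5g"},
    "phone_b": {"phone", "android", "5g", "dual-sim"},
    "case": {"accessory", "phone"},
    "book": {"novel"},
}
ranked = ["phone_a", "phone_b", "case", "book"]
print(diversify(ranked, features))  # ['phone_a', 'case', 'book']
```

The second phone is dropped because it shares three of four features with the first, which makes it a likely substitute.
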
  • Slide 54
  • Improvement (2/4): How to use social networks to improve recommendations? Users are likely to like what their friends like. Exploring the social graph is a direct way to make recommendations. Correlation between users could be biased by the social graph. Potential friends could be suggested using recommendation techniques.
  • Slide 55
  • Improvement (3/4): How to recommend for a group? The recommendation for the group could be an aggregation of the recommendations for its members. The group could be seen as a single user (with aggregation functions to reconcile ratings).
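
The aggregation idea can be sketched with two common strategies, averaging and "least misery" (taking the minimum). The predicted ratings and member names are hypothetical, and neither strategy is claimed to be the one the author uses.

```python
def average(values):
    """Mean of the members' predicted ratings."""
    return sum(values) / len(values)


def group_scores(predictions, members, aggregate):
    """Aggregate per-member predicted ratings into one score per shared item."""
    shared = set.intersection(*(set(predictions[m]) for m in members))
    return {item: aggregate([predictions[m][item] for m in members]) for item in shared}


# Hypothetical predicted ratings for two group members.
predictions = {
    "ann": {"film1": 5, "film2": 3},
    "ben": {"film1": 1, "film2": 3},
}
by_average = group_scores(predictions, ["ann", "ben"], average)
by_least_misery = group_scores(predictions, ["ann", "ben"], min)
print(by_average)
print(by_least_misery)
```

Averaging rates both films equally, while least misery prefers film2 because no member dislikes it. The choice of aggregation function is a design decision.
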
  • Slide 56
  • Improvement (4/4): How to evaluate recommendations? There is a lot of noise in the data, which could be the main source of errors. It is more difficult to evaluate when there are no ratings. It is even more difficult if you want to improve recommendations by adding constraints such as diversity.
  • Slide 57
  • Possible further research: Incorporating relevance feedback into recommendation. Using tag clouds to improve user experience. Examining quantitative aspects of folksonomies and the use of tags. Examining user behavior based on implicit feedback. Tag ambiguity and sparsity. Tag uniformity.
  • Slide 58
  • Conclusion: The rationale of recommender systems. The state-of-the-art progress of recommender systems. Social tagging systems demand new recommender systems. Techniques in tag-based recommender systems. Some open research questions and possible further directions.