Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2017

Recommender Systems: Content-based, Knowledge-based, Hybrid xpelanek/PV254/slides/other-techniques.pdf · Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek

Sep 12, 2018

Page 1

Recommender Systems: Content-based, Knowledge-based, Hybrid

Radek Pelánek

2017

Page 2

Today

lecture, basic principles:

content-based

knowledge-based

hybrid, choice of approach, ...

critiquing, explanations, ...

discussion – projects

brief presentation of your projects

application of covered notions to projects

⇒ make notes during lecture

Page 3

Content-based vs Collaborative Filtering

collaborative filtering: “recommend items that similar users liked”

content-based: “recommend items that are similar to those the user liked in the past”

Page 4

Content-based Recommendations

we need explicit (cf latent factors in CF):

information about items (e.g., genre, author)

user profile (preferences)

Recommender Systems: An Introduction (slides)

Page 5

Architecture of a Content-Based Recommender

Handbook of Recommender Systems

Page 6

Content

Recommender Systems: An Introduction (slides)

Page 7

Content: Multimedia

manual annotation

songs, hundreds of features

Pandora, http://www.pandora.com

Music Genome Project

experts, 20-30 minutes per song

automatic techniques – signal processing

Page 8

User Profile

explicitly specified by user

automatically learned

easier than in CF – features of items are now available

Page 9

Similarity: Keywords

general similarity approach based on keywords

two sets of keywords A, B (description of two items or description of item and user)

how to measure similarity of A and B?

Page 10

Similarity: Keywords

sets of keywords A, B

Dice coefficient: $\frac{2 \cdot |A \cap B|}{|A| + |B|}$

Jaccard coefficient: $\frac{|A \cap B|}{|A \cup B|}$

many other coefficients available, see e.g. “A Survey of Binary Similarity and Distance Metrics”
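The two coefficients on this slide can be sketched with Python sets. This is only an illustration: the keyword sets are made up, and returning 0.0 for two empty sets is an assumption, since the slide does not define that case.

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # assumption: two empty keyword sets count as dissimilar
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 0.0  # same assumption as above
    return len(a & b) / len(union)


# Hypothetical keyword sets: one describes an item, the other a user profile.
item = {"fantasy", "dragons", "magic", "adventure"}
user = {"fantasy", "magic", "romance"}

print(dice(item, user))     # 2 * 2 / (4 + 3)
print(jaccard(item, user))  # 2 / 5
```

Both measures ignore how often a keyword occurs, which is one motivation for the TF-IDF weighting discussed next.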

Page 11

Term Frequency – Inverse Document Frequency

keywords (particularly automatically extracted) – disadvantages:

importance of words (“course” vs “recommender”)

length of documents

TF-IDF – standard technique in information retrieval

Term Frequency – how often term appears in a particular document (normalized)

Inverse Document Frequency – how often term appears in all documents

Page 12

Term Frequency – Inverse Document Frequency

keyword (term) t, document d

$TF(t, d) = \frac{\text{frequency of } t \text{ in } d}{\text{maximal frequency of a term in } d}$

$IDF(t) = \log(N / n_t)$

N – number of all documents

n_t – number of documents containing t

$TFIDF(t, d) = TF(t, d) \cdot IDF(t)$
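A minimal sketch of these definitions in Python, assuming documents are already tokenized into lists of words. The three tiny documents are invented for illustration, and the natural logarithm is an assumption, since the slide does not fix the base.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: TF-IDF weight} dict per document.

    TF(t, d)  = frequency of t in d / maximal frequency of a term in d
    IDF(t)    = log(N / n_t)
    """
    n_docs = len(docs)
    # n_t: number of documents that contain term t at least once
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        max_count = max(counts.values(), default=1)
        vectors.append({
            term: (count / max_count) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical tokenized documents.
docs = [
    "recommender course recommender systems".split(),
    "course on databases".split(),
    "course on machine learning".split(),
]
vectors = tf_idf(docs)
print(vectors[0])
```

In this toy corpus "course" appears in every document, so its IDF is log(3/3) = 0 and it drops out, while "recommender" keeps a high weight in the first document.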

Page 13

Similarity

similarity between user and item profiles (or two item profiles):

vector of keywords and their TF-IDF values

cosine similarity – angle between vectors

$sim(\vec{a}, \vec{b}) = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}| \, |\vec{b}|}$

(adjusted) cosine similarity

normalization by subtracting average values

closely related to Pearson correlation coefficient
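The cosine formula can be sketched for sparse keyword vectors stored as dictionaries. The profiles below are hypothetical, and returning 0.0 for a zero vector is an assumption (the formula is undefined there).

```python
import math


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity of two sparse vectors given as {keyword: weight}."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: similarity with an empty profile is 0
    return dot / (norm_a * norm_b)


# Hypothetical TF-IDF-style profiles.
user_profile = {"recommender": 1.0, "systems": 1.0}
item_profile = {"recommender": 1.0, "course": 1.0}

print(cosine(user_profile, item_profile))  # one shared keyword out of two each
```

Because the result depends only on the angle between the vectors, a long and a short document with the same keyword proportions get similarity 1.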

Page 14

Recommendations by Nearest Neighbors

k-nearest neighbors (kNN)

predicting rating for not-yet-seen item i:

find k most similar items, already rated

predict rating based on these

good for modeling short-term interest, “follow-up” stories

more complex methods available, e.g., Rocchio’s relevance feedback method (interactivity)
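The kNN steps above might look as follows. The slide only says the prediction is "based on" the neighbors; the similarity-weighted average used here is one common choice and an assumption, as are the Jaccard similarity on keyword sets and the example data.

```python
def jaccard(a: set, b: set) -> float:
    """Keyword-set similarity used to compare items (assumed choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_rating(target, rated, k=2):
    """Predict a rating for `target` (a keyword set).

    rated: list of (keyword_set, rating) pairs the user has already rated.
    Returns the similarity-weighted mean rating of the k most similar
    rated items, or None if no neighbor has positive similarity.
    """
    scored = [(jaccard(target, keywords), rating) for keywords, rating in rated]
    neighbors = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in neighbors) / total


# Hypothetical news articles already rated by the user.
rated = [
    ({"soccer", "league"}, 5),
    ({"soccer", "cup", "final"}, 4),
    ({"politics", "election"}, 1),
]
prediction = predict_rating({"soccer", "world", "cup"}, rated, k=2)
print(prediction)  # (0.5 * 4 + 0.25 * 5) / 0.75
```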

Page 15

Improvements

all words – long, sparse vectors

common words, stop words (e.g., “a”, “the”, “on”)

stemming (e.g., “went” → “go”, “university” → “univers”)

cut-offs (e.g., n most informative words)

phrases (e.g., “United Nations”, “New York”)

wider context: natural language processing techniques
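Two of these improvements, stop-word removal and a cut-off, can be sketched in a few lines. The stop-word list is a tiny illustrative one, and raw frequency stands in for "most informative" here as a simplifying assumption; in practice TF-IDF weights would be a more natural ranking. Stemming and phrase detection are left out because they need a dedicated NLP library.

```python
from collections import Counter

# Tiny illustrative stop-word list (assumption, not a standard list).
STOP_WORDS = {"a", "the", "on", "of", "and", "in", "is"}


def keywords(text, n=3):
    """Return the n most frequent non-stop-words of `text`."""
    tokens = [word.strip(".,!?\"'").lower() for word in text.split()]
    tokens = [word for word in tokens if word and word not in STOP_WORDS]
    return [word for word, _ in Counter(tokens).most_common(n)]


print(keywords("The cat sat on the mat. The cat slept.", n=1))
```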

Page 16

Limitations

semantic meaning unknown

example – use of words in negative context

steakhouse description: “there is nothing on the menu that a vegetarian would like...” ⇒ keyword “vegetarian” ⇒ recommended to vegetarians

Page 17

Ontologies, Taxonomies, Folksonomies

ontology – formal definition of entities and their relations

taxonomy – tree, hierarchy (example: news, sport, soccer, soccer world cup)

folksonomy (folk + taxonomy) – collaborative tagging, tag clouds

Page 18

Recommendation as Classification

classification problem: features → like/dislike (rating)

use of general machine learning techniques

probabilistic methods – Naive Bayes

linear classifiers

decision trees

neural networks

...

wider context: machine learning techniques
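As one instance of the "features → like/dislike" view, here is a small hand-written Naive Bayes classifier over binary item features. It is a sketch under stated assumptions: the Bernoulli variant with Laplace smoothing is chosen for brevity, and the movie features and labels are invented.

```python
import math
from collections import defaultdict


def train_naive_bayes(examples, vocabulary):
    """examples: list of (feature_set, label). Returns (priors, likelihoods)."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in examples:
        label_counts[label] += 1
        for feature in features:
            feature_counts[label][feature] += 1
    total = len(examples)
    priors = {label: count / total for label, count in label_counts.items()}
    # Laplace smoothing for a binary feature: (count + 1) / (n_label + 2)
    likelihoods = {
        label: {
            f: (feature_counts[label][f] + 1) / (label_counts[label] + 2)
            for f in vocabulary
        }
        for label in label_counts
    }
    return priors, likelihoods


def classify(features, priors, likelihoods):
    """Return the label with the highest log-posterior for `features`."""
    scores = {}
    for label, prior in priors.items():
        score = math.log(prior)
        for f, p in likelihoods[label].items():
            score += math.log(p if f in features else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical rated movies described by genre keywords.
examples = [
    ({"comedy", "romance"}, "like"),
    ({"comedy"}, "like"),
    ({"horror"}, "dislike"),
    ({"horror", "thriller"}, "dislike"),
]
vocabulary = {"comedy", "romance", "horror", "thriller"}
priors, likelihoods = train_naive_bayes(examples, vocabulary)
print(classify({"comedy", "thriller"}, priors, likelihoods))
```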

Page 19

Content-Based Recommendations: Advantages

user independence – does not depend on other users

transparency – explanations, understandable

new items can be easily incorporated (no cold start)

Page 20

Content-Based Recommendations: Limitations

limited content analysis

content may not be automatically extractable (multimedia)

missing domain knowledge

keywords may not be sufficient

overspecialization – “more of the same”, too similar items

new user – ratings or information about user has to be collected

Page 21

Content-Based vs Collaborative Filtering

paper “Recommending new movies: even a few ratings are more valuable than metadata” (context: Netflix)

our experience in educational domain – difficulty rating (Sokoban, countries)

Page 22

Knowledge-based Recommendations

application domains:

expensive items, not frequently purchased, few ratings (car, house)

time span important (technological products)

explicit requirements of user (vacation)

collaborative filtering unusable – not enough data

content-based – “similarity” not sufficient

Page 23

Knowledge-based Recommendations

constraint-based

explicitly defined conditions

case-based

similarity to specified requirements

“conversational” recommendations

Page 24

Constraint-Based Recommendations – Example

Recommender Systems: An Introduction (slides)

Page 25

Constraint Satisfaction Problem

V is a set of variables

D is a set of finite domains of these variables

C is a set of constraints

Typical problems: logic puzzles (Sudoku, N-queen), scheduling

Page 26

CSP: N-queens

problem: place N queens on an N × N chessboard so that no two queens threaten each other

V – N variables (locations of queens)

D – each domain is {1, ..., N}

C – constraints: no two queens threaten each other

Page 27

CSP Algorithms

basic algorithm – backtracking

heuristics

preference for some branches

pruning

... many others
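Plain backtracking for the N-queens formulation can be sketched as below, with one variable per row whose value is the queen's column. The sketch uses columns 0..N-1 instead of the slide's 1..N and applies no branch-ordering heuristics; the only pruning is that a column is rejected as soon as it conflicts with an already placed queen.

```python
def solve_n_queens(n):
    """Return one solution as a list where cols[row] is the queen's column,
    or None if no placement exists."""
    cols = []  # partial assignment: one column per already filled row

    def threatens(row, col):
        # A placed queen at (r, c) attacks (row, col) if it shares the
        # column or a diagonal; rows are distinct by construction.
        return any(c == col or abs(c - col) == row - r
                   for r, c in enumerate(cols))

    def place(row):
        if row == n:
            return True  # all variables assigned
        for col in range(n):
            if not threatens(row, col):
                cols.append(col)
                if place(row + 1):
                    return True
                cols.pop()  # backtrack: undo and try the next column
        return False

    return cols if place(0) else None


print(solve_n_queens(4))
print(solve_n_queens(3))
```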

Page 28

CSP Example: N-queens Problem

Page 29

Recommender Knowledge Base

customer properties $V_C$

product properties $V_{PROD}$

constraints $C_R$ (on customer properties)

filter conditions $C_F$ – relationship between customer and product

products $C_{PROD}$ – possible instantiations
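To make the components concrete, here is a toy knowledge base in Python. Everything in it (the camera catalogue, the property names, the three filter conditions) is hypothetical and only illustrates how filter conditions link customer properties to product properties; constraints on customer properties alone are omitted.

```python
# Hypothetical catalogue: each dict instantiates the product properties.
PRODUCTS = [
    {"name": "cam_a", "price": 150, "megapixels": 8, "waterproof": False},
    {"name": "cam_b", "price": 280, "megapixels": 12, "waterproof": True},
    {"name": "cam_c", "price": 450, "megapixels": 20, "waterproof": True},
]

# Filter conditions: each relates customer properties to product properties.
FILTERS = [
    lambda customer, product: product["price"] <= customer["max_price"],
    lambda customer, product: product["waterproof"] or not customer["outdoor_use"],
    lambda customer, product: product["megapixels"] >= 10 or not customer["large_prints"],
]


def recommend(customer):
    """Return the names of all products that satisfy every filter condition."""
    return [
        product["name"]
        for product in PRODUCTS
        if all(condition(customer, product) for condition in FILTERS)
    ]


# Hypothetical customer properties, e.g. collected through a form.
customer = {"max_price": 300, "outdoor_use": True, "large_prints": True}
print(recommend(customer))
```

With a catalogue this small a linear scan is enough; a CSP solver becomes useful when products are configurable and the space of instantiations is large.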

Page 30

Recommender Systems Handbook; Developing Constraint-based Recommenders

Page 31

Recommender Systems Handbook; Developing Constraint-based Recommenders

Page 32

Development of Knowledge Bases

difficult, expensive

specialized graphical tools

methodology (rapid prototyping, detection of faulty constraints, ...)

Page 33

Unsatisfied Requirements

no solution to provided constraints

we want to provide user at least something

constraint relaxation

proposing “repairs”

minimal set of requirements to be changed
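Finding a minimal set of requirements to change can be sketched by brute force: try dropping 0, 1, 2, ... requirements and stop at the first size for which some product matches. This is an illustrative assumption rather than the method used in the cited literature, it is exponential in the number of requirements, and the products and requirement names are hypothetical.

```python
from itertools import combinations


def minimal_relaxations(requirements, products):
    """requirements: {name: predicate(product)}.

    Return all smallest sets of requirement names whose removal leaves
    at least one matching product.
    """
    names = list(requirements)
    for size in range(len(names) + 1):
        found = []
        for dropped in combinations(names, size):
            kept = [requirements[n] for n in names if n not in dropped]
            if any(all(req(p) for req in kept) for p in products):
                found.append(set(dropped))
        if found:
            return found  # smallest size reached, stop searching
    return []  # only possible when there are no products at all


# Hypothetical catalogue and user requirements.
products = [
    {"name": "cam_a", "price": 150, "megapixels": 8, "waterproof": False},
    {"name": "cam_b", "price": 280, "megapixels": 12, "waterproof": True},
]
requirements = {
    "cheap": lambda p: p["price"] <= 200,
    "waterproof": lambda p: p["waterproof"],
    "sharp": lambda p: p["megapixels"] >= 10,
}
print(minimal_relaxations(requirements, products))
```

The result can be presented to the user as a repair proposal, for example "no camera matches; relax the price limit to see results".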

Page 34

User Guidance

requirements elicitation process

session independent user profile (e.g., social networking sites)

static fill-out forms

conversational dialogs

Page 35

User Guidance

Recommender Systems Handbook; Developing Constraint-based Recommenders

Page 36

User Guidance

Recommender Systems Handbook; Developing Constraint-based Recommenders

Page 37

Critiquing

Recommender Systems: An Introduction (slides)

Page 38

Critiquing

Recommender Systems: An Introduction (slides)

Page 39

Critiquing: Example

A Visual Interface for Critiquing-based Recommender Systems

Page 40

Critiquing: Example

Critiquing-based recommenders: survey and emerging trends

Page 41

Critiquing: Example

Critiquing-based recommenders: survey and emerging trends

Page 42

Limitations

cost of knowledge acquisition (consider your project proposals)

accuracy of models

independence assumption for preferences

Page 43

Hybrid Methods

collaborative filtering: “what is popular among my peers”

content-based: “more of the same”

knowledge-based: “what fits my needs”

each has advantages and disadvantages

hybridization – combine more techniques, avoid some shortcomings

simple example: CF with content-based (or simple “popularity recommendation”) to overcome “cold start problem”
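The cold-start example can be sketched as a switching rule: use collaborative filtering when an item has enough ratings, otherwise fall back to a content-based prediction. The function name, the `min_ratings` threshold, and the stand-in predictors below are all hypothetical.

```python
def switching_hybrid(item, cf_predict, content_predict, rating_counts, min_ratings=5):
    """Pick the predictor depending on how much rating data the item has."""
    if rating_counts.get(item, 0) >= min_ratings:
        return cf_predict(item)      # enough ratings: trust collaborative filtering
    return content_predict(item)     # cold start: fall back to content-based


# Stand-in predictors returning fixed scores, for illustration only.
rating_counts = {"old_movie": 120}
cf_predict = lambda item: 4.2
content_predict = lambda item: 3.5

print(switching_hybrid("old_movie", cf_predict, content_predict, rating_counts))
print(switching_hybrid("new_movie", cf_predict, content_predict, rating_counts))
```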

Page 44

Hybridization Designs

monolithic design, combining different features

parallel use of several systems, weighting/voting

pipelined invocation of different systems
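The parallel design with weighting might be sketched as a weighted sum of per-system scores. The system names, weights, and scores are made up, and treating a missing score as 0 is an assumption.

```python
def weighted_hybrid(scores_by_system, weights):
    """Combine per-system item scores and return items ranked best first.

    scores_by_system: {system_name: {item: score}}
    weights:          {system_name: weight}
    An item a system did not score contributes 0 for that system.
    """
    combined = {}
    for system, scores in scores_by_system.items():
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weights[system] * score
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical outputs of two recommenders run in parallel.
scores = {
    "cf": {"a": 0.9, "b": 0.2},
    "content": {"a": 0.1, "b": 0.8, "c": 0.5},
}
weights = {"cf": 0.7, "content": 0.3}
print(weighted_hybrid(scores, weights))
```

A pipelined design would instead feed the output of one system (for example a knowledge-based filter) into the next one as its candidate set.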

Page 45

Types of Recommender Systems

non-personalized

demographic

collaborative filtering

content based

knowledge-based

hybrid

what to apply when?

Page 46

Taxonomy of Knowledge Sources

Matching Recommendation Technologies and Domains

Page 47

Knowledge Sources and Recommendation Types

Matching Recommendation Technologies and Domains

Page 48

Sample Domains for Recommendation

Matching Recommendation Technologies and Domains

Page 49

Explanations of Recommendations

recommendations: selection (ranked list) of items

explanations: (some) reasons for the choice

Page 50

Goals of Providing Explanations

Why explanations?

transparency, trustworthiness, validity, satisfaction (users are more likely to use the system)

persuasiveness (users are more likely to follow recommendations)

effectiveness, efficiency (users can make better/faster decisions)

education (users understand better the behaviour of the system, may use it in better ways)

Page 52

Examples of Explanations

knowledge-based recommenders

“Because you, as a customer, told us that simple handling of car is important to you, we included a special sensor system in our offer that will help you park your car easily.”

algorithms based on CSP representation

recommendations based on item-similarity

“Because you watched X we recommend Y”

Page 54

Explanations – Collaborative Filtering

Explaining Collaborative Filtering Recommendations, Herlocker, Konstan, Riedl

Page 55

Explanations – Collaborative Filtering

Explaining Collaborative Filtering Recommendations, Herlocker, Konstan, Riedl

Page 56

Explanations – Comparison

Explaining Collaborative Filtering Recommendations, Herlocker, Konstan, Riedl

Page 57

Moment of Recommendation

front page, dashboard

follow-up

sidebar

on demand

Page 58

Your Projects: Questions

What is the purpose / use case? What is the “business model”?

What type of recommendations?

A new system or extension of an existing one?

Where/how will you obtain data?

items

user preferences; explicit/implicit ratings?

Which techniques are relevant/suitable for your project? Collaborative filtering? Content-based? Knowledge-based? Combination?

Are the following notions relevant: taxonomy, critiquing, explanations?

Page 59

Projects

1 research project: slepemapy.cz data

2 board games

3 quotes

4 jokes

5 recipes I: allrecipes.com

6 recipes II

7 educational

8 travel

9 travel or beer