Politecnico di Milano Top-N recommendations on Unpopular Items with Contextual Knowledge Paolo Cremonesi Antonio Tripodi Roberto Turrin Politecnico di.

Post on 27-Mar-2015

Politecnico di Milano

Top-N recommendations on Unpopular Items

with Contextual Knowledge

Paolo Cremonesi, Antonio Tripodi, Roberto Turrin

Politecnico di Milano, ContentWise

Today's recommendations, based on your personal taste, are:

From this….

To this

iTV with personalization

Personalization: how it works

USER DATA

USER’S TASTE

FRUITIONS AND RATINGS

CONTENT METADATA

RECOMMENDER SYSTEM

CONTENT RECOMMENDATIONS

IPTV architecture

[Diagram: Customers, Service Provider, Network Provider, Content Provider; Head end, VOD, Set-top box (decoder); user u, item i]

Single domain

• Two ideas
  – One aII from Ricci
  – Closure (on UU and II)

• [Insert IPTV diagram]
• Recommender systems can be divided into two families
  – Content-Based (CB) Filtering
  – Collaborative Filtering (CF)
• CF algorithms are preferred
  – They do not rely on metadata (very difficult to obtain in the TV domain)
  – Quality has been shown to be better in terms of accuracy and serendipity, if the system has been trained with enough data

• CF algorithms can be classified into
  – Model-based (able to deal with new users)
  – Non-model-based (not able to deal with new users)

Top-N recommendations on Unpopular Items

with Contextual Knowledge

Paolo Cremonesi, Paolo Garza, Elisa Quintarelli, Roberto Turrin

Politecnico di Milano, ContentWise

Short version

Thanks for your attention

Q&A

For any further information, please contact

Paolo Cremonesi, paolo.cremonesi@polimi.it

Long version

Research objectives

• Focus
  – Top-N recommendation task
• Goal
  – Improving accuracy
  – Providing explanation
• Requirements
  – Modularity (algorithm-independent)
  – Fast on-line recommendations

Algorithms

• Non-personalized
  – TopPop
  – MovieAvg
• Collaborative
  – Neighborhood: CorNgbr, NNCosNgbr
  – Latent factors: AsySVD, PureSVD

TopPop and MovieAvg

TopPop recommends the top-N most popular items (i.e., the most-rated items), regardless of the user's preferences and taste.
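A popularity-based recommender of this kind can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `top_pop`, the `(user, item, rating)` triple layout, and the toy data are hypothetical and do not come from the talk.

```python
from collections import Counter

def top_pop(ratings, n=5, exclude=()):
    """Return the n items with the most ratings, ignoring who rated them."""
    counts = Counter(item for _user, item, _rating in ratings)
    # most_common() sorts by rating count, highest first.
    ranked = [item for item, _ in counts.most_common() if item not in exclude]
    return ranked[:n]

# Hypothetical toy data: (user, item, rating) triples.
ratings = [
    ("u1", "A", 5), ("u2", "A", 3), ("u3", "A", 4),
    ("u1", "B", 4), ("u2", "B", 2),
    ("u3", "C", 5),
]
top2 = top_pop(ratings, n=2)
```

The optional `exclude` argument is one possible way to drop items a user has already rated; every user otherwise receives the same list.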

TopPop

• Pirates of the Caribbean: The Curse of the Black Pearl
• Forrest Gump
• The Lord of the Rings: The Two Towers
• The Lord of the Rings: The Fellowship of the Ring
• The Sixth Sense

Collaborative - Neighborhood

They recommend items following the “who bought this also bought that” approach, as Amazon does.
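The "also bought" idea can be illustrated with a generic item-based neighborhood sketch using cosine similarity. It is not claimed to be the CorNgbr or NNCosNgbr algorithm from the talk; the function names and toy ratings are assumptions made for the example, and missing ratings are treated as zero.

```python
from collections import defaultdict
from math import sqrt

def item_cosine(ratings):
    """Cosine similarity between items, treating missing ratings as zero."""
    by_item = defaultdict(dict)
    for user, item, r in ratings:
        by_item[item][user] = r
    norms = {i: sqrt(sum(r * r for r in v.values())) for i, v in by_item.items()}
    sim = defaultdict(dict)
    for i, vi in by_item.items():
        for j, vj in by_item.items():
            if i == j:
                continue
            # Only users who rated both items contribute to the dot product.
            dot = sum(r * vj[u] for u, r in vi.items() if u in vj)
            if dot > 0:
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

def recommend(user, ratings, sim, n=5):
    """Score unseen items by similarity-weighted ratings of the user's items."""
    seen = {i: r for u, i, r in ratings if u == user}
    scores = defaultdict(float)
    for i, r in seen.items():
        for j, s in sim.get(i, {}).items():
            if j not in seen:
                scores[j] += s * r
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical toy data: (user, item, rating) triples.
ratings = [
    ("u1", "A", 5), ("u1", "B", 4),
    ("u2", "A", 4), ("u2", "B", 5), ("u2", "C", 1),
    ("u3", "A", 5),
]
sim = item_cosine(ratings)
recs = recommend("u3", ratings, sim, n=2)
```

Here user `u3` has only rated item `A`, so the ranking is driven entirely by which items co-occur with `A` in other users' profiles.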

Collaborative – Latent factors

They recommend items on the basis of an advanced representation of users and items in a low-dimensional feature space

Contextual recommendations

• Pre-filtering
  – L. Baltrunas, F. Ricci, RecSys'09
• Post-filtering
  – U. Panniello, A. Tuzhilin, M. Gorgoglione, ..., RecSys'09
• Contextual modeling
  – M. Domingue, A. Jorge, C. Soares, RecSys'09
  – C. Palmisano, A. Tuzhilin, M. Gorgoglione, IEEE Trans. Knowl. Data Eng., 2008

Association rules

• Data mining technique

• Uses a frequency-based approach to estimate the conditional probability of events

• Example: Forrest Gump and Nikita → Avatar

Association rules

• X → Y
• X = previously watched movie(s)
• Y = movie(s) the user will likely appreciate
• Quality of association rules:
  – Support: frequency of the rule
  – Confidence: conditional probability of Y given X
• Benefits (by definition)
  – best recommendations in terms of accuracy
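Support and confidence of a rule X → Y can be computed directly from counts. The sketch below assumes that support is an absolute count of transactions and that each transaction is a set of watched movies; the helper names and the toy histories are hypothetical.

```python
def support(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

def confidence(transactions, x, y):
    """Estimate P(Y | X) as support(X and Y together) / support(X)."""
    sx = support(transactions, x)
    return support(transactions, set(x) | set(y)) / sx if sx else 0.0

# Hypothetical viewing histories, one set of movies per user.
histories = [
    {"Forrest Gump", "Nikita", "Avatar"},
    {"Forrest Gump", "Nikita", "Avatar"},
    {"Forrest Gump", "Nikita"},
    {"Forrest Gump", "Avatar"},
]
x, y = {"Forrest Gump", "Nikita"}, {"Avatar"}
rule_support = support(histories, x | y)        # histories containing X and Y
rule_confidence = confidence(histories, x, y)   # share of X-histories that also contain Y
```

In this toy set, three histories contain both antecedent movies and two of them also contain the consequent, so the rule has support 2 and confidence 2/3.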

Association rules and RS

• Sarwar et al., Analysis of recommendation algorithms for e-commerce, EC 2000
• Computational requirements
  – theoretically we should test all the possible combinations of items in X and Y
• Portfolio effect
  – most rules find the same small set of consequents
  – recommendations are biased toward obvious items

Portfolio effect

• Which is the simplest and yet most effective association-rule-based recommender system?

• TopPop

Recall on Netflix

Algorithm    Recall at 10    Recall at 10 (long tail only)
PureSVD50    0.48            0.30
NNCosNgbr    0.45            0.28
AsySVD       0.30            0.30
TopPop       0.28            0.02
CorNgbr      0.15            0.35
MovieAvg     0.12            0.12

Long tail only: removed the most popular items, accounting for 33% of ratings
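The long-tail split and a recall-at-N measurement might be sketched as follows. The slides do not spell out the evaluation protocol, so this is a simplified hit-rate under stated assumptions: `short_head`, `recall_at_n`, the held-out `(user, item)` pairs, and the stand-in recommender are all hypothetical.

```python
from collections import Counter

def short_head(ratings, share=0.33):
    """Most popular items that together account for `share` of all ratings."""
    counts = Counter(item for _u, item, _r in ratings)
    head, covered = set(), 0
    for item, c in counts.most_common():
        if covered >= share * len(ratings):
            break
        head.add(item)
        covered += c
    return head

def recall_at_n(test_pairs, recommend, n=10):
    """Fraction of held-out (user, item) pairs whose item is in the top-n list."""
    if not test_pairs:
        return 0.0
    hits = sum(1 for user, item in test_pairs if item in recommend(user, n))
    return hits / len(test_pairs)

# Hypothetical toy data.
train = [("u1", "A", 5), ("u2", "A", 4), ("u3", "A", 5), ("u4", "A", 3),
         ("u1", "B", 4), ("u2", "B", 5), ("u3", "C", 5), ("u4", "D", 4)]
test = [("u5", "A"), ("u5", "C"), ("u6", "B")]

head = short_head(train)  # popular items to exclude from the long-tail test set
long_tail_test = [(u, i) for u, i in test if i not in head]

def toy_recommender(user, n):
    """Stand-in for a real algorithm: always returns the same list."""
    return ["A", "B"][:n]

overall = recall_at_n(test, toy_recommender, n=2)
tail_only = recall_at_n(long_tail_test, toy_recommender, n=2)
```

A recommender that leans on popular items scores well on the full test set and worse once the short head is removed, which is the pattern the TopPop row shows.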

Measured perceived quality

• Users’ judgments on Accuracy and Novelty

• Participants: 30 users for each of the 7 experimental conditions → 210 users overall

• Profile: 20-50 years old; male: 54%, female: 46%

Perceived relevance

Perceived novelty

Context recommender system

• Users - Items → Traditional Recommender System → Recommendations
• Users’ contexts - Items’ characteristics → Contextual Rule Mining → Contextual rules
• Recommendations + Contextual rules → Contextual Post-filtering → Contextual Recommendations

Experiments: rules mining

• Movielens: 1 M ratings
  – 1000 users
  – 1700 items
• Context
  – # age ranges = 7
  – # gender = M/F
• Movie features
  – # genres = 18
• Rules mined with FP-growth
• Min support = 1000

Goal

• Identify correlations between user’s context and item characteristics

• Filter predictions performed by a traditional recommender

Inputs to the system

• Input to the recommender system
  – URM (user rating matrix)
• Input to the contextual rule miner
  – CFM: User context × Item features
  – number of ratings that users in context c gave to items with feature f
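One way to build such a context-by-feature count table is shown below. It is a sketch under assumptions: contexts and features are encoded as plain strings, a rating by a user with several context values on an item with several features increments every matching cell, and the name `build_cfm` and the toy data are hypothetical.

```python
from collections import Counter

def build_cfm(ratings, user_context, item_features):
    """Count ratings for every (context value, item feature) pair."""
    cfm = Counter()
    for user, item, _rating in ratings:
        for c in user_context[user]:
            for f in item_features[item]:
                cfm[(c, f)] += 1
    return cfm

# Hypothetical toy data.
user_context = {
    "u1": ["gender=M", "age=[20-25]"],
    "u2": ["gender=F", "age=[35-44]"],
}
item_features = {
    "i1": ["genre=fantasy"],
    "i2": ["genre=drama", "genre=romance"],
}
ratings = [("u1", "i1", 4), ("u1", "i2", 3), ("u2", "i2", 5)]
cfm = build_cfm(ratings, user_context, item_features)
```

Each cell `cfm[(c, f)]` is then the number of ratings that users in context `c` gave to items with feature `f`.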

Creation of the transactional dataset

• UCM → Transactional dataset
• Example: a rating given by a Male with age [20-25] to a fantasy movie is included in the transactional dataset as

(gender = M)(age = [20-25])(genre = fantasy)
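The conversion from ratings to transactions could look like the sketch below. It assumes one transaction per rating and per genre of the rated movie (how multi-genre movies are handled is not stated in the slides), and the function name and data layout are hypothetical.

```python
def to_transactions(ratings, user_context, item_genres):
    """One transaction per rating and genre: the user's context plus the genre."""
    transactions = []
    for user, item, _rating in ratings:
        ctx = user_context[user]
        for genre in item_genres[item]:
            transactions.append((
                f"gender = {ctx['gender']}",
                f"age = {ctx['age']}",
                f"genre = {genre}",
            ))
    return transactions

# Hypothetical toy data matching the example in the slide.
user_context = {"u1": {"gender": "M", "age": "[20-25]"}}
item_genres = {"i1": ["fantasy"]}
ratings = [("u1", "i1", 5)]
transactions = to_transactions(ratings, user_context, item_genres)
```

The resulting list of item tuples is the shape a frequent-pattern miner such as FP-growth typically expects as input.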

Rule mining: example

• The following two rules are extracted for the context (gender = M):

(gender = M) → (genre = horror)
(gender = M) → (genre = action)

• It follows that only horror and action movies can be recommended to male users
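The post-filtering step implied by this example can be sketched as a filter over a ranked list produced by any recommender. The names `post_filter`, `rules`, and the toy items are hypothetical, and the sketch assumes that a context with no mined rules yields no recommendations.

```python
def post_filter(ranked_items, context, item_genres, rules, n=5):
    """Keep only items with at least one genre allowed for the context."""
    allowed = rules.get(context, set())
    kept = [i for i in ranked_items if allowed & set(item_genres[i])]
    return kept[:n]

# Rules from the example: male users → horror or action.
rules = {"gender = M": {"horror", "action"}}

# Hypothetical items and a hypothetical ranking from a traditional recommender.
item_genres = {"m1": ["romance"], "m2": ["horror", "sci-fi"], "m3": ["action"]}
ranked = ["m1", "m2", "m3"]
filtered = post_filter(ranked, "gender = M", item_genres, rules, n=2)
```

The original ranking order is preserved; the rules only decide which items survive.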

Example of rules …

Gender  Age    Genre       Prob.  Support
F       35-44  Drama       35%    17000
               Comedy      32%    16000
               Romance     18%     9000
               Children’s   8%     4000
               Musical      5%     2500
               Animation    4%     2000
               Mystery      4%     2000
               Fantasy      3%     1500

Example of rules …

Gender  Age    Genre      Prob.  Support
M       35-44  Action     23%    34000
               Thriller   16%    24000
               Sci-Fi     14%    21000
               Adventure  12%    17000
               War         7%    10000
               Horror      6%     9000
               Mystery     3%     5000
               Western     2%     3000
               Noir        2%     3000

Two options

• Keep all of the rules

• Keep only rules with a “large” confidence (15%)

• In any case, we keep only rules with a large support (>1000 ratings)
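The two options can be expressed as one selection function with an optional confidence threshold. This is a sketch: the dictionary layout and the name `select_rules` are hypothetical, treating the 15% threshold as inclusive is an assumption, and the sample rows are taken from the (F, 35-44) example table.

```python
def select_rules(rules, min_support=1000, min_confidence=None):
    """Keep rules with support > min_support and, optionally, enough confidence."""
    kept = []
    for rule in rules:
        if rule["support"] <= min_support:
            continue
        # Assumption: the confidence threshold is inclusive.
        if min_confidence is not None and rule["confidence"] < min_confidence:
            continue
        kept.append(rule)
    return kept

# A few rows from the (gender = F, age = 35-44) example table.
rules = [
    {"genre": "Drama", "confidence": 0.35, "support": 17000},
    {"genre": "Comedy", "confidence": 0.32, "support": 16000},
    {"genre": "Romance", "confidence": 0.18, "support": 9000},
    {"genre": "Children's", "confidence": 0.08, "support": 4000},
    {"genre": "Fantasy", "confidence": 0.03, "support": 1500},
]
all_rules = select_rules(rules)                         # option 1
strong_rules = select_rules(rules, min_confidence=0.15) # option 2
```

With these rows, the first option keeps all five genres, while the second keeps only Drama, Comedy, and Romance.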

Results

Recall at 5 on long-tail items

Long tail    Without context    With context
PureSVD      21%                30%
AsySVD       7%                 12%
NNCosNgbr    10%                24%
CorNgbr      4%                 9%
TopPop       0.1%               8%
MovieAvg     1%                 5%

Removed 5% of the most popular items, accounting for 33% of ratings

Recall with non-personal methods

Recall with neighborhood methods

Recall with latent-factors methods

Thanks for your attention

Q&A

For any further information, please contact

Paolo Cremonesi, paolo.cremonesi@polimi.it
