Collaborative Filtering Recommendation Techniques Yehuda Koren
Transcript
Page 1:

Collaborative Filtering Recommendation Techniques

Yehuda Koren

Page 2:

Recommendation Types

• Editorial
• Simple aggregates: Top 10, Most Popular, Recent Uploads
• Tailored to individual users: books, CDs, and other products at amazon.com; movies by Netflix, MovieLens; TV shows by TiVo; …

Page 3:

Recommendation Process

• Collecting “known” user-item ratings
• Extrapolating unknown ratings from the known ratings
  – Estimate ratings for the items that have not been seen by a user
• Recommending the items with the highest estimated ratings to a user

Page 4:

Collaborative Filtering

Page 5:

Collaborative filtering
• Recommend items based on the past transactions of many users
• Analyze relations between users and/or items
• Specific data characteristics are irrelevant
  – Domain-free: user/item attributes are not necessary
  – Can identify elusive aspects

Page 6:

Page 7:

“We’re quite curious, really. To the tune of one million dollars.” – Netflix Prize rules

• Goal: improve on Netflix’s existing movie recommendation technology, Cinematch
• Criterion: reduction in root mean squared error (RMSE)
• Oct’06: Contest began
• Oct’07: $50K progress prize for 8.43% improvement
• Oct’08: $50K progress prize for 9.44% improvement
• Sept’09: $1 million grand prize for 10.06% improvement

Page 8:

Movie rating data

[Tables: example training data (user, movie, score) triples and test data (user, movie, ?) triples]

• Training data
  – 100 million ratings
  – 480,000 users
  – 17,770 movies
  – 6 years of data: 2000-2005
• Test data
  – Last few ratings of each user (2.8 million)
• Dates of ratings are given

Page 9:

Test Data Split into Three Pieces

• Probe
  – Ratings released
  – Allows participants to assess methods directly
• Daily submissions allowed for the combined Quiz/Test data
  – RMSE released for Quiz
  – Prizes based on Test RMSE
  – Identity of Quiz cases withheld
  – Test RMSE withheld

[Diagram: All Data (~103M user-item pairs) is split into Training Data and a Hold-out Set (last 9 ratings for each user: 4.2M pairs); the hold-out set is randomly split 3 ways into Probe (labels provided), Quiz, and Test (labels retained by Netflix for scoring)]

Page 10:

#ratings per user

• Avg #ratings/user: 208

Page 11:

Most Active Users

User ID    # Ratings   Mean Rating
305344     17,651      1.90
387418     17,432      1.81
2439493    16,560      1.22
1664010    15,811      4.26
2118461    14,829      4.08
1461435     9,820      1.37
1639792     9,764      1.33
1314869     9,739      2.95

Page 12:

#ratings per movie

• Avg #ratings/movie: 5627

Page 13:

Movies Rated Most Often

Title                      # Ratings   Mean Rating
Miss Congeniality           227,715     3.36
Independence Day            216,233     3.72
The Patriot                 200,490     3.78
The Day After Tomorrow      194,695     3.44
Pretty Woman                190,320     3.90
Pirates of the Caribbean    188,849     4.15
The Green Mile              180,883     4.31
Forrest Gump                180,736     4.30

Page 14:

Important RMSEs

[Figure: a scale from “erroneous” to “accurate”, annotated with “Personalization”; random ranking and a bestsellers list appear as additional reference points]

Global average: 1.1296
User average: 1.0651
Movie average: 1.0533
Cinematch: 0.9514 (baseline)
Prize’07 (BellKor): 0.8712
Prize’08 (BellKor+BigChaos): 0.8616
Grand Prize (BellKor’s Pragmatic Chaos): 0.8554
Inherent noise: ????

Page 15:

Problem Definition

[Figure: a users × movies rating matrix with only a few known entries (values such as 1, 4, 3, 4, 2); the remaining entries are unknown and must be predicted]

Page 16:

Users/items arranged in ratings space

[Figure: the sparse ratings matrix, with users #1…#480,000 along one dimension and items #1…#17,770 along the other]

• Dim(Users) ≠ Dim(Items) (e.g., 480,000 users vs. 17,770 items)
• Sparse data, with non-uniformly missing entries

Latent factorization algorithm
Latent factor methods

Page 17:

Latent factorization algorithm

Users/items arranged in a joint dense latent factor space

[Figure: each user (User-1 … User-480K) and each item (Item-1 … Item-17,770) is represented by a dense vector of latent factor values]

Page 18:

A 2-D factor space

[Figure: movies and users placed in a two-dimensional factor space. One axis runs from “geared towards females” to “geared towards males”, the other from “serious” to “escapist”. Example movies: The Princess Diaries, Sense and Sensibility, The Color Purple, Amadeus, The Lion King, Independence Day, Ocean’s 11, Braveheart, Lethal Weapon, Dumb and Dumber. Example users: Gus and Dave.]

Page 19:

Basic matrix factorization model

[Figure: the users × items rating matrix is approximated (~) by the product of a users × 3 user-factor matrix and a 3 × items item-factor matrix: a rank-3 SVD approximation]

Page 20:

Estimate unknown ratings as inner products of factors:

[Figure: the same rank-3 factorization; a missing entry of the rating matrix (marked “?”) is estimated as the inner product of the corresponding user-factor and item-factor vectors]


Page 22:

Estimate unknown ratings as inner products of factors:

[Figure: the same rank-3 factorization; the previously missing entry is now estimated as 2.4]

Page 23:

Matrix factorization model

[Figure: the rating matrix approximated (~) by the product of a user-factor matrix and an item-factor matrix, as on the previous slides]

Idea:
• Approximate the rating matrix as the product of two lower-rank matrices: R ≈ PQ

Properties:
• SVD isn’t defined when entries are unknown ⇒ use specialized methods
• Very powerful model ⇒ can easily overfit; sensitive to regularization
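To make the rank-3 idea of the preceding slides concrete, here is a minimal numpy sketch (my illustration, with made-up ratings) of a rank-3 truncated SVD of a fully observed matrix. As the slide notes, with missing entries plain SVD is undefined, which is why the specialized fitting methods on the following slides are used instead.

```python
import numpy as np

# Toy, fully observed user x item rating matrix (values are illustrative only).
R = np.array([
    [5., 4., 4., 1., 2., 1.],
    [4., 5., 3., 1., 1., 2.],
    [1., 2., 1., 4., 5., 4.],
    [2., 1., 2., 5., 4., 5.],
    [3., 3., 3., 3., 3., 3.],
])

k = 3
U, s, Vt = np.linalg.svd(R, full_matrices=False)
P = U[:, :k] * s[:k]     # user factors (users x k)
Q = Vt[:k, :]            # item factors (k x items)
R_approx = P @ Q         # rank-3 approximation: R ~ PQ

print(np.round(R_approx, 2))
print("Frobenius error:", round(float(np.linalg.norm(R - R_approx)), 3))
```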

Page 24:

A regularized model

• User factors: model a user u as a vector p_u ~ N_k(µ, Σ)
• Movie factors: model a movie i as a vector q_i ~ N_k(γ, Λ)
• Ratings: measure the “agreement” between u and i: r_ui ~ N(p_u^T q_i, ε²)
• Simplifying assumptions: µ = γ = 0, Σ = Λ = λI
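A short derivation (my addition, not on the slide) of how these Gaussian assumptions lead to the regularized cost on the next slide: the negative log-posterior of the factors given the observed ratings is, up to constants,

\sum_{(u,i)\,\text{known}} \frac{(r_{ui} - p_u^T q_i)^2}{2\varepsilon^2} + \sum_u \frac{\|p_u\|^2}{2\lambda} + \sum_i \frac{\|q_i\|^2}{2\lambda}

Multiplying through by 2ε² shows that maximizing the posterior is equivalent to minimizing

\sum_{(u,i)\,\text{known}} (r_{ui} - p_u^T q_i)^2 + \lambda' \Big( \sum_u \|p_u\|^2 + \sum_i \|q_i\|^2 \Big), \qquad \lambda' = \varepsilon^2 / \lambda

i.e., squared prediction error plus an L2 penalty whose strength is the noise-to-prior variance ratio.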

Page 25:

Matrix factorization as a cost function

Rating prediction: \hat{r}_{ui} = p_u^T q_i

\min_{p_*, q_*} \sum_{(u,i)\,\text{known}} (r_{ui} - p_u^T q_i)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 \right)

(first term: prediction error; second term: regularization)

Notation: p_u - user factor of u; q_i - item factor of i; r_ui - rating by u for i

• Optimize by either stochastic gradient descent or alternating least squares

Page 26:

Stochastic gradient descent optimization

• Perform until convergence. For each training example r_ui:
  – Compute the prediction error: e_ui = r_ui - p_u^T q_i
  – Update the item factor: q_i ← q_i + γ (e_ui · p_u - λ q_i)
  – Update the user factor: p_u ← p_u + γ (e_ui · q_i - λ p_u)

• Two constants to tune: γ (step size) and λ (regularization)
• Cross-validation: find the values that minimize the error on a held-out test set

Notation: r_ui - rating by user u for item i
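A minimal Python sketch of the update rules above (an illustration, not the contest code; the hyperparameter values and the `ratings` triple format are my assumptions):

```python
import numpy as np

def sgd_matrix_factorization(ratings, n_users, n_items, k=20,
                             gamma=0.005, lam=0.02, n_epochs=20, seed=0):
    """ratings: iterable of (u, i, r) triples with 0-based user/item ids."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))   # user factors p_u
    Q = 0.1 * rng.standard_normal((n_items, k))   # item factors q_i
    for _ in range(n_epochs):                     # "perform until convergence"
        for u, i, r in ratings:
            e = r - P[u] @ Q[i]                   # prediction error e_ui
            q_old = Q[i].copy()
            Q[i] += gamma * (e * P[u] - lam * Q[i])   # item-factor update
            P[u] += gamma * (e * q_old - lam * P[u])  # user-factor update
    return P, Q

# Predicted rating for (u, i): P[u] @ Q[i]
# gamma (step size) and lam (regularization) would be tuned by cross-validation.
```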

Page 28:

Page 30:

Data normalization

• Most of the variability in the observed data is driven by user-specific and item-specific effects, regardless of the user-item interaction
• Examples:
  – Some movies are systematically rated higher
  – Some movies were rated by users who tend to rate low
  – Ratings change over time
• The data must be adjusted to account for these main effects
• This stage requires the most insight into the nature of the data
• Can make a big difference…

Page 31:

Components of a rating predictor

\hat{r}_{ui} = b_u + b_i + p_u^T q_i
(user bias + item bias + user-item interaction)

User-item interaction
• Characterizes the match between users and items
• Attracts most research in the field
• Benefits from algorithmic and mathematical innovations

Biases
• Separate users and movies
• Often overlooked
• Benefit from insights into users’ behavior

Page 32:

A bias estimator

• We have expectations about the rating by user u for item i, even without estimating u’s attitude towards items like i:
  – The rating scale of user u
  – The values of other ratings the user gave recently
  – The (recent) popularity of item i
  – Selection bias

Page 33:

Biases: an example

• Mean rating: 3.7 stars
• The Sixth Sense is 0.5 stars above average
• Joe rates 0.2 stars below average

Baseline estimation: Joe will rate The Sixth Sense 4 stars
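Written out with the bias notation of the surrounding slides (with µ denoting the overall mean rating), the baseline estimate is simply

b_{ui} = \mu + b_i + b_u = 3.7 + 0.5 - 0.2 = 4.0 \text{ stars}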

Page 34:

Sources of variance in the Netflix data

[Pie chart] Biases: 33%, Personalization: 10%, Unexplained: 57%

1.276 (total variance) = 0.415 (biases) + 0.129 (personalization) + 0.732 (unexplained)

Biases matter!

Page 35:

Exploring Temporal Effects

[Figure: Netflix ratings plotted by date, with early 2004 marked]

Something happened in early 2004…

Page 36:

Are movies getting better with time?

Page 37:

Multiple sources of temporal dynamics

• Item-side effects:
  – Product perception and popularity are constantly changing
  – Seasonal patterns influence items’ popularity
• User-side effects:
  – Customers redefine their taste
  – Transient, short-term bias; anchoring
  – Drifting rating scale
  – Change of rater within a household

Page 38:

Introducing temporal dynamics into biases

• Biases tend to capture the most pronounced aspects of the temporal dynamics
• We observe changes in:
  1. The rating scale of individual users (user bias)
  2. The popularity of individual items (item bias)

\hat{r}_{ui} = b_u + b_i + p_u^T q_i
⇓ add temporal dynamics
\hat{r}_{ui}(t) = b_u(t) + b_i(t) + p_u^T q_i
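One simple way to make a bias time-dependent, sketched below, is to split the rating period into bins and learn a per-bin offset; this is an illustrative simplification, and the exact functional forms used in the contest models are not shown on this slide.

```python
def time_bin(day, first_day, last_day, n_bins=30):
    """Map a rating day (days since the start of the data) to one of n_bins bins."""
    frac = (day - first_day) / max(1, last_day - first_day)
    return min(n_bins - 1, int(frac * n_bins))

def item_bias_at(i, day, first_day, last_day, b_item, b_item_bin):
    """Time-dependent item bias b_i(t) = b_i + b_{i,Bin(t)}.
    b_item: static per-item bias array; b_item_bin: (n_items, n_bins) per-bin offsets."""
    return b_item[i] + b_item_bin[i, time_bin(day, first_day, last_day)]
```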

Page 39:

General Lessons and Experience

Page 40:

Some users are more predictable…

• 1.4M predictions are split into 10 equal bins based on #ratings per user

[Figure: RMSE (y-axis, 0.7 to 1.0) vs. #ratings per user, with bins centered at 12, 24.4, 38.1, 55.4, 80.9, 119.4, 176.5, 264.8, 420.9, and 918.5 ratings]

Page 41:

Factor models: error vs. #parameters

[Figure: RMSE (roughly 0.875 to 0.91) vs. millions of parameters (log scale, ~10 to ~100,000) for several factor models (NMF, BiasSVD, SVD++, SVD v.2, SVD v.3, SVD v.4), each plotted at several factor dimensionalities (50 to 1500). Reference lines: Netflix: 0.9514; Prize: 0.8563]

Page 42:

Ratings are not given at random!

[Figure: distribution of ratings in Yahoo! survey answers, Yahoo! music ratings, and Netflix ratings]

Marlin, Zemel, Roweis, Slaney, “Collaborative Filtering and the Missing at Random Assumption”, UAI 2007

Page 43:

Which movies do users rate?

• A powerful source of information: characterize users by which movies they rated, rather than by how they rated them
• A dense binary representation of the data:

[Figure: the sparse users × movies rating matrix R = {r_ui}_{u,i} next to a binary users × movies matrix B = {b_ui}_{u,i}, where b_ui indicates whether user u rated movie i]
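A tiny numpy sketch (my illustration) of building the binary matrix B from a rating matrix R stored with NaN for missing entries:

```python
import numpy as np

# Dense toy rating matrix; np.nan marks movies a user did not rate.
R = np.array([
    [5.,     np.nan, 4.,     np.nan],
    [np.nan, 3.,     np.nan, 2.],
    [1.,     np.nan, np.nan, 4.],
])

B = (~np.isnan(R)).astype(int)   # b_ui = 1 iff user u rated movie i
print(B)
# [[1 0 1 0]
#  [0 1 0 1]
#  [1 0 0 1]]
```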

Page 44:

The Wisdom of Crowds (of Models)

• “All models are wrong; some are useful” – G. Box
  – Some miss strong “local” relationships, e.g., among sequels
  – Others miss the cumulative effect of many small signals
  – Each complements the others
• Our best entry during Year 1 was a linear combination of 107 sets of predictions
• Our final solution was a linear blend of over 700 prediction sets (see the sketch after this list)
  – Many variations of model structure and parameter settings
• Mega-blends are not needed in practice
  – A handful of simple models achieves 90% of the improvement of the full blend
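A minimal sketch (my illustration, not the team’s actual blending code) of a linear blend: fit weights by least squares on a held-out set such as the Probe set, then apply them to new predictions. The array names are assumptions.

```python
import numpy as np

def fit_linear_blend(preds, truth):
    """preds: (n_examples, n_models) per-model predictions on a held-out set.
    truth: (n_examples,) true ratings. Returns blend weights, including an intercept."""
    X = np.column_stack([np.ones(len(truth)), preds])
    w, *_ = np.linalg.lstsq(X, truth, rcond=None)
    return w

def apply_blend(w, preds):
    X = np.column_stack([np.ones(preds.shape[0]), preds])
    return X @ w

# Usage (shapes only): w = fit_linear_blend(probe_preds, probe_ratings)
#                      blended = apply_blend(w, quiz_preds)
```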

Page 45:

Effect of ensemble size

[Figure: error (RMSE, roughly 0.865 to 0.881) vs. the number of predictors in the blend (1 to about 57)]

Page 46:

Talk announcement

Save the date: May 24, 11:00 AM @ Schreiber

Speaker: Prof. Ricardo Baeza-Yates, Yahoo! VP of Research for Europe and Latin America

Page 47:

[Figure: the example rating matrix from the earlier slides]

Yehuda Koren
Yahoo! Research
[email protected]

homepage: www.research.att.com/~volinsky/netflix/