Recommender Systems. Based on Rajaraman and Ullman: Mining of Massive Datasets & Francesco Ricci et al.: Recommender Systems Handbook.
Feb 25, 2016
Recommender System
All of these thrive on User Generated Content (UGC)!
Recommender System
Central theme:
• Predict ratings for unrated items.
• Recommend top-k items.
RS – Major Approaches
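• Basic question: given a ratings matrix $R$ (highly incomplete/sparse) and given a user $u$ and an item $i$, predict the rating $r_{ui}$.
[Figure: example sparse ratings matrix. Observed entries by row: (1, 3, 5), (1, 4, 4), (4, 2, 3), (3, 5, 4), (4, 4, 3); the positions of the empty cells are not recoverable.]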
RS – Approaches
• Content-based: how similar is item $i$ to the items that user $u$ has rated/liked in the past?
 – Use metadata for measuring similarity.
 + Works even when no ratings are available on the affected items.
 - Requires metadata!
• Collaborative Filtering: identify items (users) with their rating vectors; no need for metadata, but cold start is a problem.
RS – Approaches
• CF can be memory-based (as sketched on p5): item $i$'s "characteristics" are captured by the ratings it has received (its rating vector).
• Or it can be model-based: model user/item behavior via latent factors (to be learned from data).
 – Dimensionality reduction: the original ratings matrix is usually well approximated by a (very) low-rank matrix.
 – Matrix completion:
   • using singular value decomposition (SVD);
   • using matrix factorization (MF) [and variants].
• MovieLens: an example of an RS using CF.
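The model-based idea can be illustrated with a small matrix factorization fitted by stochastic gradient descent. This is only a sketch under stated assumptions: the toy ratings, the hyperparameters (`n_factors`, `lr`, `reg`, `n_epochs`) and the function names are hypothetical, and the slides do not prescribe this particular training procedure.

```python
import math
import random

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.01, reg=0.02, seed=0):
    """Fit r_ui ~ p_u . q_i by SGD, using the observed ratings only.

    ratings maps (user, item) -> rating; returns the factor dicts (P, Q).
    """
    rng = random.Random(seed)
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    P = {u: [rng.uniform(0.1, 0.5) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(0.1, 0.5) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for (u, i), r in ratings.items():
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: inner product of the user and item latent factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root mean squared error on the observed ratings."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for (u, i), r in ratings.items())
    return math.sqrt(se / len(ratings))

# Hypothetical toy data: 3 users, 3 items, 6 of the 9 cells observed.
ratings = {("u1", "i1"): 5, ("u1", "i2"): 3, ("u2", "i1"): 4,
           ("u2", "i3"): 1, ("u3", "i2"): 2, ("u3", "i3"): 5}
P0, Q0 = factorize(ratings, n_epochs=0)   # untrained factors
P, Q = factorize(ratings)                 # trained factors
missing = predict(P, Q, "u1", "i3")       # estimate for an unobserved cell
```

Every cell of the matrix, observed or not, gets an estimate from the learned factors, which is what "matrix completion" refers to.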
Collaborative Filtering
Key concepts/questions
• How is user feedback expressed: explicit ratings or implicit signals?
• How to measure similarity?
• How many nearest neighbors to pick (if memory- or neighborhood-based)?
• How to predict unknown ratings?
• Distinguished (also called active) user $u$ and (target) item $i$.
A Naïve Algorithm (memory-based)
• Find the top-$N$ most similar neighbors of the distinguished user $u$ (using the chosen similarity or proximity measure).
• For every item $i$ rated by sufficiently many of these neighbors, compute the predicted rating $\hat{r}_{ui}$ by aggregating the ratings given to $i$ by the neighbors chosen above.
• Sort the items by predicted rating and recommend the top-$k$ items to $u$.
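The three steps above can be sketched as follows. The similarity measure (cosine on raw ratings), the aggregation (similarity-weighted average), the parameter names `n_neighbors`, `min_raters` and `top_k`, and the toy data are all assumptions made for illustration; the slides leave these choices open.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse rating vectors (dicts item -> rating)."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(ratings, u, n_neighbors=2, min_raters=1, top_k=2):
    # Step 1: the n_neighbors users most similar to u.
    sims = {v: cosine(ratings[u], ratings[v]) for v in ratings if v != u}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:n_neighbors]
    # Step 2: predict a rating for every item that u has not rated and that
    # at least min_raters of the chosen neighbors have rated.
    candidates = {i for v in neighbors for i in ratings[v]} - set(ratings[u])
    predictions = {}
    for i in candidates:
        raters = [v for v in neighbors if i in ratings[v]]
        weight = sum(sims[v] for v in raters)
        if len(raters) >= min_raters and weight > 0:
            predictions[i] = sum(sims[v] * ratings[v][i] for v in raters) / weight
    # Step 3: sort by predicted rating and keep the top_k items.
    ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "u": {"a": 5, "b": 4},
    "v": {"a": 5, "b": 4, "c": 5, "d": 2},
    "w": {"a": 4, "b": 5, "c": 4},
    "x": {"a": 1, "d": 5},
}
recs = recommend(ratings, "u")
strict = recommend(ratings, "u", min_raters=2)
```

Raising `min_raters` trades coverage for reliability: with `min_raters=2`, item `d` is dropped because only one of the two chosen neighbors rated it.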
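An Example
(Utility matrix from the Rajaraman and Ullman book: users A to D, seven movies; "." marks an unrated item.)

     HP1  HP2  HP3  TW   SW1  SW2  SW3
A    4    .    .    5    1    .    .
B    5    5    4    .    .    .    .
C    .    .    .    2    4    5    .
D    .    3    .    .    .    .    3

• Jaccard(A,B) = 1/5 < 2/4 = Jaccard(A,C)!
• Cosine(A,B) ≈ 0.380 > 0.322 ≈ Cosine(A,C): OK, but ignores internal "rating scales" (easy vs. hard graders).
• See the Rajaraman et al. book for "rounded" Jaccard/cosine.
• A more principled approach: subtract from each rating the corresponding user's mean rating, then apply Jaccard/cosine.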
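The similarity values in this example can be checked with a few lines of code. The sketch assumes that the slide's ratings are the utility matrix from the Rajaraman and Ullman book; the movie labels (`HP1`, `TW`, `SW1`, ...) are taken from that book and are an assumption here, as are the function names.

```python
import math
from fractions import Fraction

# Assumed utility matrix (user -> {movie: rating}); unrated movies are absent.
ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP2": 3, "SW3": 3},
}

def jaccard(a, b):
    """Jaccard similarity of the sets of rated items (rating values ignored)."""
    a_items, b_items = set(a), set(b)
    return Fraction(len(a_items & b_items), len(a_items | b_items))

def cosine(a, b):
    """Cosine similarity, treating unrated items as 0."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

jac_ab = jaccard(ratings["A"], ratings["B"])   # 1/5
jac_ac = jaccard(ratings["A"], ratings["C"])   # 1/2, i.e. 2/4
cos_ab = cosine(ratings["A"], ratings["B"])    # about 0.380
cos_ac = cosine(ratings["A"], ratings["C"])    # about 0.322
```

Jaccard ranks C as closer to A than B is, although A and C rate their shared movies very differently; raw cosine ranks B first, but treats a missing rating as a 0, i.e. as a strong dislike.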
An Example
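(Ratings after subtracting each user's mean rating. Rows are users A to D, columns are the seven movies of the Rajaraman and Ullman example; "." marks an unrated item.)

     HP1   HP2   HP3   TW    SW1   SW2   SW3
A    2/3   .     .     5/3   -7/3  .     .
B    1/3   1/3   -2/3  .     .     .     .
C    .     .     .     -5/3  1/3   4/3   .
D    .     0     .     .     .     .     0

• See what just happened to the ratings!
• Behavior and items are now better separated.
• Cosine can now be + or -: check (A,B) and (A,C).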
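The check suggested in the last bullet can be done in code. As before, the sketch assumes the utility matrix from the Rajaraman and Ullman book (users A to D, with movie labels borrowed from the book); the function names are hypothetical.

```python
import math

# Assumed utility matrix (user -> {movie: rating}); unrated movies are absent.
ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP2": 3, "SW3": 3},
}

def center(user_ratings):
    """Subtract the user's mean rating from each of their ratings."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    return {i: r - mean for i, r in user_ratings.items()}

def cosine(a, b):
    """Cosine similarity of sparse vectors; 0.0 if either vector is all zeros."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

centered = {u: center(r) for u, r in ratings.items()}
cos_ab = cosine(centered["A"], centered["B"])   # about 0.092
cos_ac = cosine(centered["A"], centered["C"])   # about -0.559
```

After centering, A and B come out weakly similar while A and C come out clearly dissimilar, which matches the intuition that A and C disagree on the movies they both rated. User D rated everything 3, so D's centered vector is all zeros and carries no similarity signal.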
Prediction using Memory/Neighborhood-based approaches
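• A popular approach: using the Pearson correlation coefficient.
$$\hat{r}_{ui} = \bar{r}_u + \kappa \sum_{v \in N(u)} \mathrm{sim}(u,v)\,(r_{vi} - \bar{r}_v)$$
• where
$$\mathrm{sim}(u,v) = \frac{\sum_{j \in I_{uv}} (r_{uj} - \bar{r}_u)(r_{vj} - \bar{r}_v)}{\sqrt{\sum_{j \in I_{uv}} (r_{uj} - \bar{r}_u)^2}\,\sqrt{\sum_{j \in I_{uv}} (r_{vj} - \bar{r}_v)^2}}$$
 i.e., the cosine of the "vectors of deviations from the mean".
• $\kappa$: normalization factor $= 1 / \sum_{v \in N(u)} |\mathrm{sim}(u,v)|$.
• Here $N(u)$ is the set of chosen neighbors of $u$ who rated $i$, $I_{uv}$ is the set of items rated by both $u$ and $v$, and $\bar{r}_u$ is user $u$'s mean rating.
• See the RecSys handbook and [Adomavicius and Tuzhilin, TKDE 2005] for alternatives.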
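A minimal sketch of Pearson-based prediction: the estimate is the user's mean rating plus a normalized, similarity-weighted sum of the neighbors' deviations from their own means. Several details are assumptions made for illustration: every other user who rated the item is used as a neighbor (no top-N selection), user means are taken over all of a user's ratings, and the toy data and function names are hypothetical.

```python
import math

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ratings, u, v):
    """Cosine of the mean-centered rating vectors over items rated by both."""
    mu_u, mu_v = mean(ratings[u].values()), mean(ratings[v].values())
    common = sorted(ratings[u].keys() & ratings[v].keys())
    dev_u = [ratings[u][j] - mu_u for j in common]
    dev_v = [ratings[v][j] - mu_v for j in common]
    norm = (math.sqrt(sum(d * d for d in dev_u))
            * math.sqrt(sum(d * d for d in dev_v)))
    return sum(a * b for a, b in zip(dev_u, dev_v)) / norm if norm else 0.0

def predict(ratings, u, i):
    """User's mean plus the normalized weighted sum of neighbor offsets."""
    mu_u = mean(ratings[u].values())
    neighbors = [v for v in ratings if v != u and i in ratings[v]]
    sims = {v: pearson(ratings, u, v) for v in neighbors}
    total = sum(abs(s) for s in sims.values())   # 1 / normalization factor
    if total == 0:
        return mu_u                               # no usable neighbor
    offset = sum(sims[v] * (ratings[v][i] - mean(ratings[v].values()))
                 for v in neighbors)
    return mu_u + offset / total

# Hypothetical toy data: v agrees with u, w disagrees with u.
ratings = {
    "u": {"a": 4, "b": 2, "c": 5},
    "v": {"a": 5, "b": 1, "c": 4, "d": 5},
    "w": {"a": 1, "b": 5, "c": 2, "d": 1},
}
pred = predict(ratings, "u", "d")
```

In this toy example `w` has a negative correlation with `u`, so `w`'s below-average rating of `d` pushes the prediction up, just as `v`'s above-average rating does.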
User-User vs Item-Item.
• User-User CF: what we just discussed!
• Item-Item: dual in principle. Find the items most similar to the distinguished item $i$; for every user $u$ who did not rate $i$ but rated sufficiently many items from the similarity group, compute $\hat{r}_{ui}$.
• In practice, item-item has often been found to work better than user-user.
Simpler Alternatives for Rating Estimation
• Simple average of the ratings by the most similar neighbors.
• Weighted average.
• User's mean plus an offset: the weighted average of the offsets (deviations from their own means) of the most similar neighbors (Pearson!).
• Or take the popular vote among the most similar neighbors: e.g., $u$ has 5 most similar neighbors who have rated $i$.
 – The neighbors' ratings of $i$ take the values 1, 3, 4, and 5.
 – Simple majority: the rating value given by the most neighbors wins.
 – Votes can instead be weighted by each neighbor's similarity to $u$.
 – Tie-breaking is arbitrary.
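The popular-vote idea can be sketched in two variants, a simple majority and a similarity-weighted vote. The neighbor names, their ratings and the similarity values below are entirely hypothetical and are not meant to reproduce the slide's example.

```python
from collections import Counter, defaultdict

def majority_vote(neighbor_ratings):
    """Rating value chosen by the most neighbors (ties broken arbitrarily)."""
    counts = Counter(neighbor_ratings.values())
    return counts.most_common(1)[0][0]

def weighted_vote(neighbor_ratings, sims):
    """Rating value with the largest total similarity behind it."""
    score = defaultdict(float)
    for v, r in neighbor_ratings.items():
        score[r] += sims[v]
    return max(score, key=score.get)

# Hypothetical neighbors of u, their ratings of item i, and similarities to u.
neighbor_ratings = {"v1": 1, "v2": 1, "v3": 3, "v4": 4, "v5": 5}
sims = {"v1": 0.2, "v2": 0.3, "v3": 0.4, "v4": 0.5, "v5": 1.0}

simple = majority_vote(neighbor_ratings)          # two of five neighbors say 1
weighted = weighted_vote(neighbor_ratings, sims)  # v5 alone outweighs v1 + v2
```

The two variants can disagree: here the majority picks 1, while the weighted vote picks 5 because the single most similar neighbor carries more weight than the two neighbors who voted for 1.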
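Item-based CF
• Dual to user-based CF, in principle.
• "People who bought $X$ also bought $Y$".
• Natural connection to association rules (each user = a transaction).
• Predict the unknown rating of user $u$ on item $i$ as the aggregate of the ratings by $u$ on items similar to $i$.
• E.g., using mean-centering and Pearson correlation for item-item similarity,
$$\hat{r}_{ui} = \bar{r}_i + \kappa \sum_{j \in N_u(i)} \mathrm{sim}(i,j)\,(r_{uj} - \bar{r}_j)$$
where $\bar{r}_i$ is the mean rating of $i$ by various users, $\mathrm{sim}(i,j)$ is the similarity b/w $i$ and $j$, $N_u(i)$ is the set of items rated by $u$ that are most similar to $i$, and $\kappa = 1 / \sum_{j \in N_u(i)} |\mathrm{sim}(i,j)|$ is the usual normalization factor.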
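The item-based prediction can be sketched as the mirror image of the user-based one: start from the item's mean rating and add a normalized, similarity-weighted sum of the user's deviations on similar items. As assumptions for illustration, all items rated by the user are used as neighbors (no top-N selection), item means are taken over all of an item's ratings, and the toy data and function names are hypothetical.

```python
import math
from collections import defaultdict

def transpose(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = defaultdict(dict)
    for u, row in ratings.items():
        for i, r in row.items():
            by_item[i][u] = r
    return by_item

def item_pearson(by_item, means, i, j):
    """Pearson-style similarity of items i and j over their common raters."""
    common = sorted(by_item[i].keys() & by_item[j].keys())
    dev_i = [by_item[i][u] - means[i] for u in common]
    dev_j = [by_item[j][u] - means[j] for u in common]
    norm = (math.sqrt(sum(d * d for d in dev_i))
            * math.sqrt(sum(d * d for d in dev_j)))
    return sum(a * b for a, b in zip(dev_i, dev_j)) / norm if norm else 0.0

def predict(ratings, u, i):
    """Item's mean plus the normalized weighted sum of u's offsets."""
    by_item = transpose(ratings)
    means = {j: sum(col.values()) / len(col) for j, col in by_item.items()}
    sims = {j: item_pearson(by_item, means, i, j) for j in ratings[u] if j != i}
    total = sum(abs(s) for s in sims.values())
    if total == 0:
        return means[i]                  # no usable similar item
    offset = sum(s * (ratings[u][j] - means[j]) for j, s in sims.items())
    return means[i] + offset / total

# Hypothetical toy data: user "u" has not rated item "i".
ratings = {
    "p": {"i": 4, "j": 5, "k": 1},
    "q": {"i": 2, "j": 1, "k": 5},
    "r": {"i": 5, "j": 4, "k": 2},
    "u": {"j": 5, "k": 1},
}
pred = predict(ratings, "u", "i")
```

Only the ratings that `u` has given and the similarities between `i` and those items are needed at prediction time, which is why item-item similarities are a natural thing to precompute.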
Item-based CF Computation Illustrated
• Similarities: computing the sim. b/w all pairs of items is prohibitive!
• But do we need to?
• How efficiently can we compute the sim. of all pairs of items for which the sim. is positive?
[Figure: ratings matrix with user $u$ and item $i$ marked; X marks an observed rating.]
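One possible answer to the question above, sketched under assumptions: with cosine similarity on raw (positive) ratings, two items can have positive similarity only if some user rated both, so it suffices to walk through each user's profile and touch only the item pairs that co-occur there. The function name and toy data are hypothetical, and the slides may have a different scheme in mind.

```python
import math
from collections import defaultdict
from itertools import combinations

def cosine_for_corated_pairs(ratings):
    """Cosine similarity for exactly those item pairs rated by a common user.

    ratings maps user -> {item: rating}. The work is proportional to the sum
    over users of (profile size)^2, not to (number of items)^2.
    """
    dot = defaultdict(float)       # (item_i, item_j) -> dot product so far
    sq_norm = defaultdict(float)   # item -> squared norm of its rating vector
    for user_ratings in ratings.values():
        for i, r in user_ratings.items():
            sq_norm[i] += r * r
        # Each pair of items in this user's profile gains one product term.
        for (i, ri), (j, rj) in combinations(sorted(user_ratings.items()), 2):
            dot[(i, j)] += ri * rj
    return {(i, j): d / math.sqrt(sq_norm[i] * sq_norm[j])
            for (i, j), d in dot.items()}

# Hypothetical toy data: items "b" and "c" are never rated by the same user,
# and item "d" is never co-rated with anything.
ratings = {
    "u1": {"a": 4, "b": 5},
    "u2": {"a": 5, "c": 3},
    "u3": {"d": 2},
}
sims = cosine_for_corated_pairs(ratings)
```

Pairs that never co-occur in any profile are simply absent from the result, which corresponds to a similarity of 0 under this measure.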
Item-based CF – Recommendation Generation
[Figure: ratings matrix with item $i$ and user $u$ marked; X marks an observed rating. The items rated by $u$ are annotated "similar items?".]
How efficiently can we generate recommendations for a given user?
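One way to generate recommendations for a user, sketched under assumptions: if item-item similarities have been precomputed and stored per item, only the items similar to those the user has rated need to be scored. The `item_sims` structure, the scoring rule (similarity-weighted average of the user's ratings) and the toy values are hypothetical.

```python
from collections import defaultdict

def recommend_for_user(user_ratings, item_sims, top_k=2):
    """Score unrated items reachable from the user's rated items.

    user_ratings: {item: rating} for one user.
    item_sims:    {item: {similar_item: similarity}}, assumed precomputed.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for j, r in user_ratings.items():
        for i, s in item_sims.get(j, {}).items():
            if i in user_ratings:
                continue            # skip items the user has already rated
            num[i] += s * r
            den[i] += abs(s)
    scores = {i: num[i] / den[i] for i in num if den[i] > 0}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]

# Hypothetical precomputed similarities (only the entries needed here).
item_sims = {
    "a": {"c": 0.9, "d": 0.1},
    "b": {"c": 0.2, "d": 0.8},
}
user_ratings = {"a": 5, "b": 1}
recs = recommend_for_user(user_ratings, item_sims)
```

The cost depends on the size of the user's profile and the length of the stored similarity lists, not on the total number of items.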
Some empirical facts re. user-based vs. item-based CF
• User profiles are typically thinner than item profiles; this depends on the application domain.
 – Certainly holds for movies (Netflix).
• As users provide more ratings, user-user sim. can change more dynamically than item-item sim.
• Can we precompute item-item sim. and speed up prediction computation?
• What about refreshing sim. against updates? Can we do it incrementally? How often should we do this?
• Why not do this for user-user?
User & Item-based CF are both personalized
• Non-personalized would estimate an unknown rating as a global average.
• Every user gets the same recommendation list, modulo items s/he may have already rated.
• Personalized clearly leads to better predictions.
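For contrast, a non-personalized baseline can be sketched in a few lines. The sketch reads "global average" as each item's average rating over all users, which is an assumption; the function name and toy data are hypothetical.

```python
from collections import defaultdict

def non_personalized(ratings, already_rated=()):
    """Rank items by their average rating over all users.

    The ranking is the same for everyone; only items in already_rated
    are filtered out for a particular user.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user_ratings in ratings.values():
        for i, r in user_ratings.items():
            totals[i] += r
            counts[i] += 1
    averages = {i: totals[i] / counts[i] for i in totals}
    ranked = sorted(averages, key=averages.get, reverse=True)
    return [i for i in ranked if i not in already_rated]

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "c": 2},
    "u3": {"b": 5, "c": 1},
}
for_new_user = non_personalized(ratings)
for_u1 = non_personalized(ratings, already_rated=ratings["u1"])
```

The list for `u1` is the global list with `u1`'s rated items removed, which is the "modulo items already rated" point made above.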