04/18/13 Data Mining: Principles and Algorithms
Data Mining: Concepts and Techniques
— Chapter 11 —
Additional Theme: Collaborative Filtering & Data Mining
Jiawei Han and Micheline Kamber
Department of Computer Science
University of Illinois at Urbana-Champaign
www.cs.uiuc.edu/~hanj
Motivation
Systems in Action
A Conceptual Framework
User-User Methods
Item-Item Methods
Recent Advances and Open Problems
Motivation
User Perspective
Lots of online products: books, movies, etc.
Reduce my choices… please…
Manager Perspective
“If I have 3 million customers on the web, I should have 3 million stores on the web.”
CEO of Amazon.com [SCH01]
Example: Recommendation
Customers who bought this book also bought:
• Data Preparation for Data Mining, by Dorian Pyle
• The Elements of Statistical Learning, by T. Hastie et al.
• Data Mining: Introductory and Advanced Topics, by Margaret H. Dunham
• Mining the Web: Analysis of Hypertext and Semi-Structured Data
Example: Personalization
Other Examples
Movielens: movies
Moviecritic: movies again
My launch: music
Gustos starrater: web pages
Jester: jokes
TV Recommender: TV shows
Suggest 1.0: different products
And much more…
• Input: product taxonomy
• Output: modified taxonomy with even distribution
Adjusted Product Taxonomy (2)
[Figure: number of transactions per category using the original taxonomy vs. the adjusted taxonomy]
Latent Semantic Indexing [SAR00b]
SVD: R = U · S · I', where R is m × n, U is m × r, S is r × r (diagonal matrix of singular values), and I' is r × n.
Keeping only the k largest singular values gives Uk (m × k), Sk (k × k), and Ik' (k × n).
The reconstructed matrix Rk = Uk · Sk · Ik' is the closest rank-k matrix to the original matrix R.
• Captures latent associations
• Reduced space is less noisy
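As a rough illustration (not from the slides; the toy rating matrix is made up), the rank-k reconstruction can be computed with NumPy's SVD:

```python
import numpy as np

# Toy user-item rating matrix R (m=4 users, n=3 items); 0 means "unrated".
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0],
              [0.0, 1.0, 4.0]])

# Full SVD: R = U @ diag(s) @ Vt, with singular values s in decreasing order.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # keep only the k largest singular values
Rk = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # closest rank-k approximation to R
```

The reduced factors U[:, :k] and Vt[:k, :] place users and items in the same k-dimensional latent space, which is what LSI exploits.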
Are We Done? (2)
Q2: How to select neighbors?
We don't expect to use the same neighbors for all products
Neighbors should be product-category specific
Not adequately answered
Q2-B: How can we determine whether or not a user is relevant to a given product?
Selecting Relevant Instances [YU01]
Superman and Batman are correlated
Titanic and Batman are negatively correlated
“Dances with Wolves” has nothing to do with Batman's rating
Karen is not a good instance to consider
How can we formalize this? Mutual Information:
MI(X;Y) = H(X) - H(X|Y)
Selecting Relevant Instances (2)
Offline phase:
  Estimate mutual information between items
  For each item:
    Find users who rated it
    Compute their strength (how many relevant items they also rated)
    Retain a subset of them (10% works fine)
Online phase:
  To predict the target item's rating, run k-NN on its reduced instance space
Better results with less data… quality, not quantity, is what matters
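The mutual-information criterion MI(X;Y) = H(X) - H(X|Y) can be estimated directly from co-rating samples. A minimal sketch with made-up ratings (the item names echo the slide's movie example):

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Estimate MI(X;Y) = H(X) - H(X|Y) from paired rating samples."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), with counts scaled by n
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi

# Hypothetical ratings of three items by the same six users:
batman   = [1, 1, 2, 2, 3, 3]
superman = [1, 1, 2, 2, 3, 3]   # moves in lockstep with batman
wolves   = [1, 2, 3, 1, 2, 3]   # unrelated to batman
```

High MI(batman, superman) and low MI(batman, wolves) is exactly the signal used to drop irrelevant items (and, in turn, irrelevant users) from the instance space.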
Are We Done? (3)
Q3: How to combine?
Weighted average
Discover association rules in neighbors' transactions [LEE01, WAN04]
  For every x in this group: like(x, Item1) ^ like(x, Item2) → like(x, Item3)
  Use confidence and support to judge the quality of the prediction
  Prediction is done on the binary level (like, dislike)
  Costly to run online
User-User Methods Evaluation
Achieve good quality in practice
The more processing we push offline, the better the method scales
However:
  User preference is dynamic
  Offline-calculated models require frequent updates

Item-Item Methods
Identify buying patterns:
  Correlation Analysis
  Linear Regression
  Belief Network
  Association Rule Mining
Item-Item Similarity: The Intuition
Search for similarities among items
All computations can be done offline
Item-item similarity is more stable than user-user similarity
No need for frequent updates
First Order Models Correlation Analysis Linear Regression
Higher Order Models Belief Network Association Rule Mining
Correlation-based Methods [SAR01]
Same as in user-user similarity, but on item vectors: the Pearson correlation coefficient
Look for users who rated both items:

s_{ij} = \frac{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)^2}\,\sqrt{\sum_{u \in U_{ij}} (r_{uj} - \bar{r}_j)^2}}

where U_{ij} is the set of users who rated both items, and \bar{r}_i, \bar{r}_j are the mean ratings of items i and j.
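A minimal NumPy sketch of this item-item Pearson correlation. It assumes 0 encodes "not rated" (an encoding choice of this sketch, not stated on the slide):

```python
import numpy as np

def item_similarity(R, i, j):
    """Pearson correlation between item columns i and j, computed over
    the users who rated both items (0 = unrated in this sketch)."""
    both = (R[:, i] > 0) & (R[:, j] > 0)   # users who rated both items
    ri, rj = R[both, i], R[both, j]
    di, dj = ri - ri.mean(), rj - rj.mean()
    denom = np.sqrt((di ** 2).sum() * (dj ** 2).sum())
    return float((di * dj).sum() / denom) if denom else 0.0

# Users x items; items 0 and 1 get identical ratings, item 2 the opposite.
R = np.array([[5, 5, 1],
              [4, 4, 2],
              [1, 1, 5],
              [2, 2, 4]], dtype=float)
```

Here the item means are taken over the co-rating users; taking them over all raters of each item is an equally common variant.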
Correlation-based Methods (2)
Offline phase:
  Calculate n(n-1) similarity measures
  For each item, determine its k most similar items
Online phase:
  Predict the rating for a given user-item pair as a weighted sum over the similar items that the user rated
r_{aj} = \frac{\sum_{i \in \text{similar items}} s_{ij}\, r_{ai}}{\sum_{i \in \text{similar items}} |s_{ij}|}
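The online weighted-sum step can be sketched as follows; the similarity table is assumed to have been computed offline, and all numbers are made up:

```python
def predict(ratings_by_user, sims, j):
    """Predict a user's rating for item j as a similarity-weighted average
    over the similar items the user has rated. `sims[j]` maps a neighbor
    item id -> similarity (assumed precomputed offline)."""
    num = den = 0.0
    for i, s in sims[j].items():
        if i in ratings_by_user:          # only neighbors the user rated
            num += s * ratings_by_user[i]
            den += abs(s)
    return num / den if den else None     # None: no rated neighbors

# Hypothetical: the user rated items 0 and 1; predict item 2.
sims = {2: {0: 0.8, 1: 0.4}}
prediction = predict({0: 4, 1: 2}, sims, 2)
```

Because only the k-most-similar lists are consulted, the online cost is small and independent of the number of users.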
Regression Based Methods [VUC00]
Offline phase:
  Fit n(n-1) linear regressions
  f_ij(x) is a linear transformation of a user's rating on item i to his rating on item j
Online phase:
  Same as the previous method
  The weights are inversely proportional to the regression error rates
r_{aj} = \frac{\sum_{i \in \text{items rated by } a} w_{ij}\, f_{ij}(r_{ai})}{\sum_{i \in \text{items rated by } a} w_{ij}}
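A sketch of the offline regression fit, using a least-squares line per item pair (the rating matrix is hypothetical, and 0 again stands for "unrated"):

```python
import numpy as np

def fit_pairwise(R, i, j):
    """Fit f_ij: a least-squares line mapping a user's rating on item i
    to the same user's rating on item j (R is users x items; 0 = unrated)."""
    both = (R[:, i] > 0) & (R[:, j] > 0)   # users who rated both items
    a, b = np.polyfit(R[both, i], R[both, j], 1)  # slope, intercept
    return lambda x: a * x + b

# Hypothetical ratings where item 1 tracks item 0 as r_j = 2*r_i - 1:
R = np.array([[1.0, 1.0],
              [2.0, 3.0],
              [3.0, 5.0]])
f = fit_pairwise(R, 0, 1)
```

The residual error of each fit would supply the weight w_ij in the prediction formula above (smaller error, larger weight).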
Higher Order Models
Previous approaches used the Naïve Bayes assumption: the effects of other items on a given one are independent
Not always true
Higher order models can do better
Belief Network Association Rule Mining
Bayesian Belief Network: Introduction
Bayesian belief network allows a subset of the variables to be conditionally independent
A graphical model of causal relationships Represents dependency among the variables Gives a specification of joint probability distribution
Nodes: random variables
Links: dependency
X and Y are the parents of Z, and Y is the parent of P
No dependency between Z and P
Has no loops or cycles
[Figure: a four-node network X, Y, Z, P illustrating these dependencies]
Bayesian Belief Network: An Example

Nodes: FamilyHistory (FH), Smoker (S), LungCancer (LC), Emphysema, PositiveXRay, Dyspnea

The conditional probability table (CPT) for the variable LungCancer shows the conditional probability for each possible combination of its parents:

        (FH, S)   (FH, ~S)   (~FH, S)   (~FH, ~S)
LC       0.8       0.5        0.7        0.1
~LC      0.2       0.5        0.3        0.9
P(z_1, \ldots, z_n) = \prod_{i=1}^{n} P(z_i \mid \mathrm{Parents}(Z_i))
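Using the factorization above with the slide's CPT for LungCancer. The priors for FamilyHistory and Smoker below are illustrative placeholders, not values from the slides:

```python
# Placeholder priors for the root nodes (made up for this sketch):
P_FH = {True: 0.1, False: 0.9}
P_S  = {True: 0.3, False: 0.7}

# CPT for LungCancer given (FamilyHistory, Smoker), from the slide's table:
P_LC = {(True, True): 0.8, (True, False): 0.5,
        (False, True): 0.7, (False, False): 0.1}

def joint(fh, s, lc):
    """P(FH, S, LC) = P(FH) * P(S) * P(LC | FH, S),
    i.e. each variable conditioned only on its parents."""
    p_lc = P_LC[(fh, s)] if lc else 1 - P_LC[(fh, s)]
    return P_FH[fh] * P_S[s] * p_lc
```

Summing `joint` over all eight assignments gives 1, confirming the factorization defines a proper joint distribution over this sub-network.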
Belief Network for CF [BRE98]
Every item is a node
Binary rating (like, dislike)
Learn offline a belief network over the training data
The CPT at each node is represented as a decision tree
Use greedy algorithms to determine the best network structure
Use probabilistic inference for online prediction
Belief Network for CF: An Example
[Figure: decision tree representing the CPT for the random variable "Melrose Place" in the movie domain, with splits on "Friends" and "B.H." and probabilities at the leaves]
Association Rule Mining
Offline processing:
  Work on the binary level (like, dislike)
  View the user as a market basket containing the items liked by that user
  Discover association rules between items
Online processing:
  Match items that the active user likes against the rules' left-hand sides
  Recommend the rules' consequents based on support and confidence
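The online matching step can be sketched as follows; the rule set is assumed to have been mined offline, and the rules and item names below are made up:

```python
def recommend(liked, rules, min_conf=0.5):
    """Match the active user's liked items against rule antecedents and
    return consequents ranked by (confidence, support).
    `rules` = [(antecedent_set, consequent, support, confidence), ...]."""
    hits = [(conf, sup, cons) for ante, cons, sup, conf in rules
            if ante <= liked            # antecedent fully satisfied
            and cons not in liked       # don't recommend what's already liked
            and conf >= min_conf]
    return [cons for conf, sup, cons in sorted(hits, reverse=True)]

# Hypothetical offline-mined rules: (antecedent, consequent, support, confidence)
rules = [({"Book1", "Book2"}, "Book3", 0.10, 0.9),
         ({"Movie1"}, "Movie2", 0.20, 0.7),
         ({"Book9"}, "Book4", 0.05, 0.8)]
```

Because the rule set can be large, this linear scan is exactly the online cost the adaptive-support approach below tries to keep bounded.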
Association Rule Mining : Problems
A high support threshold leads to low coverage and may eliminate important but infrequent items from consideration
Low support thresholds result in very large model sizes, a computationally expensive offline pattern discovery phase, and a slower online matching phase
Solution: Adaptive Association Rule Mining
Adaptive Association Rule Mining [LIN01]
The knobs: minSupport and minConfidence versus the desired number of rules
Given:
  transaction dataset
  target item
  desired range for the number of rules
  specified minimum confidence
Find: a set S of association rules for the target item such that:
  the number of rules in S is in the given range
  rules in S satisfy the minimum confidence constraint
  rules in S have higher support than rules not in S that satisfy the above constraints
Adaptive Association Rule Mining (2)
Discover rules with one item in the head: Like(x, item1) ^ Like(x, item2) → Like(x, target)
The miner discovers association rules iteratively (for each target item) until the desired number of rules is extracted
Support is adjusted per item
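A simplified sketch of the adaptive loop. The miner itself is stubbed out; real adaptive-support mining as in [LIN01] tunes the threshold inside the mining algorithm rather than around it:

```python
def adaptive_mine(mine_rules, lo, hi, min_conf, support=0.5, decay=0.9, floor=1e-3):
    """Relax minSupport until the rule count for a target item reaches
    at least `lo`, then cap at `hi`. `mine_rules(support, min_conf)` is
    an assumed stand-in for an offline rule miner."""
    rules = mine_rules(support, min_conf)
    while len(rules) < lo and support > floor:
        support *= decay                    # too few rules: lower the threshold
        rules = mine_rules(support, min_conf)
    return rules[:hi]                       # never exceed the upper bound

# Toy miner: the number of discovered rules grows as the threshold drops.
fake_miner = lambda s, c: [("rule", k) for k in range(int(1 / s))]
result = adaptive_mine(fake_miner, 3, 5, 0.6)
```

The geometric decay is one simple schedule; any strategy that converges on a support value yielding a rule count inside [lo, hi] fits the problem statement above.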
Item-Item Methods: Why Do They Work?
Like(x, Book1) ^ like(x, Book2) → like(x, Book3): high support among the “book gang”
Like(x, Movie1) → like(x, Movie2): high support among the “movie gang”
We use the right neighbors for each item
Without discovering the groups themselves, thus eliminating costly online matching
In general better quality than user-user methods, and better response time [LIN03]
Recent Work and Open Problems
Order-based methods
  Ordering items is more informative than rating them
  [KAM03] developed k-o'means to work on orders
Preference-based methods
  Total ordering of items is not feasible
  Work on partial orders (preferences) [COH99]
Integrating background knowledge
  User demographic information, item features, etc.
Modeling time
  Sequential patterns
References (1)
Charu C. Aggarwal, Joel L. Wolf, Kun-Lung Wu, Philip S. Yu: Horting Hatches an Egg: A New Graph-Theoretic Approach to Collaborative Filtering. KDD 1999: 201-212
J. Breese, D. Heckerman, C. Kadie: Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proc. 14th Conf. on Uncertainty in Artificial Intelligence, Madison, July 1998
Yoon Ho Cho and Jae Kyeong Kim: Application of Web usage mining and product taxonomy to collaborative recommendations in e-commerce. Expert Systems with Applications, 26(2), 2003
William W. Cohen, Robert E. Schapire, and Yoram Singer: Learning to order things. In Advances in Neural Information Processing Systems 10, Denver, CO, 1997
Jiawei Han, Fall 2003 online course notes, available at: http://www-courses.cs.uiuc.edu/~cs397han/slides/05.ppt
Toshihiro Kamishima: Nantonac collaborative filtering: recommendation based on order responses. KDD 2003: 583-588
Lee, C.-H., Kim, Y.-H., Rhee, P.-K.: Web personalization expert with combining collaborative filtering and association rule mining technique. Expert Systems with Applications, 21(3), October 2001, pp. 131-137
References (2)
W. Lin, 2001P, online presentation available at: http://www.wiwi.hu-
Weiyang Lin, Sergio A. Alvarez, and Carolina Ruiz: Efficient adaptive-support association rule mining for recommender systems. Data Mining and Knowledge Discovery, 6:83-105, 2002
G. Linden, B. Smith, and J. York: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing, 7(1), pp. 76-80, Jan. 2003
Badrul M. Sarwar, George Karypis, Joseph A. Konstan, John Riedl: Analysis of recommendation algorithms for e-commerce. ACM Conf. on Electronic Commerce 2000: 158-167
B. Sarwar, G. Karypis, J. Konstan, and J. Riedl: Application of dimensionality reduction in recommender systems--a case study. In ACM WebKDD 2000 Web Mining for E-Commerce Workshop, 2000
B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl: Item-based collaborative filtering recommendation algorithms. WWW'01
References (3)
B. Sarwar, 2000P, online presentation available at: http://www.wiwi.hu-berlin.de/~myra/WEBKDD2000/WEBKDD2000_ARCHIVE/badrul.ppt
J. Ben Schafer, Joseph A. Konstan, John Riedl: E-Commerce Recommendation Applications. Data Mining and Knowledge Discovery, 5(1/2): 115-153, 2001
L. H. Ungar and D. P. Foster: Clustering Methods for Collaborative Filtering. AAAI Workshop on Recommendation Systems, 1998
Yi-Fan Wang, Yu-Liang Chuang, Mei-Hua Hsu and Huan-Chao Keh: A personalized recommender system for the cosmetic business. Expert Systems with Applications, 26(3), April 2004, pp. 427-434
S. Vucetic and Z. Obradovic: A regression-based approach for scaling-up personalized recommender systems in e-commerce. In ACM WebKDD 2000 Web Mining for E-Commerce Workshop, 2000
Kai Yu, Xiaowei Xu, Martin Ester, and Hans-Peter Kriegel: Selecting relevant instances for efficient and accurate collaborative filtering. In Proc. 10th CIKM, pp. 239-246, ACM Press, 2001
Cheng Zhai, Spring 2003 online course notes, available at: http://sifaka.cs.uiuc.edu/course/2003-497CXZ/loc/cf.ppt