Data Mining Techniques Association Rule
Data Mining Techniques Association Rule. What Is Association Mining? Association Rule Mining – Finding frequent patterns, associations, correlations,

Dec 14, 2015

Ernest Haswell

What Is Association Mining?

• Association Rule Mining
– Finding frequent patterns, associations, correlations, or causal structures among item sets in transaction databases, relational databases, and other information repositories

• Applications
– Market basket analysis (marketing strategy: items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.

• Examples
– Rule form: Body ⇒ Head [Support, Confidence]
– buys(x, “Computer”) ⇒ buys(x, “Software”) [2%, 60%]
– major(x, “CS”) ^ takes(x, “DB”) ⇒ grade(x, “A”) [1%, 75%]

Market Basket Analysis

Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold.

Rule Measures: Support and Confidence

• Let the minimum support be 50% and the minimum confidence be 50%; then we have
– A ⇒ C [50%, 66.6%]
– C ⇒ A [50%, 100%]

Transaction ID    Items Bought
1000              A, B, C
2000              A, C
3000              A, D
4000              B, E, F
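
The two figures attached to each rule can be checked directly against the transaction table. The following is a minimal sketch, not part of the original slides; the helper names `support` and `confidence` are chosen here for illustration.

```python
# Transactions copied from the table above.
transactions = {
    1000: {"A", "B", "C"},
    2000: {"A", "C"},
    3000: {"A", "D"},
    4000: {"B", "E", "F"},
}

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= items for items in db.values()) / len(db)

def confidence(body, head, db):
    """Support of body and head together, divided by the support of the body."""
    return support(set(body) | set(head), db) / support(body, db)

print(support({"A", "C"}, transactions))       # 0.5
print(confidence({"A"}, {"C"}, transactions))  # about 0.667
print(confidence({"C"}, {"A"}, transactions))  # 1.0
```

Both rules share the same support because support depends only on the combined itemset {A, C}; the confidences differ because the bodies {A} and {C} occur in different numbers of transactions.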

Support & Confidence

Association Rule: Basic Concepts

• Given
– (1) a database of transactions
– (2) each transaction is a list of items (purchased by a customer in a visit)

• Find all rules that correlate the presence of one set of items with that of another set of items

• Find all the rules A ⇒ B with minimum confidence and support
– support, s: P(A ∪ B), the probability that a transaction contains all the items of A and of B
– confidence, c: P(B|A), the probability that a transaction containing A also contains B
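
Written out as counts over the database, the two measures take the following form. This is a sketch under assumed notation: D for the set of transactions and T for a single transaction are symbols introduced here, not taken from the slides.

```latex
s(A \Rightarrow B) = \frac{\left|\{\, T \in D : A \cup B \subseteq T \,\}\right|}{|D|},
\qquad
c(A \Rightarrow B) = \frac{\left|\{\, T \in D : A \cup B \subseteq T \,\}\right|}{\left|\{\, T \in D : A \subseteq T \,\}\right|}
```

Dividing numerator and denominator of the confidence by |D| gives c(A ⇒ B) = s(A ⇒ B) / s(A), where s(A) is the fraction of transactions containing A.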

Terminologies

• Item
– I1, I2, I3, …
– A, B, C, …

• Itemset
– {I1}, {I1, I7}, {I2, I3, I5}, …
– {A}, {A, G}, {B, C, E}, …

• 1-Itemset
– {I1}, {I2}, {A}, …

• 2-Itemset
– {I1, I7}, {I3, I5}, {A, G}, …

Terminologies

• K-Itemset
– An itemset whose length is K

• Frequent (Large) K-Itemset
– An itemset whose length is K and which satisfies a minimum support threshold

• Association Rule
– A rule that satisfies both a minimum support threshold and a minimum confidence threshold

Analysis

• The number of itemsets of a given cardinality tends to grow exponentially
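
To get a feel for the scale, the number of possible itemsets can be counted directly. This is a small sketch with hypothetical helper names; it only counts candidate itemsets and does not depend on any transaction data.

```python
from math import comb

def count_k_itemsets(n_items, k):
    """Number of distinct k-itemsets that can be formed from n_items items."""
    return comb(n_items, k)

def count_all_itemsets(n_items):
    """Number of non-empty itemsets over n_items items (every subset except the empty one)."""
    return 2 ** n_items - 1

for n in (10, 20, 40):
    print(n, count_k_itemsets(n, 3), count_all_itemsets(n))
```

Each additional item doubles the total number of possible itemsets, which is why checking every itemset against the database quickly becomes impractical.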

Fast Algorithms for Mining Association Rules

Mining Association Rules: Apriori Principle

Min. support 50%, min. confidence 50%

Transaction ID    Items Bought
1000              A, B, C
2000              A, C
3000              A, D
4000              B, E, F

Frequent Itemset    Support
{A}                 75%
{B}                 50%
{C}                 50%
{A, C}              50%

• For rule A ⇒ C:
– support = support({A, C}) = 50%
– confidence = support({A, C}) / support({A}) = 66.6%

• The Apriori principle:
– Any subset of a frequent itemset must be frequent

Mining Frequent Itemsets: the Key Step

• Find the frequent itemsets: the sets of items that have minimum support
– A subset of a frequent itemset must also be a frequent itemset
  • i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets
– Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)

• Use the frequent itemsets to generate association rules
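
The second step, turning frequent itemsets into rules, can be sketched as follows. The supports are the ones listed for the four-transaction example (minimum support 50%); the function name `generate_rules` and the tuple layout of its result are assumptions made for this illustration.

```python
from itertools import combinations

# Frequent itemsets and their supports from the four-transaction example.
frequent = {
    frozenset("A"): 0.75,
    frozenset("B"): 0.50,
    frozenset("C"): 0.50,
    frozenset("AC"): 0.50,
}

def generate_rules(frequent, min_conf):
    """Return (body, head, support, confidence) for every rule meeting min_conf."""
    rules = []
    for itemset, sup in frequent.items():
        # Try every non-empty proper subset of the itemset as the rule body.
        for size in range(1, len(itemset)):
            for body in combinations(sorted(itemset), size):
                body = frozenset(body)
                head = itemset - body
                # The body is itself frequent (Apriori principle), so its support is known.
                conf = sup / frequent[body]
                if conf >= min_conf:
                    rules.append((set(body), set(head), sup, conf))
    return rules

for body, head, sup, conf in generate_rules(frequent, min_conf=0.5):
    print(body, "=>", head, f"[{sup:.0%}, {conf:.1%}]")
```

No further database scan is needed at this stage: every confidence is a ratio of two supports that were already computed while finding the frequent itemsets.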

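Example

Database D (one transaction per line)
1 3 4
2 3 5
1 2 3 5
2 5

Scan D, count C1:
C1      count
{1}     2
{2}     3
{3}     3
{4}     1
{5}     3

Generate L1: {1}, {2}, {3}, {5}

Generate C2: {1 2}, {1 3}, {1 5}, {2 3}, {2 5}, {3 5}

Scan D, count C2:
C2       count
{1 2}    1
{1 3}    2
{1 5}    1
{2 3}    2
{2 5}    3
{3 5}    2

Generate L2: {1 3}, {2 3}, {2 5}, {3 5}

Generate C3: {2 3 5}

Scan D, count C3:
C3         count
{2 3 5}    2

Generate L3: {2 3 5}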
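
The level-wise loop traced in this example (scan D, count the candidates, keep the frequent ones, generate the next candidates) can be sketched in a few lines. This is an illustrative sketch rather than the algorithm as stated on the slides; the minimum count of 2 is an assumption inferred from which itemsets survive in the example, and the names `count` and `apriori` are hypothetical.

```python
from itertools import combinations

# Database D from the example; each transaction is a set of item ids.
D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
MIN_COUNT = 2  # assumed threshold, inferred from the example's L1, L2 and L3

def count(candidates, db):
    """One scan of the database: how many transactions contain each candidate."""
    return {c: sum(c <= t for t in db) for c in candidates}

def apriori(db, min_count):
    """Return [L1, L2, ...], each a dict mapping a frequent itemset to its count."""
    items = sorted({i for t in db for i in t})
    candidates = [frozenset([i]) for i in items]  # C1
    levels = []
    k = 1
    while candidates:
        counts = count(candidates, db)
        frequent = {c: n for c, n in counts.items() if n >= min_count}  # Lk
        if not frequent:
            break
        levels.append(frequent)
        k += 1
        # Candidate generation: unions of two frequent itemsets that have size k ...
        unions = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # ... pruned so that every (k-1)-subset is itself frequent.
        candidates = [c for c in unions
                      if all(frozenset(s) in frequent for s in combinations(c, k - 1))]
    return levels

for level, frequent in enumerate(apriori(D, MIN_COUNT), start=1):
    print(f"L{level}:", sorted(sorted(s) for s in frequent))
```

The pruning step is what keeps C3 down to the single candidate {2 3 5}: {1 2 3} and {1 3 5} are dropped before any scan because {1 2} and {1 5} are not in L2.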

Example of Generating Candidates

• L3={abc, abd, acd, ace, bcd}

• Self-joining: L3*L3

– abcd from abc and abd

– acde from acd and ace

• Pruning:

– acde is removed because ade is not in L3

• C4={abcd}
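
The join and prune steps above can be reproduced with a short function. This is a sketch under one assumption made for brevity: each itemset is written as a string of single-letter items in sorted order, as on the slide. The name `apriori_gen` is used here for illustration.

```python
from itertools import combinations

def apriori_gen(prev_frequent):
    """Build the candidate (k+1)-itemsets from the frequent k-itemsets."""
    prev = sorted(prev_frequent)
    prev_set = set(prev)
    k = len(prev[0])
    # Self-join: merge two itemsets that agree on their first k-1 items.
    joined = [a + b[-1] for a in prev for b in prev
              if a[:-1] == b[:-1] and a[-1] < b[-1]]
    # Prune: drop any candidate that has a k-subset missing from the frequent k-itemsets.
    return [c for c in joined
            if all("".join(s) in prev_set for s in combinations(c, k))]

L3 = ["abc", "abd", "acd", "ace", "bcd"]
print(apriori_gen(L3))  # ['abcd']
```

The join produces abcd and acde; acde is then discarded because its subsets ade and cde do not appear in L3.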

Example

Apriori Algorithm

Exercise 4

min-sup = 20%, min-conf = 80%

Demo: IBM Intelligent Miner

Demo Database

Multi-Dimensional Association

• Single-Dimensional (Intra-Dimension) Rules: a single dimension (predicate) with multiple occurrences
buys(X, “milk”) ⇒ buys(X, “bread”)

• Multi-Dimensional Rules: two or more dimensions (predicates)
– Inter-dimension association rules (no repeated predicates)
age(X, “19-25”) ^ occupation(X, “student”) ⇒ buys(X, “coke”)
– Hybrid-dimension association rules (repeated predicates)
age(X, “19-25”) ^ buys(X, “popcorn”) ⇒ buys(X, “coke”)

• Categorical (Nominal) Attributes
– finite number of possible values, no ordering among values

• Quantitative Attributes
– numeric, implicit ordering among values
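
One common way to mine multi-dimensional rules with single-dimensional machinery is to treat each predicate/value pair as an item. The sketch below assumes that encoding. The records are hypothetical and invented purely for illustration, the age values are assumed to be already grouped into ranges, and `to_transaction` is a name chosen here.

```python
# Hypothetical customer records, invented for illustration only.
records = [
    {"age": "19-25", "occupation": "student", "buys": ["coke", "popcorn"]},
    {"age": "26-35", "occupation": "engineer", "buys": ["milk", "bread"]},
]

def to_transaction(record):
    """Encode one record as a set of 'predicate=value' items."""
    items = set()
    for predicate, value in record.items():
        # A repeated predicate such as buys contributes one item per value.
        values = value if isinstance(value, list) else [value]
        items.update(f"{predicate}={v}" for v in values)
    return items

for record in records:
    print(sorted(to_transaction(record)))
```

After this encoding, a hybrid-dimension rule is simply a rule whose itemset contains more than one `buys=...` item alongside items from other predicates.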

Exercise 5

min-sup = 20%, min-conf = 80%

Research Topics

• Quantitative Association Rules
– buys(bread, 5) ⇒ buys(milk, 3)
• Weighted Association Rules
• High Utility Association Rules
• Non-redundant Association Rules
• Constrained Association Rule Mining
• Multi-dimensional Association Rules
• Generalized Association Rules
• Negative Association Rules
• Incremental Mining of Association Rules
• Data Stream Association Rule Mining
• Interactive Mining of Association Rules