Rule Mining

Apr 03, 2018

    Association Rule Mining


The Task: Two Ways of Defining It

General
Input: a collection of instances
Output: rules to predict the values of any attribute(s) (not just the class attribute) from the values of other attributes
E.g. if temperature = cool then humidity = normal
If the right-hand side of a rule has only the class attribute, then the rule is a classification rule
Distinction: classification rules are applied together, as sets of rules

Specific: market-basket analysis
Input: a collection of transactions
Output: rules to predict the occurrence of any item(s) from the occurrence of other items in a transaction
E.g. {Milk, Diaper} -> {Beer}

    General rule structure: Antecedents -> Consequents


    Association Rule Mining

Given a set of transactions, find rules that will predict the occurrence of an item based on the occurrences of other items in the transaction.

    Market-Basket transactions

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

    Example of Association Rules

{Diaper} -> {Beer}
{Milk, Bread} -> {Eggs, Coke}
{Beer, Bread} -> {Milk}

Implication means co-occurrence, not causality!


    Definition: Frequent Itemset

Itemset
A collection of one or more items. Example: {Milk, Bread, Diaper}

k-itemset
An itemset that contains k items

Support count (σ)
Frequency of occurrence of an itemset
E.g. σ({Milk, Bread, Diaper}) = 2

Support (s)
Fraction of transactions that contain an itemset
E.g. s({Milk, Bread, Diaper}) = 2/5

Frequent Itemset
An itemset whose support is greater than or equal to a minsup threshold

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke


    Definition: Association Rule

Association Rule
An implication expression of the form X -> Y, where X and Y are itemsets.
Example: {Milk, Diaper} -> {Beer}

Rule Evaluation Metrics

Support (s)
Fraction of transactions that contain both X and Y:
s = σ({Milk, Diaper, Beer}) / |T| = 2/5 = 0.4

Confidence (c)
Measures how often items in Y appear in transactions that contain X:
c = σ({Milk, Diaper, Beer}) / σ({Milk, Diaper}) = 2/3 ≈ 0.67

    TID Items

    1 Bread, Milk

    2 Bread, Diaper, Beer, Eggs

    3 Milk, Diaper, Beer, Coke

    4 Bread, Milk, Diaper, Beer

    5 Bread, Milk, Diaper, Coke
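
To make the two metrics concrete, here is a minimal Python sketch (mine, not from the slides; the helper names are illustrative) that recomputes them for {Milk, Diaper} -> {Beer} over the five example transactions:

```python
# Minimal sketch: support and confidence of a rule X -> Y over the
# example market-basket transactions.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """sigma(itemset): number of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(X, Y, transactions):
    """s(X -> Y): fraction of transactions containing both X and Y."""
    return support_count(X | Y, transactions) / len(transactions)

def confidence(X, Y, transactions):
    """c(X -> Y): sigma(X union Y) / sigma(X)."""
    return support_count(X | Y, transactions) / support_count(X, transactions)

X, Y = {"Milk", "Diaper"}, {"Beer"}
print(support(X, Y, transactions))     # 0.4
print(confidence(X, Y, transactions))  # 0.666...
```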


    Association Rule Mining Task

Given a set of transactions T, the goal of association rule mining is to find all rules having
support ≥ minsup threshold
confidence ≥ minconf threshold

    Brute-force approach:

    List all possible association rules

    Compute the support and confidence for each rule

Prune rules that fail the minsup and minconf thresholds

    Computationally prohibitive!
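
To quantify "prohibitive": each of the d items can appear in the antecedent, in the consequent, or in neither, which gives 3^d assignments; subtracting the assignments with an empty antecedent or empty consequent yields R = 3^d - 2^(d+1) + 1 possible rules. A two-line check of this standard count (not stated on these slides):

```python
# Total possible association rules over d distinct items.
def total_rules(d):
    return 3**d - 2**(d + 1) + 1

print(total_rules(6))  # 602 candidate rules for just 6 items
```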


    Mining Association Rules

Example of Rules:
{Milk, Diaper} -> {Beer} (s=0.4, c=0.67)
{Milk, Beer} -> {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} -> {Milk} (s=0.4, c=0.67)
{Beer} -> {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} -> {Milk, Beer} (s=0.4, c=0.5)
{Milk} -> {Diaper, Beer} (s=0.4, c=0.5)

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

    Observations:

All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}

Rules originating from the same itemset have identical support but can have different confidence

Thus, we may decouple the support and confidence requirements


    Mining Association Rules

Two-step approach:

1. Frequent Itemset Generation
Generate all itemsets whose support ≥ minsup

2. Rule Generation
Generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset

Frequent itemset generation is still computationally expensive.


Power Set

Given a set S, the power set P is the set of all subsets of S.
Known property of power sets: if S has n elements, P has N = 2^n elements.

Examples:
For S = {}, P = {{}}, N = 2^0 = 1
For S = {Milk}, P = {{}, {Milk}}, N = 2^1 = 2
For S = {Milk, Diaper}, P = {{}, {Milk}, {Diaper}, {Milk, Diaper}}, N = 2^2 = 4
For S = {Milk, Diaper, Beer}, P = {{}, {Milk}, {Diaper}, {Beer}, {Milk, Diaper}, {Diaper, Beer}, {Beer, Milk}, {Milk, Diaper, Beer}}, N = 2^3 = 8
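
A one-function Python sketch (illustrative, standard library only) that enumerates a power set; each subset is a candidate itemset for the brute-force approach on the next slide:

```python
from itertools import chain, combinations

def power_set(s):
    """All 2^n subsets of s, including the empty set."""
    s = list(s)
    return [set(c) for c in chain.from_iterable(
        combinations(s, k) for k in range(len(s) + 1))]

print(power_set({"Milk", "Diaper", "Beer"}))  # 2^3 = 8 subsets
```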



Brute-Force Approach to Frequent Itemset Generation

For an itemset with 3 elements, we have 8 subsets. Each subset is a candidate frequent itemset which needs to be matched against each transaction.

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

1-itemsets:
Itemset   Count
{Milk}    4
{Diaper}  4
{Beer}    3

2-itemsets:
Itemset         Count
{Milk, Diaper}  3
{Diaper, Beer}  3
{Beer, Milk}    2

3-itemsets:
Itemset               Count
{Milk, Diaper, Beer}  2

Important observation: counts of subsets can't be smaller than the count of an itemset!
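
A self-contained sketch of this brute-force matching (illustrative names, standard library only): every non-empty subset is counted against every transaction.

```python
from itertools import chain, combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
items = {"Milk", "Diaper", "Beer"}

# Every non-empty subset of the itemset is a candidate that must be
# matched against each transaction.
candidates = chain.from_iterable(
    combinations(sorted(items), k) for k in range(1, len(items) + 1))
for c in candidates:
    count = sum(1 for t in transactions if set(c) <= t)
    print(c, count)
```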


    Reducing Number of Candidates

Apriori principle:
If an itemset is frequent, then all of its subsets must also be frequent.

The Apriori principle holds due to the following property of the support measure:
Support of an itemset never exceeds the support of its subsets.
This is known as the anti-monotone property of support:

∀X, Y : (X ⊆ Y) ⇒ s(X) ≥ s(Y)


    Illustrating Apriori Principle

Minimum Support = 3

Items (1-itemsets):
Item    Count
Bread   4
Coke    2
Milk    4
Beer    3
Diaper  4
Eggs    1

Pairs (2-itemsets):
(No need to generate candidates involving Coke or Eggs)
Itemset          Count
{Bread, Milk}    3
{Bread, Beer}    2
{Bread, Diaper}  3
{Milk, Beer}     2
{Milk, Diaper}   3
{Beer, Diaper}   3

Triplets (3-itemsets):
Write all possible 3-itemsets and prune the list based on infrequent 2-itemsets.
Itemset                Count
{Bread, Milk, Diaper}  3

If every subset is considered: 6C1 + 6C2 + 6C3 = 41 candidates
With support-based pruning: 6 + 6 + 1 = 13


    Apriori Algorithm

Method:

Let k = 1
Generate frequent itemsets of length 1
Repeat until no new frequent itemsets are identified:
  Generate length-(k+1) candidate itemsets from length-k frequent itemsets
  Prune candidate itemsets containing subsets of length k that are infrequent
  Count the support of each candidate by scanning the DB
  Eliminate candidates that are infrequent, leaving only those that are frequent

Note: this algorithm makes several passes over the transaction list.
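
The following is a compact, illustrative Python implementation of this method, not the original authors' code; it assumes a minimum support count rather than a fraction:

```python
from itertools import combinations

def apriori(transactions, minsup_count):
    """Return all frequent itemsets (frozensets) with their support counts."""
    # k = 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= minsup_count}
    result = dict(frequent)
    k = 1
    while frequent:
        # Generate length-(k+1) candidates by joining length-k frequent itemsets.
        prev = list(frequent)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune candidates that contain an infrequent length-k subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, k))}
        # Count support with one pass over the transaction list per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= minsup_count}
        result.update(frequent)
        k += 1
    return result

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]
for itemset, count in sorted(apriori(transactions, 3).items(),
                             key=lambda kv: len(kv[0])):
    print(sorted(itemset), count)
```

With minsup count 3 on the example data this reproduces the previous slide: four frequent items, four frequent pairs, and the single frequent triplet {Bread, Milk, Diaper}.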


    Rule Generation

Given a frequent itemset L, find all non-empty proper subsets f ⊂ L such that f -> L − f satisfies the minimum confidence requirement.

If {A,B,C,D} is a frequent itemset, the candidate rules are:
ABC -> D, ABD -> C, ACD -> B, BCD -> A,
A -> BCD, B -> ACD, C -> ABD, D -> ABC,
AB -> CD, AC -> BD, AD -> BC, BC -> AD,
BD -> AC, CD -> AB

If |L| = k, then there are 2^k − 2 candidate association rules (ignoring L -> {} and {} -> L).

Because rules are generated from frequent itemsets, they automatically satisfy the minimum support threshold. Rule generation therefore only needs to ensure that rules satisfy the minimum confidence threshold.
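
An illustrative sketch of this enumeration (my names; support counts are recomputed here for clarity, whereas a real miner would reuse the counts from the frequent-itemset phase):

```python
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset):
    """Support count of an itemset."""
    return sum(1 for t in transactions if itemset <= t)

def rules(L, minconf):
    """Yield all 2^k - 2 candidate rules f -> L - f meeting minconf."""
    L = frozenset(L)
    for r in range(1, len(L)):
        for antecedent in map(frozenset, combinations(L, r)):
            conf = sigma(L) / sigma(antecedent)
            if conf >= minconf:
                yield sorted(antecedent), sorted(L - antecedent), conf

for lhs, rhs, conf in rules({"Bread", "Milk", "Diaper"}, minconf=0.7):
    print(lhs, "->", rhs, round(conf, 2))
```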


    Rule Generation

    How to efficiently generate rules from frequent itemsets?

In general, confidence does not have an anti-monotone property:
c(ABC -> D) can be larger or smaller than c(AB -> D)

But the confidence of rules generated from the same itemset does have an anti-monotone property.
E.g., for L = {A,B,C,D}:
c(ABC -> D) ≥ c(AB -> CD) ≥ c(A -> BCD)

Confidence is anti-monotone w.r.t. the number of items on the RHS of the rule.

Example: consider the following two rules:
{Milk} -> {Diaper, Beer} has c_m = 0.5
{Milk, Diaper} -> {Beer} has c_md = 0.67 > c_m


    Computing Confidence for Rules

Unlike computing support, computing confidence does not require several passes over the transaction list. Supports computed during frequent itemset generation can be reused.

The tables below show support values for all the (non-null) subsets of the itemset {Bread, Milk, Diaper}. Confidence values are shown for the (2^3 − 2) = 6 rules generated from this frequent itemset.

Itemset   Support
{Milk}    4/5 = 0.8
{Diaper}  4/5 = 0.8
{Bread}   4/5 = 0.8

Itemset          Support
{Milk, Diaper}   3/5 = 0.6
{Diaper, Bread}  3/5 = 0.6
{Bread, Milk}    3/5 = 0.6

Itemset                Support
{Bread, Milk, Diaper}  3/5 = 0.6

Example of Rules:
{Bread, Milk} -> {Diaper} (s=0.6, c=1.0)
{Milk, Diaper} -> {Bread} (s=0.6, c=1.0)
{Diaper, Bread} -> {Milk} (s=0.6, c=1.0)
{Bread} -> {Milk, Diaper} (s=0.6, c=0.75)
{Diaper} -> {Milk, Bread} (s=0.6, c=0.75)
{Milk} -> {Diaper, Bread} (s=0.6, c=0.75)


Rule Generation in the Apriori Algorithm

For each frequent k-itemset, where k > 2:
Generate high-confidence rules with one item in the consequent.
Using these rules, iteratively generate high-confidence rules with more than one item in the consequent.
If any rule has low confidence, then all the other rules containing its consequent can be pruned (not generated).

Example for {Bread, Milk, Diaper}:

1-item rules:
{Bread, Milk} -> {Diaper}
{Milk, Diaper} -> {Bread}
{Diaper, Bread} -> {Milk}

2-item rules:
{Bread} -> {Milk, Diaper}
{Diaper} -> {Milk, Bread}
{Milk} -> {Diaper, Bread}


    Evaluation

Support and confidence, as used by Apriori, allow a lot of rules which are not necessarily interesting.

Two options to extract interesting rules:
Using subjective knowledge
Using objective measures (measures better than confidence)

Subjective approaches:
Visualization: users are allowed to interactively verify the discovered rules
Template-based approach: filter out rules that do not fit user-specified templates
Subjective interestingness measure: filter out rules that are obvious (bread -> butter) and rules that are non-actionable (do not lead to profits)


    Drawback of Confidence

          Coffee   No Coffee   Total
Tea       15       5           20
No Tea    75       5           80
Total     90       10          100

Association Rule: Tea -> Coffee

Confidence = P(Coffee | Tea) = 15/20 = 0.75
but P(Coffee) = 0.9, and P(Coffee | No Tea) = 75/80 = 0.9375

Although its confidence is high, the rule is misleading.
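
In lift terms (introduced on the next slide): lift(Tea -> Coffee) = 0.75 / 0.9 ≈ 0.83 < 1, so buying tea is associated with a lower chance of buying coffee than independence would predict.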


    Objective Measures

Confidence estimates rule quality in terms of the antecedent support but not the consequent support. As seen on the previous slide, the support for the consequent (P(Coffee)) is higher than the rule confidence (P(Coffee | Tea)).

Weka uses other objective measures:
Lift(A -> B) = confidence(A -> B) / support(B) = support(A -> B) / (support(A) * support(B))
Leverage(A -> B) = support(A -> B) - support(A) * support(B)
Conviction(A -> B) = support(A) * support(not B) / support(A, not B)
Conviction inverts the lift ratio and also computes the support for the RHS not being true.
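
As a closing sketch (illustrative, not Weka's implementation), the three measures for Tea -> Coffee, with supports read off the contingency table on the previous slide:

```python
# Lift, leverage, and conviction for Tea -> Coffee.
s_A = 20 / 100        # support(Tea)
s_B = 90 / 100        # support(Coffee)
s_AB = 15 / 100       # support(Tea and Coffee)
s_A_notB = 5 / 100    # support(Tea and not Coffee)

confidence = s_AB / s_A                  # 0.75
lift = s_AB / (s_A * s_B)                # ~0.83: < 1 flags negative association
leverage = s_AB - s_A * s_B              # -0.03: below independence
conviction = s_A * (1 - s_B) / s_A_notB  # 0.4: < 1 also flags a poor rule
print(confidence, lift, leverage, conviction)
```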