8/13/2019 Association Rule Mining Example
1)
a) Define the Apriori principle and briefly discuss why it is useful in
association rule mining.
The Apriori principle states:
If an item set is frequent, then all of its subsets must also be frequent; equivalently, if an
item set is infrequent, then all of its supersets must also be infrequent.
The Apriori principle reduces the number of candidate item sets in association rule mining
by pruning every candidate that has an infrequent subset, without counting its support.
As a result, far fewer candidate item sets remain for support counting, which reduces
computation, I/O cost and memory requirements. The principle holds because the support of
an item set never exceeds the support of any of its subsets. This is known as the
anti-monotone property of support. Using the Apriori principle avoids wasting effort on
counting item sets that are already known to be infrequent. For these reasons the Apriori
principle is very useful in association rule mining.
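The anti-monotone property can be checked directly on data. The following is a minimal sketch, assuming the five transactions from question 2 of this document; the helper names `support_count` and `violates_anti_monotone` are illustrative and not part of the original solution.

```python
from itertools import combinations

# Transactions T1..T5 from question 2 of this document.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= t)


def violates_anti_monotone(itemset, transactions):
    """True if some non-empty proper subset has LOWER support than itemset."""
    whole = support_count(itemset, transactions)
    items = sorted(itemset)
    for size in range(1, len(items)):
        for subset in combinations(items, size):
            if support_count(subset, transactions) < whole:
                return True
    return False


# {B,E} is infrequent, so its superset {A,B,E} cannot be frequent either.
print(support_count({"B", "E"}, transactions))
print(support_count({"A", "B", "E"}, transactions))
print(violates_anti_monotone({"A", "B", "C", "D"}, transactions))
```

Because a superset can never have higher support than any of its subsets, an infrequent subset is enough to discard a candidate without scanning the database.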
b) Compare and contrast FP-Growth algorithm with Apriori algorithm.
The main difference between the Apriori algorithm and the FP-Growth algorithm is:
The Apriori algorithm generates candidate item sets and tests whether they are frequent,
whereas the FP-Growth algorithm discovers frequent item sets without candidate item set
generation.
Further differences between the Apriori algorithm and the FP-Growth algorithm are
listed below.
Apriori algorithm | FP-Growth algorithm
Uses the Apriori property with join and prune steps | Constructs a conditional frequent pattern tree and a conditional pattern base from the database items that satisfy minimum support
Requires large memory space because a large number of candidates are generated | Requires less memory because of its compact structure and the absence of candidate generation
Scans the database multiple times to generate candidate sets | Scans the database only twice
Execution time is higher than FP-Growth because time is spent producing candidates at every level | Execution time is lower than Apriori
2) Consider the market basket transactions given in the following table. Let
min_sup = 40% and min_conf = 40%.
Transaction ID Items Bought
T1 A,B,C
T2 A,B,C,D,E
T3 A,C,D
T4 A,C,D,E
T5 A,B,C,D
a) Find all the frequent item sets using Apriori algorithm.
min_sup = 40% and min_conf = 40%
Minimum support count = 40% of 5 transactions = 2
Transaction ID Items Bought
T1 A,B,C
T2 A,B,C,D,E
T3 A,C,D
T4 A,C,D,E
T5 A,B,C,D
Item Number of Transactions
A 5
B 3
C 5
D 4
E 2
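The single-item counts above correspond to the first database pass. A small sketch of that pass, assuming the transactions are held as Python lists and the threshold is the support count of 2 derived above (the variable names are illustrative):

```python
from collections import Counter

# Transactions T1..T5 from the table above.
transactions = [
    ["A", "B", "C"],
    ["A", "B", "C", "D", "E"],
    ["A", "C", "D"],
    ["A", "C", "D", "E"],
    ["A", "B", "C", "D"],
]
min_count = 2  # 40% of 5 transactions

# One pass over the data: count how many transactions contain each item.
item_counts = Counter(item for t in transactions for item in t)

# Keep only the items that reach the minimum support count.
frequent_items = {item: n for item, n in item_counts.items() if n >= min_count}

for item in sorted(frequent_items):
    print(item, frequent_items[item])
```

All five items reach the threshold here, so none is dropped before the pairs are formed.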
Item Pairs Number of Transactions
A,B 3
A,C 5
A,D 4
A,E 2
B,C 3
B,D 2
B,E 1
C,D 4
C,E 2
D,E 2
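{B,E} occurs in only 1 transaction, below the minimum support count of 2, so it is removed. The frequent pairs are:
Item Pairs Number of Transactions
A,B 3
A,C 5
A,D 4
A,E 2
B,C 3
B,D 2
C,D 4
C,E 2
D,E 2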
{A,B} & {A,C} → A,B,C
{A,B} & {A,D} → A,B,D
{A,B} & {A,E} → A,B,E
{A,C} & {A,D} → A,C,D
{A,C} & {A,E} → A,C,E
{A,D} & {A,E} → A,D,E
{B,C} & {B,D} → B,C,D
{C,D} & {C,E} → C,D,E
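The join above pairs frequent item sets that share their first item. A minimal sketch of this join, combined with the prune step of Apriori, is shown below; `generate_candidates` is a hypothetical helper name, and the prune step is an addition that the hand calculation does not apply explicitly.

```python
from itertools import combinations


def generate_candidates(frequent_prev):
    """Join frequent (k-1)-item sets that share their first k-2 items,
    then drop any candidate that has an infrequent (k-1)-item subset."""
    prev = sorted(tuple(sorted(s)) for s in frequent_prev)
    prev_set = set(prev)
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:-1] != b[:-1]:
                continue  # prefixes differ, so the two sets cannot be joined
            candidate = a + (b[-1],)
            # Prune step: every (k-1)-item subset must itself be frequent.
            if all(sub in prev_set
                   for sub in combinations(candidate, len(candidate) - 1)):
                candidates.append(candidate)
    return candidates


# The nine frequent pairs found above ({B,E} was removed).
frequent_pairs = ["AB", "AC", "AD", "AE", "BC", "BD", "CD", "CE", "DE"]
candidates = generate_candidates(frequent_pairs)
print(["".join(c) for c in candidates])
```

With the prune step, {A,B,E} is discarded before any counting because its subset {B,E} is infrequent; without it, {A,B,E} is counted and then dropped for having a support count of 1. Both routes end with the same frequent 3-item sets.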
Item Sets Number of transactions
A,B,C 3
A,B,D 2
A,B,E 1
A,C,D 4
A,C,E 2
A,D,E 2
B,C,D 2
C,D,E 2
{A,B,E} occurs in only 1 transaction, so it is removed. The frequent 3-item sets are:
Item Sets Number of transactions
A,B,C 3
A,B,D 2
A,C,D 4
A,C,E 2
A,D,E 2
B,C,D 2
C,D,E 2
{A,B,C} & {A,B,D} → A,B,C,D
{A,C,D} & {A,C,E} → A,C,D,E
Item Sets Number of transactions
A,B,C,D 2
A,C,D,E 2
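Both {A,B,C,D} and {A,C,D,E} meet the minimum support count of 2, so they are frequent.
They do not share their first three items, so no 5-item candidate can be joined from them
and the algorithm stops. The frequent item sets are therefore all the item sets with a count
of at least 2 in the tables above, and {A,B,C,D} and {A,C,D,E} are the largest of them.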
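The level-by-level procedure worked through above can be sketched as a short loop. This is an illustrative sketch under the same data and threshold, not a reference implementation; the function name `apriori` and its return format (item set mapped to support count) are assumptions made here.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset: support count} for every frequent item set."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # level-1 candidates
    frequent = {}
    k = 1
    while current:
        # Count each candidate with one pass over the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join: unions of two frequent sets that have exactly k items.
        unions = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: keep a candidate only if all its (k-1)-subsets are frequent.
        current = [c for c in unions
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))]
    return frequent


transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]
frequent = apriori(transactions, min_count=2)
for itemset in sorted(frequent, key=lambda s: (len(s), sorted(s))):
    print(",".join(sorted(itemset)), frequent[itemset])
```

The join here is written as a set union rather than the prefix-based join, which is simpler to read but generates more intermediate unions; the prune step makes the surviving candidates the same.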
b) Obtain significant decision rules.
Non-empty proper subsets of {A,B,C,D}
{A} , {B} , {C} , {D} , {A,B} , {A,C} , {A,D} , {B,C} , {B,D} , {C,D} , {A,B,C} , {A,B,D} , {A,C,D} , {B,C,D}
Here C is the confidence of a rule: the support of the whole item set divided by the support of the rule's left-hand side.
{A} → {B,C,D}: C = SUP {A,B,C,D} / SUP {A} = 2/5 = 40%
{B} → {A,C,D}: C = SUP {A,B,C,D} / SUP {B} = 2/3 = 66.7%
{C} → {A,B,D}: C = SUP {A,B,C,D} / SUP {C} = 2/5 = 40%
{D} → {A,B,C}: C = SUP {A,B,C,D} / SUP {D} = 2/4 = 50%
{A,B} → {C,D}: C = SUP {A,B,C,D} / SUP {A,B} = 2/3 = 66.7%
{A,C} → {B,D}: C = SUP {A,B,C,D} / SUP {A,C} = 2/5 = 40%
{A,D} → {B,C}: C = SUP {A,B,C,D} / SUP {A,D} = 2/4 = 50%
{B,C} → {A,D}: C = SUP {A,B,C,D} / SUP {B,C} = 2/3 = 66.7%
{B,D} → {A,C}: C = SUP {A,B,C,D} / SUP {B,D} = 2/2 = 100%
{C,D} → {A,B}: C = SUP {A,B,C,D} / SUP {C,D} = 2/4 = 50%
{A,B,C} → {D}: C = SUP {A,B,C,D} / SUP {A,B,C} = 2/3 = 66.7%
{A,B,D} → {C}: C = SUP {A,B,C,D} / SUP {A,B,D} = 2/2 = 100%
{A,C,D} → {B}: C = SUP {A,B,C,D} / SUP {A,C,D} = 2/4 = 50%
{B,C,D} → {A}: C = SUP {A,B,C,D} / SUP {B,C,D} = 2/2 = 100%
Rule Confidence
{A} → {B,C,D} 40%
{B} → {A,C,D} 66.7%
{C} → {A,B,D} 40%
{D} → {A,B,C} 50%
{A,B} → {C,D} 66.7%
{A,C} → {B,D} 40%
{A,D} → {B,C} 50%
{B,C} → {A,D} 66.7%
{B,D} → {A,C} 100%
{C,D} → {A,B} 50%
{A,B,C} → {D} 66.7%
{A,B,D} → {C} 100%
{A,C,D} → {B} 50%
{B,C,D} → {A} 100%
Non-empty proper subsets of {A,C,D,E}
{A} , {C} , {D} , {E} , {A,C} , {A,D} , {A,E} , {C,D} , {C,E} , {D,E} , {A,C,D} , {A,C,E} , {A,D,E} , {C,D,E}
{A} → {C,D,E}: C = SUP {A,C,D,E} / SUP {A} = 2/5 = 40%
{C} → {A,D,E}: C = SUP {A,C,D,E} / SUP {C} = 2/5 = 40%
{D} → {A,C,E}: C = SUP {A,C,D,E} / SUP {D} = 2/4 = 50%
{E} → {A,C,D}: C = SUP {A,C,D,E} / SUP {E} = 2/2 = 100%
{A,C} → {D,E}: C = SUP {A,C,D,E} / SUP {A,C} = 2/5 = 40%
{A,D} → {C,E}: C = SUP {A,C,D,E} / SUP {A,D} = 2/4 = 50%
{A,E} → {C,D}: C = SUP {A,C,D,E} / SUP {A,E} = 2/2 = 100%
{C,D} → {A,E}: C = SUP {A,C,D,E} / SUP {C,D} = 2/4 = 50%
{C,E} → {A,D}: C = SUP {A,C,D,E} / SUP {C,E} = 2/2 = 100%
{D,E} → {A,C}: C = SUP {A,C,D,E} / SUP {D,E} = 2/2 = 100%
{A,C,D} → {E}: C = SUP {A,C,D,E} / SUP {A,C,D} = 2/4 = 50%
{A,C,E} → {D}: C = SUP {A,C,D,E} / SUP {A,C,E} = 2/2 = 100%
{A,D,E} → {C}: C = SUP {A,C,D,E} / SUP {A,D,E} = 2/2 = 100%
{C,D,E} → {A}: C = SUP {A,C,D,E} / SUP {C,D,E} = 2/2 = 100%
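Rule Confidence
{A} → {C,D,E} 40%
{C} → {A,D,E} 40%
{D} → {A,C,E} 50%
{E} → {A,C,D} 100%
{A,C} → {D,E} 40%
{A,D} → {C,E} 50%
{A,E} → {C,D} 100%
{C,D} → {A,E} 50%
{C,E} → {A,D} 100%
{D,E} → {A,C} 100%
{A,C,D} → {E} 50%
{A,C,E} → {D} 100%
{A,D,E} → {C} 100%
{C,D,E} → {A} 100%
Every rule derived from {A,B,C,D} and {A,C,D,E} has a confidence of at least min_conf = 40%, so all of them qualify as significant rules.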
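The confidence calculations above follow one pattern: split the frequent item set into a left-hand side and the remaining items, then divide the two support counts. A minimal sketch of that pattern follows; the helper names `support_count` and `rules_from` are hypothetical, and the threshold is treated as inclusive (confidence >= min_conf), which is an assumption consistent with the 40% rules being listed.

```python
from itertools import combinations

transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= t)


def rules_from(itemset, transactions, min_conf):
    """List (antecedent, consequent, confidence) for one frequent item set."""
    items = sorted(itemset)
    whole = support_count(items, transactions)
    rules = []
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            confidence = whole / support_count(antecedent, transactions)
            if confidence >= min_conf:  # inclusive threshold (assumption)
                rules.append((antecedent, consequent, confidence))
    return rules


rules_abcd = rules_from("ABCD", transactions, min_conf=0.4)
rules_acde = rules_from("ACDE", transactions, min_conf=0.4)
for lhs, rhs, conf in rules_abcd + rules_acde:
    print("{%s} -> {%s}  %.1f%%" % (",".join(lhs), ",".join(rhs), conf * 100))
```

Raising `min_conf` in this sketch, for example to 0.6, would filter out the 40% and 50% rules and leave only the stronger ones.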
c) Derive the FP-Tree for the above transaction table.
Transaction ID Items Bought
T1 A,B,C
T2 A,B,C,D,E
T3 A,C,D
T4 A,C,D,E
T5 A,B,C,D
Support of each item
A = 5/5 = 100%
B = 3/5 = 60%
C = 5/5 = 100%
D = 4/5 = 80%
E = 2/5 = 40%
Items in descending order of support (A and C tie and are kept in alphabetical order)
A,C,D,B,E
Re-arrange the items of each transaction in this order
Transaction ID Items Bought
T1 A,C,B
T2 A,C,D,B,E
T3 A,C,D
T4 A,C,D,E
T5 A,C,D,B
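The re-ordering step above can be sketched in a few lines. The sketch assumes ties in support are broken alphabetically, which matches the order A, C, D, B, E used here; the variable names are illustrative.

```python
from collections import Counter

transactions = {
    "T1": ["A", "B", "C"],
    "T2": ["A", "B", "C", "D", "E"],
    "T3": ["A", "C", "D"],
    "T4": ["A", "C", "D", "E"],
    "T5": ["A", "B", "C", "D"],
}

# First database scan: support count of every item.
counts = Counter(item for items in transactions.values() for item in items)

# Descending support; alphabetical order breaks ties (assumption).
order = sorted(counts, key=lambda item: (-counts[item], item))

# Rewrite each transaction so its items follow the global order.
reordered = {tid: [item for item in order if item in items]
             for tid, items in transactions.items()}

print(order)
for tid, items in reordered.items():
    print(tid, ",".join(items))
```

Infrequent items would normally be dropped at this point as well; here every item meets the minimum support count of 2, so nothing is removed.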
FP-Tree
(Indentation shows parent-child links; X:n is a node for item X with count n.)
After TID T1
Null
  A:1
    C:1
      B:1
After TID T2
Null
  A:2
    C:2
      B:1
      D:1
        B:1
          E:1
After TID T3
Null
  A:3
    C:3
      B:1
      D:2
        B:1
          E:1
After TID T4
Null
  A:4
    C:4
      B:1
      D:3
        B:1
          E:1
        E:1
After TID T5
Null
  A:5
    C:5
      B:1
      D:4
        B:2
          E:1
        E:1
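The tree construction can be reproduced by inserting each re-ordered transaction along a shared prefix path. The following is a minimal sketch under that assumption; the `Node` class, `build_fp_tree` and `render` are hypothetical names, and the header table with node links that a full FP-Growth implementation needs is left out.

```python
class Node:
    """One FP-tree node: an item, its count, and its children by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    """Insert each support-ordered transaction, sharing common prefixes."""
    root = Node(None)  # the Null root
    for items in ordered_transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1  # one more transaction passes through this node
            node = child
    return root


def render(node, depth=0):
    """Return the tree as indented 'item:count' lines (root omitted)."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + "%s:%d" % (child.item, child.count))
        lines.extend(render(child, depth + 1))
    return lines


# Transactions T1..T5 after re-ordering by descending support.
ordered = [
    ["A", "C", "B"],
    ["A", "C", "D", "B", "E"],
    ["A", "C", "D"],
    ["A", "C", "D", "E"],
    ["A", "C", "D", "B"],
]
root = build_fp_tree(ordered)
print("Null")
for line in render(root, depth=1):
    print(line)
```

Calling `build_fp_tree(ordered[:n])` for n from 1 to 5 gives the intermediate trees after each transaction ID.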