MAJOR FUNCTION: Association
Dec 14, 2015
What Is Association Mining?
• Association rule mining:
  – Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
• Applications:
  – Basket data analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.
• Examples:
  – Rule form: “Body → Head [support, confidence]”.
  – buys(x, “diapers”) → buys(x, “beers”) [0.5%, 60%]
Rule Measures: Support and Confidence
• Find all the rules X & Y ⇒ Z with minimum confidence and support
  – support, s: probability that a transaction contains {X, Y, Z}
  – confidence, c: conditional probability that a transaction containing {X, Y} also contains Z

Transaction ID   Items Bought
2000             A, B, C
1000             A, C
4000             A, D
5000             B, E, F

With minimum support 50% and minimum confidence 50%, we have
  – A ⇒ C (50%, 66.6%)
  – C ⇒ A (50%, 100%)
[Figure: Venn diagram of customers who buy diapers, customers who buy beer, and customers who buy both]
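The two rules quoted for the four-transaction table can be checked by hand. The block below is a worked sketch of that arithmetic; the notation σ(·) for "number of transactions containing the itemset" is borrowed from the later definition slides and is an assumption at this point in the deck.

```latex
\begin{align*}
s(A \Rightarrow C) &= \frac{\sigma(\{A, C\})}{|T|} = \frac{2}{4} = 0.5 \\
c(A \Rightarrow C) &= \frac{\sigma(\{A, C\})}{\sigma(\{A\})} = \frac{2}{3} \approx 0.667 \\
c(C \Rightarrow A) &= \frac{\sigma(\{A, C\})}{\sigma(\{C\})} = \frac{2}{2} = 1.0
\end{align*}
```

Both rules share the same support because they are built from the same itemset {A, C}; only the denominator of the confidence changes.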
Association Rule Mining
• Given a set of transactions, find rules that will predict the occurrence of an item based on the occurrences of other items in the transaction
Market-Basket transactions
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
Example of Association Rules
{Diaper} → {Beer}
{Milk, Bread} → {Eggs, Coke}
{Beer, Bread} → {Milk}
Definition: Frequent Itemset
• Itemset
  – A collection of one or more items
    • Example: {Milk, Bread, Diaper}
  – k-itemset
    • An itemset that contains k items
• Support count (σ)
  – Frequency of occurrence of an itemset
  – E.g. σ({Milk, Bread, Diaper}) = 2
• Support
  – Fraction of transactions that contain an itemset
  – E.g. s({Milk, Bread, Diaper}) = 2/5
• Frequent Itemset
  – An itemset whose support is greater than or equal to a minsup threshold
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
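The definitions above translate almost directly into code. The following is a minimal sketch, assuming transactions are held as Python sets; the function names `support_count`, `support`, and `is_frequent` are illustrative choices, not part of the slides.

```python
# Market-basket transactions from the table above (TID 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """sigma(itemset): number of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def support(itemset, transactions):
    """s(itemset): fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def is_frequent(itemset, transactions, minsup):
    """An itemset is frequent when its support is at least minsup."""
    return support(itemset, transactions) >= minsup


itemset = {"Milk", "Bread", "Diaper"}
print(support_count(itemset, transactions))  # 2
print(support(itemset, transactions))        # 0.4
```

Whether {Milk, Bread, Diaper} counts as frequent depends entirely on the chosen threshold: with a hypothetical minsup of 0.4 it is frequent, with 0.5 it is not.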
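Definition: Association Rule
Example: {Milk, Diaper} ⇒ Beer

$s = \frac{\sigma(\{\text{Milk}, \text{Diaper}, \text{Beer}\})}{|T|} = \frac{2}{5} = 0.4$

$c = \frac{\sigma(\{\text{Milk}, \text{Diaper}, \text{Beer}\})}{\sigma(\{\text{Milk}, \text{Diaper}\})} = \frac{2}{3} \approx 0.67$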
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
Example of Rules:
{Milk, Beer} → {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} → {Milk} (s=0.4, c=0.67)
{Beer} → {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} → {Milk, Beer} (s=0.4, c=0.5)
{Milk} → {Diaper, Beer} (s=0.4, c=0.5)
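All of the example rules are binary partitions of the single itemset {Milk, Diaper, Beer}, which is why they share one support value and differ only in confidence. The sketch below enumerates every such partition; the helper name `rules_from_itemset` is hypothetical and introduced only for illustration.

```python
from itertools import combinations

# Market-basket transactions from the table above (TID 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def rules_from_itemset(itemset):
    """Yield (antecedent, consequent, support, confidence) for each binary partition."""
    full = frozenset(itemset)
    full_count = support_count(full)
    for size in range(1, len(full)):
        for antecedent in combinations(sorted(full), size):
            a = frozenset(antecedent)
            yield a, full - a, full_count / len(transactions), full_count / support_count(a)


rules = list(rules_from_itemset({"Milk", "Diaper", "Beer"}))
for a, c, s, conf in rules:
    print(f"{sorted(a)} -> {sorted(c)}  (s={s:.2f}, c={conf:.2f})")
```

A 3-itemset yields six rules in total: the five listed above plus {Milk, Diaper} → {Beer} from the definition example.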
The Apriori Algorithm — Example
Database D
TID   Items
100   1 3 4
200   2 3 5
300   1 2 3 5
400   2 5

Scan D → C1
itemset   sup.
{1}       2
{2}       3
{3}       3
{4}       1
{5}       3

L1
itemset   sup.
{1}       2
{2}       3
{3}       3
{5}       3

C2 (generated from L1)
itemset
{1 2}
{1 3}
{1 5}
{2 3}
{2 5}
{3 5}

Scan D → C2 (with counts)
itemset   sup
{1 2}     1
{1 3}     2
{1 5}     1
{2 3}     2
{2 5}     3
{3 5}     2

L2
itemset   sup
{1 3}     2
{2 3}     2
{2 5}     3
{3 5}     2

C3 (generated from L2)
itemset
{1 3 5}
{2 3 5}

Scan D → L3
itemset   sup
{2 3 5}   2
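The level-by-level procedure in this example can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the minimum support count of 2 is an assumption read off the tables (every itemset with count 1 is dropped), and the function names are hypothetical. The sketch also applies subset pruning before counting, so {1 3 5} is discarded without a scan because {1 5} is not in L2; the resulting L3 is the same.

```python
from itertools import combinations

# Database D from the example above (TID -> items).
D = {
    100: {1, 3, 4},
    200: {2, 3, 5},
    300: {1, 2, 3, 5},
    400: {2, 5},
}
MIN_COUNT = 2  # assumption: minimum support count implied by the tables


def count_support(candidates, db):
    """One scan of the database: count the transactions containing each candidate."""
    return {c: sum(1 for t in db.values() if c <= t) for c in candidates}


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any candidate
    that has an infrequent (k-1)-subset."""
    joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
    return {
        c for c in joined
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    }


def apriori(db, min_count):
    """Return a list [L1, L2, ...] of dicts mapping frequent itemsets to counts."""
    candidates = {frozenset([item]) for t in db.values() for item in t}
    levels = []
    k = 1
    while candidates:
        counts = count_support(candidates, db)
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        if not frequent:
            break
        levels.append(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return levels


levels = apriori(D, MIN_COUNT)
for k, level in enumerate(levels, start=1):
    print(f"L{k}:", {tuple(sorted(c)): n for c, n in level.items()})
```

Each pass through the loop corresponds to one "Scan D" step in the example: candidates are counted, filtered by the threshold, and then used to build the next level's candidates.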
Association Algorithm: MBA (Market Basket Analysis)
Steps of the MBA algorithm:
1. Set the thresholds that define a frequent itemset: the desired minimum support and minimum confidence.
2. Find all frequent itemsets, i.e. itemsets whose frequency is at least the threshold set in step 1.
3. From all frequent itemsets, generate the association rules that satisfy the minimum support and confidence.
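$\text{support}(A \Rightarrow B) = P(A \cup B)$

$\text{confidence}(A \Rightarrow B) = P(B \mid A)$

$\text{support}(A \Rightarrow B) = \frac{\text{number of tuples containing both } A \text{ and } B}{\text{total number of tuples}}$

$\text{confidence}(A \Rightarrow B) = \frac{\text{number of tuples containing both } A \text{ and } B}{\text{number of tuples containing } A}$

$\text{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\text{support\_count}(A \cup B)}{\text{support\_count}(A)}$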
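The three MBA steps and the formulas above can be put together in one short script. The sketch below applies them to the four-transaction table from the Rule Measures slide, with the 50% thresholds stated there. It finds frequent itemsets by brute-force enumeration rather than Apriori, which is a simplification that is only reasonable for a handful of items; all variable and function names are illustrative.

```python
from itertools import combinations

# Four-transaction table from the Rule Measures slide (Transaction ID -> items).
transactions = {
    2000: {"A", "B", "C"},
    1000: {"A", "C"},
    4000: {"A", "D"},
    5000: {"B", "E", "F"},
}

# Step 1: set the thresholds (values taken from the slide).
MIN_SUPPORT = 0.5
MIN_CONFIDENCE = 0.5


def support(itemset):
    """Fraction of tuples that contain every item of the itemset."""
    hits = sum(1 for t in transactions.values() if itemset <= t)
    return hits / len(transactions)


# Step 2: find all frequent itemsets (brute force over every subset of items).
items = sorted({item for t in transactions.values() for item in t})
frequent = [
    frozenset(combo)
    for k in range(1, len(items) + 1)
    for combo in combinations(items, k)
    if support(frozenset(combo)) >= MIN_SUPPORT
]

# Step 3: generate rules A => B from each frequent itemset and keep the confident ones.
rules = []
for itemset in frequent:
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            a = frozenset(antecedent)
            confidence = support(itemset) / support(a)  # P(B | A)
            if confidence >= MIN_CONFIDENCE:
                rules.append((a, itemset - a, support(itemset), confidence))

for a, b, s, c in rules:
    print(f"{sorted(a)} => {sorted(b)}  (support={s:.1%}, confidence={c:.1%})")
```

Dividing the two support fractions gives the same value as dividing the two support counts, since the total number of tuples cancels out.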