Association Rule Mining Example

Jun 04, 2018

sanjeewascribd
  • 8/13/2019 Association Rule Mining Example

    1)

    a) Define the Apriori principle and briefly discuss why it is useful in association rule mining.

    The Apriori principle states that if an item set is frequent, then all of its subsets must also be frequent. Equivalently, if an item set is infrequent, then all of its supersets must also be infrequent.

    The principle reduces the number of candidate item sets in association rule mining: any candidate that has an infrequent subset can be discarded without counting its support. Far fewer candidates remain for support checking, which substantially reduces the computation, I/O cost and memory requirement. The principle rests on the fact that the support of an item set never exceeds the support of any of its subsets, which is known as the anti-monotone property of support. Because it avoids the wasted effort of counting item sets that are already known to be infrequent, the Apriori principle is very useful in association rule mining.
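The anti-monotone property can be checked directly on data. The sketch below is only an illustration: it assumes the five market-basket transactions from question 2 of this document, and the helper names `support_count` and `anti_monotone_holds` are made up for the example.

```python
from itertools import combinations

# Transactions from question 2 of this document.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def anti_monotone_holds(itemset, transactions):
    """True if no non-empty proper subset has lower support than `itemset`."""
    items = sorted(itemset)
    whole = support_count(items, transactions)
    return all(
        support_count(subset, transactions) >= whole
        for size in range(1, len(items))
        for subset in combinations(items, size)
    )

# {B,E} is infrequent, so its superset {A,B,E} cannot be frequent either.
print(support_count({"B", "E"}, transactions))
print(support_count({"A", "B", "E"}, transactions))
print(anti_monotone_holds({"A", "B", "C", "D"}, transactions))
```

Once {B,E} is known to fall below the threshold, every superset such as {A,B,E} can be skipped without scanning the transactions again.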

    b) Compare and contrast FP-Growth algorithm with Apriori algorithm.

    The main difference between the two algorithms is that the Apriori algorithm generates candidate item sets and tests whether they are frequent, whereas the FP-Growth algorithm discovers frequent item sets without generating candidate item sets.

    Further differences between the Apriori algorithm and the FP-Growth algorithm are listed below.

    Apriori algorithm                             FP-Growth algorithm

    Uses the Apriori property with join and       Constructs conditional pattern bases and
    prune steps.                                  conditional frequent-pattern trees from the
                                                  database items that satisfy minimum support.

    Generates a large number of candidates, so    Uses a compact structure with no candidate
    it requires a large amount of memory.         generation, so it requires less memory.

    Scans the database multiple times to          Scans the database only twice.
    generate candidate sets.

    Execution time is higher than FP-Growth       Execution time is lower than Apriori.
    because time is spent producing candidates
    in every pass.

    2) Consider the market basket transactions given in the following table. Let

    min_sup = 40% and min_conf = 40%.

    Transaction ID Items Bought

    T1 A,B,C

    T2 A,B,C,D,E

    T3 A,C,D

    T4 A,C,D,E

    T5 A,B,C,D

    a) Find all the frequent item sets using Apriori algorithm.

    min_sup = 40% and min_conf = 40%

    Minimum support count = 40% of 5 transactions = 2

    Transaction ID Items Bought

    T1 A,B,C

    T2 A,B,C,D,E

    T3 A,C,D

    T4 A,C,D,E

    T5 A,B,C,D

    Item Number of Transactions

    A 5

    B 3

    C 5

    D 4

    E 2
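The single-item counts above can be reproduced with a short script. This is a minimal sketch rather than part of the original answer; the variable names (`min_count`, `item_counts`, `frequent_items`) are illustrative, and the threshold is converted to a transaction count by rounding up.

```python
from collections import Counter
from fractions import Fraction
from math import ceil

# Transaction table from question 2.
transactions = {
    "T1": ["A", "B", "C"],
    "T2": ["A", "B", "C", "D", "E"],
    "T3": ["A", "C", "D"],
    "T4": ["A", "C", "D", "E"],
    "T5": ["A", "B", "C", "D"],
}

min_sup = Fraction(40, 100)                      # 40%, kept exact to avoid float rounding
min_count = ceil(min_sup * len(transactions))    # 40% of 5 transactions

# Count in how many transactions each item occurs.
item_counts = Counter(item for items in transactions.values() for item in items)

# Keep only the items that reach the minimum support count.
frequent_items = {
    item: count
    for item, count in sorted(item_counts.items())
    if count >= min_count
}

for item, count in frequent_items.items():
    print(item, count)
```

All five items reach the minimum count of 2, so none is removed at this level.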

    Item Pairs Number of Transactions

    A,B 3

    A,C 5

    A,D 4

    A,E 2

    B,C 3

    B,D 2

    B,E 1

    C,D 4

    C,E 2

    D,E 2

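    The pair B,E appears in only 1 transaction, which is below the minimum support count of 2, so it is removed.

    Item Pairs Number of Transactions

    A,B 3
    A,C 5
    A,D 4
    A,E 2
    B,C 3
    B,D 2
    C,D 4
    C,E 2
    D,E 2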

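    Joining the frequent pairs that share their first item gives the candidate 3-item sets:

    {A,B} & {A,C} → A,B,C
    {A,B} & {A,D} → A,B,D
    {A,B} & {A,E} → A,B,E
    {A,C} & {A,D} → A,C,D
    {A,C} & {A,E} → A,C,E
    {A,D} & {A,E} → A,D,E
    {B,C} & {B,D} → B,C,D
    {C,D} & {C,E} → C,D,E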
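The join step can be sketched in code. The function below is an illustrative assumption, not the author's procedure: `generate_candidates` is a hypothetical name, item sets are stored as sorted tuples, and the sketch also applies the standard Apriori prune step, which the worked answer skips in favour of counting every joined candidate.

```python
from itertools import combinations

def generate_candidates(frequent, k):
    """Join frequent (k-1)-item sets that share their first k-2 items.

    Returns (joined, kept): every joined candidate, and the ones whose
    (k-1)-item subsets are all frequent (the prune step).
    """
    frequent = sorted(frequent)          # each item set is a sorted tuple
    frequent_lookup = set(frequent)
    joined, kept = [], []
    for a, b in combinations(frequent, 2):
        if a[:-1] != b[:-1]:
            continue                     # no common (k-2)-item prefix
        candidate = a + (b[-1],)
        joined.append(candidate)
        if all(sub in frequent_lookup for sub in combinations(candidate, k - 1)):
            kept.append(candidate)
    return joined, kept

# Frequent pairs from the table above ({B,E} was already removed).
frequent_pairs = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "C"), ("B", "D"), ("C", "D"), ("C", "E"), ("D", "E"),
]

joined, kept = generate_candidates(frequent_pairs, 3)
for candidate in joined:
    status = "kept" if candidate in kept else "pruned"
    print(",".join(candidate), status)
```

With pruning, A,B,E is discarded before any counting because its subset {B,E} is not frequent. The worked answer reaches the same result by counting it and finding a support of 1.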

    Item Sets Number of Transactions

    A,B,C 3

    A,B,D 2

    A,B,E 1

    A,C,D 4

    A,C,E 2

    A,D,E 2

    B,C,D 2

    C,D,E 2

    A,B,E appears in only 1 transaction, so it is removed.

    Item Sets Number of Transactions

    A,B,C 3

    A,B,D 2

    A,C,D 4

    A,C,E 2

    A,D,E 2

    B,C,D 2

    C,D,E 2

    {A,B,C} & {A,B,D} → A,B,C,D

    {A,C,D} & {A,C,E} → A,C,D,E

    Item Sets Number of Transactions

    A,B,C,D 2

    A,C,D,E 2

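    Both 4-item sets reach the minimum support count of 2, and no 5-item candidate can be joined from them, so the search stops here. {A,B,C,D} and {A,C,D,E} are the largest frequent item sets, and by the Apriori principle every subset of them is frequent as well.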
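The whole level-by-level search can be written as one loop. The following is a minimal sketch under stated assumptions: the function name `apriori` and its return shape are choices made for this example, item sets are sorted tuples, and the prune step is applied before counting.

```python
from itertools import combinations

# Transaction table from question 2.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]

def apriori(transactions, min_count):
    """Return {item set (sorted tuple): support count} for all frequent item sets."""
    def count(itemset):
        return sum(1 for t in transactions if set(itemset) <= t)

    items = sorted({item for t in transactions for item in t})
    level = {(i,): count((i,)) for i in items}
    level = {s: n for s, n in level.items() if n >= min_count}
    frequent = dict(level)
    k = 2
    while level:
        previous = sorted(level)
        # Join sets with a common prefix, keep candidates whose subsets are all frequent.
        candidates = [
            a + (b[-1],)
            for a, b in combinations(previous, 2)
            if a[:-1] == b[:-1]
            and all(sub in level for sub in combinations(a + (b[-1],), k - 1))
        ]
        level = {c: count(c) for c in candidates}
        level = {c: n for c, n in level.items() if n >= min_count}
        frequent.update(level)
        k += 1
    return frequent

frequent = apriori(transactions, min_count=2)
for itemset in sorted(frequent, key=lambda s: (len(s), s)):
    print(",".join(itemset), frequent[itemset])
```

The loop ends when a level produces no frequent item set, which here happens after the 4-item level.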

    b) Obtain significant decision rules.

    Subsets of {A,B,C,D}

    {A} , {B} , {C} , {D} , {A,B} , {A,C} , {A,D} , {B,C} , {B,D} , {C,D} , {A,B,C} , {A,B,D} , {A,C,D} , {B,C,D}
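The list of subsets above can be generated rather than written by hand. This is a small illustrative sketch; `proper_subsets` is a hypothetical helper name, and it returns only the non-empty proper subsets, since each one serves as the left-hand side of a rule.

```python
from itertools import combinations

def proper_subsets(itemset):
    """All non-empty proper subsets of `itemset`, smallest first."""
    items = sorted(itemset)
    return [
        subset
        for size in range(1, len(items))
        for subset in combinations(items, size)
    ]

antecedents = proper_subsets({"A", "B", "C", "D"})
for subset in antecedents:
    print("{" + ",".join(subset) + "}")
```

A 4-item set has 2^4 - 2 = 14 such subsets, which matches the list above.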

    For each rule X → Y, the confidence is C = SUP(X ∪ Y) / SUP(X).

    {A} → {B,C,D}
    C = SUP{A,B,C,D} / SUP{A} = 2/5 = 40%

    {B} → {A,C,D}
    C = SUP{A,B,C,D} / SUP{B} = 2/3 = 66.6%

    {C} → {A,B,D}
    C = SUP{A,B,C,D} / SUP{C} = 2/5 = 40%

    {D} → {A,B,C}
    C = SUP{A,B,C,D} / SUP{D} = 2/4 = 50%

    {A,B} → {C,D}
    C = SUP{A,B,C,D} / SUP{A,B} = 2/3 = 66.6%

    {A,C} → {B,D}
    C = SUP{A,B,C,D} / SUP{A,C} = 2/5 = 40%

    {A,D} → {B,C}
    C = SUP{A,B,C,D} / SUP{A,D} = 2/4 = 50%

    {B,C} → {A,D}
    C = SUP{A,B,C,D} / SUP{B,C} = 2/3 = 66.6%

    {B,D} → {A,C}
    C = SUP{A,B,C,D} / SUP{B,D} = 2/2 = 100%

    {C,D} → {A,B}
    C = SUP{A,B,C,D} / SUP{C,D} = 2/4 = 50%

    {A,B,C} → {D}
    C = SUP{A,B,C,D} / SUP{A,B,C} = 2/3 = 66.6%

    {A,B,D} → {C}
    C = SUP{A,B,C,D} / SUP{A,B,D} = 2/2 = 100%

    {A,C,D} → {B}
    C = SUP{A,B,C,D} / SUP{A,C,D} = 2/4 = 50%

    {B,C,D} → {A}
    C = SUP{A,B,C,D} / SUP{B,C,D} = 2/2 = 100%

    Rule                Confidence
    {A} → {B,C,D}       40%
    {B} → {A,C,D}       66.6%
    {C} → {A,B,D}       40%
    {D} → {A,B,C}       50%
    {A,B} → {C,D}       66.6%
    {A,C} → {B,D}       40%
    {A,D} → {B,C}       50%
    {B,C} → {A,D}       66.6%
    {B,D} → {A,C}       100%
    {C,D} → {A,B}       50%
    {A,B,C} → {D}       66.6%
    {A,B,D} → {C}       100%
    {A,C,D} → {B}       50%
    {B,C,D} → {A}       100%

    Subsets of {A,C,D,E}

    {A} , {C} , {D} , {E} , {A,C} , {A,D} , {A,E} , {C,D} , {C,E} , {D,E} , {A,C,D} , {A,C,E} , {A,D,E} , {C,D,E}

    {A} → {C,D,E}
    C = SUP{A,C,D,E} / SUP{A} = 2/5 = 40%

    {C} → {A,D,E}
    C = SUP{A,C,D,E} / SUP{C} = 2/5 = 40%

    {D} → {A,C,E}
    C = SUP{A,C,D,E} / SUP{D} = 2/4 = 50%

    {E} → {A,C,D}
    C = SUP{A,C,D,E} / SUP{E} = 2/2 = 100%

    {A,C} → {D,E}
    C = SUP{A,C,D,E} / SUP{A,C} = 2/5 = 40%

    {A,D} → {C,E}
    C = SUP{A,C,D,E} / SUP{A,D} = 2/4 = 50%

    {A,E} → {C,D}
    C = SUP{A,C,D,E} / SUP{A,E} = 2/2 = 100%

    {C,D} → {A,E}
    C = SUP{A,C,D,E} / SUP{C,D} = 2/4 = 50%

    {C,E} → {A,D}
    C = SUP{A,C,D,E} / SUP{C,E} = 2/2 = 100%

    {D,E} → {A,C}
    C = SUP{A,C,D,E} / SUP{D,E} = 2/2 = 100%

    {A,C,D} → {E}
    C = SUP{A,C,D,E} / SUP{A,C,D} = 2/4 = 50%

    {A,C,E} → {D}
    C = SUP{A,C,D,E} / SUP{A,C,E} = 2/2 = 100%

    {A,D,E} → {C}
    C = SUP{A,C,D,E} / SUP{A,D,E} = 2/2 = 100%

    {C,D,E} → {A}
    C = SUP{A,C,D,E} / SUP{C,D,E} = 2/2 = 100%

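    Rule                Confidence
    {A} → {C,D,E}       40%
    {C} → {A,D,E}       40%
    {D} → {A,C,E}       50%
    {E} → {A,C,D}       100%
    {A,C} → {D,E}       40%
    {A,D} → {C,E}       50%
    {A,E} → {C,D}       100%
    {C,D} → {A,E}       50%
    {C,E} → {A,D}       100%
    {D,E} → {A,C}       100%
    {A,C,D} → {E}       50%
    {A,C,E} → {D}       100%
    {A,D,E} → {C}       100%
    {C,D,E} → {A}       100%

    Every rule derived from {A,B,C,D} and {A,C,D,E} has a confidence of at least 40%, so all of them satisfy min_conf.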
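The confidence calculations can be cross-checked with a short script. This is a sketch under assumptions, not the author's method: `support_count` and `rule_confidences` are hypothetical helper names, and confidences are held as exact fractions so that 2/3 is not rounded before the comparison with min_conf.

```python
from fractions import Fraction
from itertools import combinations

# Transaction table from question 2.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]

def support_count(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def rule_confidences(itemset):
    """Map (antecedent, consequent) -> confidence for every split of `itemset`."""
    items = tuple(sorted(itemset))
    whole = support_count(items)
    result = {}
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            result[(antecedent, consequent)] = Fraction(whole, support_count(antecedent))
    return result

min_conf = Fraction(40, 100)
for itemset in ("ABCD", "ACDE"):
    for (lhs, rhs), conf in rule_confidences(itemset).items():
        verdict = "keep" if conf >= min_conf else "drop"
        print(f"{{{','.join(lhs)}}} -> {{{','.join(rhs)}}}  {float(conf):.1%}  {verdict}")
```

Raising `min_conf` in the script shows which rules would be dropped under a stricter threshold.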

    c) Derive the FP-Tree for the above transaction table.

    Transaction ID Items Bought

    T1 A,B,C

    T2 A,B,C,D,E

    T3 A,C,D

    T4 A,C,D,E

    T5 A,B,C,D

    Support for each item

    A = 5/5 = 100%

    B = 3/5 = 60%

    C = 5/5 = 100%

    D = 4/5 = 80%

    E = 2/5 = 40%

    Items in descending order of support:

    A, C, D, B, E

    Re-arrange the items of each transaction in this order:

    Transaction ID Items Bought

    T1 A,C,B
    T2 A,C,D,B,E
    T3 A,C,D
    T4 A,C,D,E
    T5 A,C,D,B
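The re-ordering step can be sketched as follows. The tie between A and C (both appear in 5 transactions) is broken alphabetically here; that is an assumption chosen to match the order A, C, D, B, E used above, and the variable names are illustrative. Items below the minimum support would normally be dropped first, but every item qualifies in this table.

```python
from collections import Counter

# Transaction table from question 2.
transactions = {
    "T1": ["A", "B", "C"],
    "T2": ["A", "B", "C", "D", "E"],
    "T3": ["A", "C", "D"],
    "T4": ["A", "C", "D", "E"],
    "T5": ["A", "B", "C", "D"],
}

counts = Counter(item for items in transactions.values() for item in items)

# Descending support count; equal counts fall back to alphabetical order.
order = sorted(counts, key=lambda item: (-counts[item], item))

# Rewrite each transaction with its items in the global order.
reordered = {
    tid: sorted(items, key=order.index)
    for tid, items in transactions.items()
}

print(order)
for tid, items in reordered.items():
    print(tid, ",".join(items))
```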

    FP-Tree

    Each node is labelled with its item followed by its count. Indentation shows parent-child links.

    After TID T1

    Null
      A1
        C1
          B1

    After TID T2

    Null
      A2
        C2
          B1
          D1
            B1
              E1

    After TID T3

    Null
      A3
        C3
          B1
          D2
            B1
              E1

    After TID T4

    Null
      A4
        C4
          B1
          D3
            B1
              E1
            E1

    After TID T5

    Null
      A5
        C5
          B1
          D4
            B2
              E1
            E1
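The insertion procedure behind these snapshots can be sketched in a few lines. This is a simplified illustration, not a full FP-Growth implementation: the `Node` class, `build_fp_tree` and `render` are hypothetical names, and the header table and node links used for mining are left out.

```python
class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(ordered_transactions):
    """Insert each support-ordered transaction along a shared prefix path."""
    root = Node(None)
    for items in ordered_transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
            child.count += 1      # one more transaction passes through this node
            node = child
    return root

def render(node, depth=0):
    """Return the tree as indented lines such as 'A5' (item followed by count)."""
    label = "Null" if node.item is None else f"{node.item}{node.count}"
    lines = ["  " * depth + label]
    for child in node.children.values():
        lines.extend(render(child, depth + 1))
    return lines

# Re-arranged transactions T1..T5 from the table above.
ordered = [list("ACB"), list("ACDBE"), list("ACD"), list("ACDE"), list("ACDB")]
root = build_fp_tree(ordered)
print("\n".join(render(root)))
```

Passing only the first few transactions, for example `ordered[:3]`, gives the intermediate tree after that TID.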