FP Growth Algorithm For Mining Frequent Pattern. Presented By: Shihab Rahman, Dolon Chanpa, Department Of Computer Science And Engineering, University of Dhaka
Data mining fp growth

Dec 04, 2014

Shihab Rahman

A simple graphical presentation of the implementation of FP Growth Algorithm for mining frequent pattern in a database
Transcript
Page 1: Data mining fp growth

Presented By: Shihab Rahman, Dolon Chanpa

Department Of Computer Science And Engineering,

University of Dhaka

FP Growth Algorithm For Mining Frequent Pattern

Page 2: Data mining fp growth

FP Growth stands for frequent pattern growth

It is a scalable technique for mining frequent patterns in a database

What is FP Growth?

Page 3: Data mining fp growth

FP Growth improves on Apriori to a large extent

Frequent itemset mining is possible without candidate generation

Only two scans of the database are needed

BUT HOW?

FP Growth

Page 4: Data mining fp growth

Simply a two-step procedure

– Step 1: Build a compact data structure called the FP-tree
  • Built using 2 passes over the data set
– Step 2: Extract frequent itemsets directly from the FP-tree

FP Growth

Page 5: Data mining fp growth

Now let's consider the following transaction table

FP Growth

TID   List of item IDs
T100  I1, I2, I5
T200  I2, I4
T300  I2, I3
T400  I1, I2, I4
T500  I1, I3
T600  I2, I3
T700  I1, I3
T800  I1, I2, I3, I5
T900  I1, I2, I3

Page 6: Data mining fp growth

Now we will build an FP-tree of that database

The items of each transaction are considered in descending order of their support count.

FP Growth
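The ordering rule above can be sketched in a few lines of Python. This is only an illustration: the names `TRANSACTIONS`, `MIN_SUPPORT`, `support_counts` and `order_transaction` are hypothetical. The minimum support count of 2 is an assumption, because the slides do not state it. T100 is taken as I1, I2, I5, the transaction that the tree-building slides insert first.

```python
from collections import Counter

# Transactions as inserted on the tree-building slides (T100 taken as I1, I2, I5).
TRANSACTIONS = [
    ["I1", "I2", "I5"],
    ["I2", "I4"],
    ["I2", "I3"],
    ["I1", "I2", "I4"],
    ["I1", "I3"],
    ["I2", "I3"],
    ["I1", "I3"],
    ["I1", "I2", "I3", "I5"],
    ["I1", "I2", "I3"],
]
MIN_SUPPORT = 2  # assumed threshold; not stated on the slides


def support_counts(transactions):
    """First database scan: count the transactions that contain each item."""
    return Counter(item for t in transactions for item in set(t))


def order_transaction(transaction, counts, min_support):
    """Drop infrequent items, then sort by descending support count.

    Ties are broken by item name (an assumption), so I1 comes before I3.
    """
    frequent = [i for i in set(transaction) if counts[i] >= min_support]
    return sorted(frequent, key=lambda i: (-counts[i], i))


counts = support_counts(TRANSACTIONS)
ordered = [order_transaction(t, counts, MIN_SUPPORT) for t in TRANSACTIONS]

for original, reordered in zip(TRANSACTIONS, ordered):
    print(original, "->", reordered)
```

With this ordering, T100 becomes I2, I1, I5 and T800 becomes I2, I1, I3, I5, which is the order used when the transactions are inserted into the tree.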

Page 7: Data mining fp growth

For transaction: I2, I1, I5

null
  I2:1
    I1:1
      I5:1

Page 8: Data mining fp growth

For transaction: I2, I4

null
  I2:2
    I1:1
      I5:1
    I4:1

Page 9: Data mining fp growth

For transaction: I2, I3

null
  I2:3
    I1:1
      I5:1
    I4:1
    I3:1

Page 10: Data mining fp growth

For transaction: I2, I1, I4

null
  I2:4
    I1:2
      I5:1
      I4:1
    I4:1
    I3:1

Page 11: Data mining fp growth

For transaction: I1, I3

null
  I2:4
    I1:2
      I5:1
      I4:1
    I4:1
    I3:1
  I1:1
    I3:1

Page 12: Data mining fp growth

For transaction: I2, I3

null
  I2:5
    I1:2
      I5:1
      I4:1
    I4:1
    I3:2
  I1:1
    I3:1

Page 13: Data mining fp growth

For transaction: I1, I3

null
  I2:5
    I1:2
      I5:1
      I4:1
    I4:1
    I3:2
  I1:2
    I3:2

Page 14: Data mining fp growth

For transaction: I2, I1, I3, I5

null
  I2:6
    I1:3
      I5:1
      I4:1
      I3:1
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

Page 15: Data mining fp growth

For transaction: I2, I1, I3

null
  I2:7
    I1:4
      I5:1
      I4:1
      I3:2
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

Almost Over!

Page 16: Data mining fp growth

Item  Support count
I2    7
I1    6
I3    6
I4    2
I5    2

null
  I2:7
    I1:4
      I5:1
      I4:1
      I3:2
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

To facilitate tree traversal, an item header table is built so that each item points to its occurrences in the tree via a chain of node-links.

FP-tree construction is over! Now we need to find the conditional pattern base and the conditional FP-tree for each item.
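The construction walked through on the preceding slides can be sketched as follows. This is a minimal illustration, not the presenters' implementation: `FPNode`, `build_fp_tree`, `chain_counts` and `render` are hypothetical names, and `ORDERED` holds the transactions already sorted by descending support count.

```python
class FPNode:
    """One tree node: an item, its count, and a node-link to the next same-item node."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode
        self.link = None    # next node in the chain for this item


def build_fp_tree(ordered_transactions):
    """Second database scan: insert each sorted transaction as a path from the root."""
    root = FPNode(None, None)
    header = {}  # item -> first node of its node-link chain
    tails = {}   # item -> last node of the chain, for cheap appends
    for transaction in ordered_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                if item in tails:
                    tails[item].link = child
                else:
                    header[item] = child
                tails[item] = child
            child.count += 1  # shared prefixes only increase counts
            node = child
    return root, header


def chain_counts(header, item):
    """Follow the node-links of one item and collect the node counts."""
    counts, node = [], header.get(item)
    while node is not None:
        counts.append(node.count)
        node = node.link
    return counts


def render(node, depth=0):
    """Return the tree as indented text lines."""
    label = "null" if node.item is None else f"{node.item}:{node.count}"
    lines = ["  " * depth + label]
    for child in node.children.values():
        lines.extend(render(child, depth + 1))
    return lines


ORDERED = [
    ["I2", "I1", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I2", "I1", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I2", "I1", "I3", "I5"], ["I2", "I1", "I3"],
]
root, header = build_fp_tree(ORDERED)
print("\n".join(render(root)))
```

Summing the counts along one node-link chain gives the item's support count, for example 2 + 2 + 2 = 6 for I3.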

Page 17: Data mining fp growth

null
  I2:7
    I1:4
      I5:1
      I4:1
      I3:2
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

Conditional pattern base for I5: {{I2, I1: 1}, {I2, I1, I3: 1}}

Conditional FP-tree for I5: {I2: 2, I1: 2}
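A conditional pattern base is the set of prefix paths that lead to an item, each with the count of that item's node. The sketch below obtains the same result without walking node-links: it takes the prefix of every sorted transaction that contains the item. This is a shortcut chosen for brevity, not the tree traversal itself, and `conditional_pattern_base` and `ORDERED` are hypothetical names.

```python
from collections import Counter

# Transactions sorted by descending support count, as inserted into the tree.
ORDERED = [
    ["I2", "I1", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I2", "I1", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I2", "I1", "I3", "I5"], ["I2", "I1", "I3"],
]


def conditional_pattern_base(ordered_transactions, item):
    """Map each non-empty prefix path that precedes `item` to its count."""
    base = Counter()
    for transaction in ordered_transactions:
        if item in transaction:
            prefix = tuple(transaction[:transaction.index(item)])
            if prefix:  # an item sitting directly under the root has no prefix
                base[prefix] += 1
    return dict(base)


for item in ["I5", "I4", "I3", "I1"]:
    print(item, conditional_pattern_base(ORDERED, item))
```

For I5 this yields the two prefix paths I2, I1 and I2, I1, I3, each with count 1.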

Page 18: Data mining fp growth

null
  I2:7
    I1:4
      I5:1
      I4:1
      I3:2
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

Conditional pattern base for I4: {{I2, I1: 1}, {I2: 1}}

Conditional FP-tree for I4: {I2: 2}

Page 19: Data mining fp growth

null
  I2:7
    I1:4
      I5:1
      I4:1
      I3:2
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

Conditional pattern base for I3: {{I2, I1: 2}, {I2: 2}, {I1: 2}}

Conditional FP-tree for I3: {I2: 4, I1: 2}, {I1: 2}
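A conditional FP-tree is built from a conditional pattern base in the same way as the main tree, after dropping items that are infrequent within that base. The sketch below assumes a minimum support count of 2, which the slides do not state. It keeps the original item order inside each path, where a full implementation might re-sort by conditional frequency. The name `conditional_fp_tree` and the nested-dict layout are hypothetical.

```python
from collections import Counter

MIN_SUPPORT = 2  # assumed threshold; not stated on the slides


def conditional_fp_tree(pattern_base, min_support):
    """Build a nested-dict tree {item: {"count": n, "children": {...}}}."""
    # Support of each item inside the pattern base.
    support = Counter()
    for path, count in pattern_base.items():
        for item in path:
            support[item] += count

    tree = {}
    for path, count in pattern_base.items():
        level = tree
        for item in path:
            if support[item] < min_support:
                continue  # infrequent in this base, so it is left out
            node = level.setdefault(item, {"count": 0, "children": {}})
            node["count"] += count
            level = node["children"]
    return tree


# Conditional pattern bases as listed on the slides.
base_i5 = {("I2", "I1"): 1, ("I2", "I1", "I3"): 1}
base_i3 = {("I2", "I1"): 2, ("I2",): 2, ("I1",): 2}

tree_i5 = conditional_fp_tree(base_i5, MIN_SUPPORT)
tree_i3 = conditional_fp_tree(base_i3, MIN_SUPPORT)
print(tree_i5)
print(tree_i3)
```

For I5, the item I3 occurs only once in the base and is dropped, leaving the single path I2:2, I1:2. For I3, the result has two branches, I2:4 with child I1:2 and a separate I1:2.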

Page 20: Data mining fp growth

null
  I2:7
    I1:4
      I5:1
      I4:1
      I3:2
        I5:1
    I4:1
    I3:2
  I1:2
    I3:2

Conditional pattern base for I1: {{I2: 4}}

Conditional FP-tree for I1: {I2: 4}

Page 21: Data mining fp growth

Frequent Patterns Generated

Item  Frequent patterns generated
I5    {I2, I5: 2}, {I1, I5: 2}, {I2, I1, I5: 2}
I4    {I2, I4: 2}
I3    {I2, I3: 4}, {I1, I3: 4}, {I2, I1, I3: 2}
I1    {I2, I1: 4}
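The patterns above can be reproduced with a compact recursive sketch of the pattern-growth idea. It recurses on conditional pattern bases held as lists of (path, count) pairs instead of materializing each conditional FP-tree, which is a simplification. The minimum support count of 2 is an assumption, and `fp_growth` and `ORDERED` are hypothetical names.

```python
from collections import Counter

MIN_SUPPORT = 2  # assumed threshold; not stated on the slides

# Transactions sorted by descending support count.
ORDERED = [
    ["I2", "I1", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I2", "I1", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I2", "I1", "I3", "I5"], ["I2", "I1", "I3"],
]


def fp_growth(pattern_base, min_support, suffix=()):
    """Return {frozenset(itemset): support} for every frequent itemset.

    pattern_base is a list of (ordered item tuple, count) pairs.
    """
    support = Counter()
    for path, count in pattern_base:
        for item in path:
            support[item] += count

    patterns = {}
    for item, item_support in support.items():
        if item_support < min_support:
            continue
        found = (item,) + suffix
        patterns[frozenset(found)] = item_support
        # Conditional pattern base of `item`: the prefixes that precede it.
        conditional = [
            (path[:path.index(item)], count)
            for path, count in pattern_base
            if item in path
        ]
        patterns.update(fp_growth(conditional, min_support, found))
    return patterns


patterns = fp_growth([(tuple(t), 1) for t in ORDERED], MIN_SUPPORT)
for itemset, count in sorted(patterns.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

Besides the multi-item patterns listed on the slide, the result also contains the five frequent single items with their support counts.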

Page 22: Data mining fp growth

Advantages of FP-Growth
– only 2 passes over the data set
– “compresses” the data set
– no candidate generation
– much faster than Apriori

Disadvantages of FP-Growth
– FP-tree may not fit in memory
– FP-tree is expensive to build

Discussion

[Figure: run time (sec.) versus support threshold (%) on data set D1, comparing FP-growth runtime with Apriori runtime]

Page 23: Data mining fp growth

Thank You