CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos

Dec 24, 2015

Page 1: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826: Multimedia Databases and Data Mining

Lecture #8: Fractals - introduction

C. Faloutsos

Page 2: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 2

Must-read Material

• Christos Faloutsos and Ibrahim Kamel, Beyond Uniformity and Independence: Analysis of R-trees Using the Concept of Fractal Dimension, Proc. ACM SIGACT-SIGMOD-SIGART PODS, May 1994, pp. 4-13, Minneapolis, MN.

Page 3: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 3

Recommended Material

optional, but very useful:

• Manfred Schroeder, Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise, W.H. Freeman and Company, 1991
– Chapter 10: boxcounting method
– Chapter 1: Sierpinski triangle

Page 4: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 4

Outline

Goal: ‘Find similar / interesting things’

• Intro to DB

• Indexing - similarity search

• Data Mining

Page 5: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 5

Indexing - Detailed outline

• primary key indexing

• secondary key / multi-key indexing

• spatial access methods
– z-ordering
– R-trees
– misc

• fractals
– intro
– applications

• text

Page 6: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 6

Intro to fractals - outline

• Motivation – 3 problems / case studies

• Definition of fractals and power laws

• Solutions to posed problems

• More examples and tools

• Discussion - putting fractals to work!

• Conclusions – practitioner’s guide

• Appendix: gory details - boxcounting plots

Page 7: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 7

Problem #1: GIS - points

Road end-points of Montgomery county:

• Q1: how many disk accesses (d.a.) for an R-tree?

• Q2: distribution of the points?

• not uniform

• not Gaussian

• no rules??

Page 8: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 8

Problem #2 - spatial data mining (d.m.)

Galaxies (Sloan Digital Sky Survey w/ B. Nichol)

- ‘spiral’ (‘spi’) and ‘elliptical’ (‘ell’) galaxies

(stores and households ...)

- patterns?

- attraction/repulsion?

- how many ‘spi’ within distance r from an ‘ell’?

Page 9: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 9

Problem #3: traffic

• disk trace (from HP - J. Wilkes); Web traffic - fit a model

• how many explosions to expect?

• queue length distr.?

(plot: # bytes vs. time)

Page 10: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 10

Problem #3: traffic

time

# bytes

Poisson indep., ident. distr

Page 12: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 12

Problem #3: traffic

(plot: # bytes vs. time)

Poisson: independent, identically distributed

Q: Then, how to generate such bursty traffic?

Page 13: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 13

Common answer:

• Fractals / self-similarities / power laws

• Seminal works from Hilbert, Minkowski, Cantor, Mandelbrot, (Hausdorff, Lyapunov, Ken Wilson, …)

Page 14: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 14

Road map

• Motivation – 3 problems / case studies

• Definition of fractals and power laws

• Solutions to posed problems

• More examples and tools

• Discussion - putting fractals to work!

• Conclusions – practitioner’s guide

• Appendix: gory details - boxcounting plots

Page 15: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 15

What is a fractal?

= self-similar point set, e.g., Sierpinski triangle:

...zero area;

infinite length!

Dimensionality??

Page 16: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 16

Definitions (cont’d)

• Paradox: Infinite perimeter ; Zero area!

• ‘dimensionality’: between 1 and 2

• actually: Log(3)/Log(2) = 1.58...

Page 17: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 17

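Definition of fractal dimension (fd):

ONLY for a perfectly self-similar point set:

fd = log(n)/log(f) = log(3)/log(2) = 1.58

(n = number of self-similar pieces, f = factor by which each piece is scaled down; for the Sierpinski triangle, n = 3 and f = 2)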

...zero area;

infinite length!
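
The formula and the 'infinite perimeter, zero area' paradox can be checked numerically. The following is only a minimal sketch: the function names are made up for illustration, and it assumes a unit-side equilateral triangle as the starting shape.

```python
import math

def similarity_dimension(n_pieces, scale_factor):
    """log(n)/log(f): n self-similar copies, each shrunk by a factor f."""
    return math.log(n_pieces) / math.log(scale_factor)

def sierpinski_after(k, side=1.0):
    """Total perimeter and total area of the small triangles left after k subdivisions."""
    count = 3 ** k                      # each step keeps 3 half-size copies
    s = side / 2 ** k                   # side of each small triangle
    perimeter = count * 3 * s           # grows like (3/2)^k
    area = count * (math.sqrt(3) / 4) * s * s   # shrinks like (3/4)^k
    return perimeter, area

print(round(similarity_dimension(3, 2), 3))   # Sierpinski triangle: 3 copies, factor 2
print(similarity_dimension(2, 2))             # line segment: 2 copies, factor 2
for k in (0, 5, 10, 20):
    print(k, sierpinski_after(k))
```

As k grows, the perimeter keeps increasing while the area keeps shrinking, which is why neither length (dimension 1) nor area (dimension 2) is a useful measure for this set.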

Page 18: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 18

Intrinsic (‘fractal’) dimension

• Q: fractal dimension of a line?

• A: 1 (= log(2)/log(2)!)

Page 20: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 20

Intrinsic (‘fractal’) dimension

• Q: dfn for a given set of points?

y   x
4   2
3   3
2   4
1   5

Page 21: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 21

Intrinsic (‘fractal’) dimension

• Q: fractal dimension of a line?

• A: nn ( <= r ) ~ r^1 (‘power law’: y = x^a), where nn ( <= r ) is the number of neighbors within distance r

• Q: fd of a plane?

• A: nn ( <= r ) ~ r^2

fd == slope of ( log(nn) vs log(r) )
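
A small sketch of the 'slope of log(nn) vs log(r)' idea, under assumptions that are not in the slides: the line is a diagonal segment sampled at regular steps, the plane is a regular grid, and neighbors are counted around the single center point (0, 0). All names are illustrative.

```python
import math

def neighbors_within(points, center, r):
    """Number of points at Euclidean distance <= r from center."""
    cx, cy = center
    return sum(1 for (x, y) in points if math.hypot(x - cx, y - cy) <= r)

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# A diagonal line and a filled square, both embedded in 2-d.
line = [(i / 1000.0, i / 1000.0) for i in range(-1000, 1001)]
plane = [(i / 50.0, j / 50.0) for i in range(-50, 51) for j in range(-50, 51)]

radii = [0.1, 0.2, 0.4, 0.8]
center = (0.0, 0.0)
slope_line = loglog_slope(radii, [neighbors_within(line, center, r) for r in radii])
slope_plane = loglog_slope(radii, [neighbors_within(plane, center, r) for r in radii])
print(round(slope_line, 2), round(slope_plane, 2))   # close to 1 and close to 2
```

Both sets live in a 2-d space, yet the fitted exponent follows the intrinsic dimension of the set rather than that of the embedding space.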

Page 22: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014)

Intrinsic (‘fractal’) dimension

• Local fractal dimension of point ‘P’?

• A: nnP ( <= r ) ~ r^1

• If this equation holds for several values of r,

• Then the exponent is the local fractal dimension of point P:

• Local fd = exponent = 1

P

EXPLANATIONS

22

Page 23: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014)

Intrinsic (‘fractal’) dimension

• Local fractal dimension of point ‘P’?

• A: nnP ( <= r ) ~ r^1

• If this is true for all points of the cloud

• Then the exponent is the global f.d.

• Or simply the f.d.

P

EXPLANATIONS

23

Page 24: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014)

Intrinsic (‘fractal’) dimension

• Global fractal dimension?

• A: if sum over all P of [ nnP ( <= r ) ] ~ r^1

• Then: exponent = global f.d.

• If this is true for all points of the cloud

• Then the exponent is the global f.d.

• Or simply the f.d.

A

EXPLANATIONS

24

Page 25: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 25

Intrinsic (‘fractal’) dimension

• Algorithm, to estimate it?

Notice:

• sum over all P of [ nnP ( <= r ) ] is exactly the total # of pairs within distance <= r, including ‘mirror’ pairs (each pair is counted once from each of its two points)
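
One possible way to turn this observation into an estimator is sketched below. It is a brute-force O(N^2) pair count, not the lecture's own code: the test data come from a 'chaos game' Sierpinski generator, and the radii, sample size and seed are arbitrary choices. Unordered pairs are counted, so 'mirror' pairs are left out, which halves every count but does not change the slope.

```python
import bisect
import math
import random

def sierpinski_points(n, seed=0):
    """Chaos game: repeatedly jump halfway towards a randomly chosen corner."""
    rng = random.Random(seed)
    corners = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
    x, y = corners[0]
    points = []
    for _ in range(n):
        cx, cy = rng.choice(corners)
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points

def pair_counts(points, radii):
    """Number of unordered pairs within distance <= r, for each r (O(N^2))."""
    sq_dists = sorted(
        (px - qx) ** 2 + (py - qy) ** 2
        for i, (px, py) in enumerate(points)
        for (qx, qy) in points[i + 1:]
    )
    return [bisect.bisect_right(sq_dists, r * r) for r in radii]

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return num / sum((a - mx) ** 2 for a in lx)

radii = [0.01, 0.02, 0.04, 0.08, 0.16]
counts = pair_counts(sierpinski_points(1500), radii)
fd_estimate = loglog_slope(radii, counts)
print(counts, round(fd_estimate, 2))   # slope should land near log(3)/log(2)
```

The quadratic cost is the reason a faster method such as box-counting is attractive for large point sets.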

EXPLANATIONS

Page 26: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 26

Sierpinski triangle

(plot: log(#pairs within <=r ) vs. log( r ), == ‘correlation integral’; slope: 1.58)

Page 27: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 27

Observations:

• Euclidean objects have integer fractal dimensions
– point: 0
– lines and smooth curves: 1
– smooth surfaces: 2

• fractal dimension -> roughness of the periphery

Page 28: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 28

Important properties

• fd = embedding dimension -> uniform pointset

• a point set may have several fd, depending on scale

Page 29: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 29

Important properties

• fd = embedding dimension -> uniform pointset

• a point set may have several fd, depending on scale

2-d

Page 30: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 30

Important properties

• fd = embedding dimension -> uniform pointset

• a point set may have several fd, depending on scale

1-d

Page 31: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 31

Important properties

0-d

Page 32: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 32

Road map

• Motivation – 3 problems / case studies

• Definition of fractals and power laws

• Solutions to posed problems

• More examples and tools

• Discussion - putting fractals to work!

• Conclusions – practitioner’s guide

• Appendix: gory details - boxcounting plots

Page 33: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 33

Cross-roads of Montgomery county:

•any rules?

Problem #1: GIS points

Page 34: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 34

Solution #1

A: self-similarity ->

• <=> fractals

• <=> scale-free

• <=> power-laws (y=x^a, F=C*r^(-2))

• avg#neighbors(<= r ) ~ r^D

(plot: log(#pairs(within <= r)) vs. log( r ); slope: 1.51)

Page 35: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 35

Solution #1

A: self-similarity

• avg#neighbors(<= r ) ~ r^(1.51)

(plot: log(#pairs(within <= r)) vs. log( r ); slope: 1.51)

Page 36: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 36

Examples: MG county

• Montgomery County of MD (road end-points)

Page 37: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 37

Examples: LB county

• Long Beach county of CA (road end-points)

Page 38: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 38

Solution#2: spatial d.m.

Galaxies (‘BOPS’ plot - [sigmod2000])

log(#pairs)

log(r)

Page 39: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 39

Solution#2: spatial d.m.

(plot: log(#pairs within <=r ) vs. log(r); curves: spi-spi, spi-ell, ell-ell)

- 1.8 slope

- plateau!

- repulsion!

Page 40: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 40

Spatial d.m.

log(r)

log(#pairs within <=r )

spi-spi

spi-ell

ell-ell

- 1.8 slope

- plateau!

- repulsion!

Page 41: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 41

Spatial d.m.

(figure labels: r1, r2)

Heuristic on choosing # of clusters

Page 43: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 43

Spatial d.m.

log(r)

log(#pairs within <=r )

spi-spi

spi-ell

ell-ell

- 1.8 slope

- plateau!

-repulsion!!

-duplicates

Page 44: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 44

Solution #3: traffic

• disk traces: self-similar:

time

#bytes

Page 45: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 45

Solution #3: traffic

• disk traces (80-20 ‘law’ = ‘multifractal’)

time

#bytes

20% 80%

Page 46: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 46

80-20 / multifractals

(figure: 20 / 80 split)

Page 47: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 47

80-20 / multifractals

• p ; (1-p) in general

• yes, there are dependencies

(figure: 20 / 80 split)
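
A rough sketch of how such an 80-20 trace could be generated, assuming the split is applied recursively to each half-interval and that the 'heavy' side is picked at random (both are assumptions made for illustration; `b_model` and its parameters are hypothetical names).

```python
import random

def b_model(total, levels, p=0.8, seed=0):
    """Recursively split `total`: one half of every interval gets fraction p, the other 1-p."""
    rng = random.Random(seed)
    series = [float(total)]
    for _ in range(levels):
        nxt = []
        for v in series:
            heavy, light = p * v, (1 - p) * v
            # Assumption: which half is the heavy one is a coin flip.
            nxt.extend((heavy, light) if rng.random() < 0.5 else (light, heavy))
        series = nxt
    return series

traffic = b_model(total=1_000_000, levels=10)   # 1024 time ticks
print(len(traffic), round(max(traffic)), round(min(traffic), 3))
```

With p = 0.5 every tick receives the same amount, so the burstiness comes entirely from the bias p; the total number of bytes is preserved at every level.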

Page 48: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 48

More on 80/20: PQRS

• Part of ‘self-* storage’ project [Wang+’02]

time

cylinder#

Page 49: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 49

More on 80/20: PQRS

• Part of ‘self-* storage’ project [Wang+’02]

(figure: quadrants labeled p, q, r, s)

Page 50: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 50

Solution#3: traffic

Clarification:

• fractal: a set of points that is self-similar

• multifractal: a probability density function that is self-similar

Many other time-sequences are bursty/clustered: (such as?)

Page 51: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 51

Example:

• network traffic

http://repository.cs.vt.edu/lbl-conn-7.tar.Z

Page 52: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 52

Web traffic

• [Crovella Bestavros, SIGMETRICS’96]

(plots at time scales of 1000 sec; 100 sec; 10 sec; 1 sec)

Page 53: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 53

Tape accesses

time

Tape#1 Tape# N

# tapes needed, to retrieve n records?

(# days down, due to failures / hurricanes / communication noise...)

Page 54: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 54

Tape accesses

(figure: accesses over time, Tape#1 ... Tape# N)

(plot axes: # tapes retrieved, # qual. records; curves: 50-50 = Poisson, real)

Page 55: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 55

Road map

• Motivation – 3 problems / case studies

• Definition of fractals and power laws

• Solutions to posed problems

• More tools and examples

• Discussion - putting fractals to work!

• Conclusions – practitioner’s guide

• Appendix: gory details - boxcounting plots

Page 56: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 56

A counter-intuitive example

• avg degree is, say 3.3

• pick a node at random
– guess its degree, exactly (-> “mode”)

degree

count

avg: 3.3

?

Page 57: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 57

A counter-intuitive example

• avg degree is, say 3.3

• pick a node at random
– guess its degree, exactly (-> “mode”)

• A: 1!!

degree

count

avg: 3.3

Page 58: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 58

A counter-intuitive example

• avg degree is, say 3.3

• pick a node at random
– what is the degree you expect it to have?

• A: 1!!

• A’: very skewed distr.

• Corollary: the mean is meaningless!

• (and std -> infinity (!))
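
To make the corollary concrete, here is a hedged numerical sketch. It assumes a degree distribution with count(d) proportional to d^(-2), truncated at a maximum degree; the exponent and the cut-offs are illustrative choices, not values from the lecture.

```python
import math

def degree_stats(max_degree, exponent=2.0):
    """Mean, std and mode of a truncated power-law degree distribution."""
    degrees = range(1, max_degree + 1)
    weights = [d ** -exponent for d in degrees]      # relative count of each degree
    total = sum(weights)
    mean = sum(d * w for d, w in zip(degrees, weights)) / total
    second_moment = sum(d * d * w for d, w in zip(degrees, weights)) / total
    std = math.sqrt(second_moment - mean ** 2)
    mode = max(degrees, key=lambda d: weights[d - 1])  # most frequent degree
    return mean, std, mode

for cutoff in (100, 10_000):
    mean, std, mode = degree_stats(cutoff)
    print(cutoff, round(mean, 2), round(std, 2), mode)
```

Under these assumptions the mode stays at 1 no matter how large the cut-off is, while the standard deviation keeps growing with it, so the average says little about a typical node.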

degree

count

avg: 3.3

Page 59: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 59

Rank exponent R

• Power law in the degree distribution [SIGCOMM99]

internet domains

log(rank)

log(degree)

-0.82

att.com

ibm.com

Page 60: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 60

More tools

• Zipf’s law

• Korcak’s law / “fat fractals”

Page 61: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 61

A famous power law: Zipf’s law

• Q: vocabulary word frequency in a document - any pattern?

aaron zoo

freq.

Page 62: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 62

A famous power law: Zipf’s law

• Bible - rank vs frequency (log-log)

log(rank)

log(freq)

“a”

“the”

Page 63: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 63

A famous power law: Zipf’s law

• Bible - rank vs frequency (log-log)

• similarly, in many other languages; for customers and sales volume; city populations etc etc

(plot: log(freq) vs. log(rank))

Page 64: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 64

A famous power law: Zipf’s law

•Zipf distr:

freq = 1/ rank

•generalized Zipf:

freq = 1 / (rank)^a

log(rank)

log(freq)
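
A minimal sketch of a rank/frequency plot computation. The corpus is synthetic: word number k is repeated about 1000/k times, so the fitted exponent is known in advance. The helper names and the corpus are assumptions for illustration only.

```python
from collections import Counter
import math

def rank_frequency(words):
    """(rank, frequency) pairs, with the most frequent word at rank 1."""
    freqs = sorted(Counter(words).values(), reverse=True)
    return list(enumerate(freqs, start=1))

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return num / sum((a - mx) ** 2 for a in lx)

# Synthetic 'document': the word of rank k appears about 1000/k times.
corpus = [f"w{rank}" for rank in range(1, 51) for _ in range(round(1000 / rank))]
pairs = rank_frequency(corpus)
slope = loglog_slope([r for r, _ in pairs], [f for _, f in pairs])
print(pairs[:3], round(slope, 2))   # slope near -1, i.e. a = 1 in freq = 1/(rank)^a
```

On a real document the same two functions apply unchanged; only the list of words differs, and the fitted slope gives the exponent a of the generalized Zipf law.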

Page 65: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 65

Olympic medals (Sydney):

(plot: log(#medals) vs. rank; linear fit: y = -0.9676x + 2.3054, R^2 = 0.9458)

Page 66: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 66

Olympic medals (Sydney’00, Athens’04):

(plot: log(#medals) vs. log( rank); series: Athens, Sydney)

Page 67: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 67

TELCO data

(plot axes: # of service units, count of customers; ‘best customer’ marked)

Page 68: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 68

SALES data – store#96

# units sold

count of products

“aspirin”

Page 69: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 69

More power laws: areas – Korcak’s law

Scandinavian lakes

Any pattern?

Page 70: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 70

More power laws: areas – Korcak’s law

Scandinavian lakes area vs complementary cumulative count (log-log axes)

log(count( >= area))

log(area)
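
The complementary cumulative count behind such a plot can be sketched as follows. The areas are synthetic, generated so that count( >= area) follows a power law with a known exponent B = 0.7; that value and the function names are illustrative assumptions, not measurements of real lakes.

```python
import math

def ccdf(values):
    """Sorted (value, number of items >= value) pairs."""
    ordered = sorted(values, reverse=True)
    counts = {}
    for k, v in enumerate(ordered, start=1):
        counts[v] = k            # with ties, the last (largest) k wins
    return sorted(counts.items())

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return num / sum((a - mx) ** 2 for a in lx)

B = 0.7                                                    # assumed exponent
N = 1000
areas = [(i / N) ** (-1.0 / B) for i in range(1, N + 1)]  # synthetic 'lake' areas
points = ccdf(areas)
slope = loglog_slope([a for a, _ in points], [c for _, c in points])
print(round(slope, 3))   # recovers -B on this synthetic data
```

On measured areas the points would only approximately fall on a line; the fitted slope is then an estimate of the exponent.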

Page 71: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 71

More power laws: Korcak

Japan islands;

area vs cumulative count (log-log axes) log(area)

log(count( >= area))

Page 72: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 72

(Korcak’s law: Aegean islands)

Page 73: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 73

Korcak’s law & “fat fractals”

How to generate such regions?

Page 74: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 74

Korcak’s law & “fat fractals”

Q: How to generate such regions?

A: recursively, from a single region

...

Page 75: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 75

so far we’ve seen:

• concepts:
– fractals, multifractals and fat fractals

• tools:
– correlation integral (= pair-count plot)
– rank/frequency plot (Zipf’s law)
– CCDF (Korcak’s law)

Page 76: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

15-826 Copyright: C. Faloutsos (2014) 76

so far we’ve seen:

• concepts:
– fractals, multifractals and fat fractals

• tools:
– correlation integral (= pair-count plot)
– rank/frequency plot (Zipf’s law)
– CCDF (Korcak’s law)

(annotation: same info)

Page 77: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #8: Fractals - introduction C. Faloutsos.

CMU SCS

Next:

• More examples / applications

• Practitioner’s guide

• Box-counting: fast estimation of correlation integral

15-826 Copyright: C. Faloutsos (2014) 77