Spatial and Temporal Data Mining V. Megalooikonomou Preliminaries (some slides are based on notes from “Searching multimedia databases by content” by C. Faloutsos and notes from Anne Mascarin)

Dec 17, 2015


Megan Charles

Spatial and Temporal Data Mining

V. Megalooikonomou

Preliminaries

(some slides are based on notes from “Searching multimedia databases by content” by C. Faloutsos and notes from Anne Mascarin)


General Overview

• Fourier analysis
• Discrete Cosine Transform (DCT)
• Wavelets
• Karhunen-Loeve
• Singular Value Decomposition


Fourier Analysis

Fourier’s Theorem: Every continuous function can be considered as a sum of sinusoidal functions

Discrete case: the n-point Discrete Fourier Transform of a signal $\vec{x} = [x_i],\ i = 0, \ldots, n-1$ is defined to be a sequence $\vec{X}$ of n complex numbers $X_f,\ f = 0, \ldots, n-1$, given by

$$X_f = \frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} x_i \exp(-j 2\pi f i / n), \quad f = 0, 1, \ldots, n-1$$

where j is the imaginary unit ($j = \sqrt{-1}$)

We denote a DFT pair as $\vec{x} \Leftrightarrow \vec{X}$


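Fourier Analysis

The signal $\vec{x}$ can be recovered by the inverse transform:

$$x_i = \frac{1}{\sqrt{n}} \sum_{f=0}^{n-1} X_f \exp(+j 2\pi f i / n), \quad i = 0, 1, \ldots, n-1$$

$X_f$ is a complex number, with the exception of $X_0$ (and $X_{n/2}$ when n is even), which is real if the signal $\vec{x}$ is real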
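The forward and inverse transforms can be checked numerically. The sketch below is only an illustration, assuming the 1/sqrt(n) scaling on both transforms; the function names `dft` and `inverse_dft` and the sample signal are made up for this example.

```python
import numpy as np

def dft(x):
    """n-point DFT with 1/sqrt(n) scaling: X_f = sum_i x_i exp(-j 2 pi f i / n) / sqrt(n)."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    i = np.arange(n)
    f = i.reshape(-1, 1)                      # one row per frequency f
    return (x * np.exp(-2j * np.pi * f * i / n)).sum(axis=1) / np.sqrt(n)

def inverse_dft(X):
    """Inverse transform: x_i = sum_f X_f exp(+j 2 pi f i / n) / sqrt(n)."""
    X = np.asarray(X, dtype=complex)
    n = len(X)
    f = np.arange(n)
    i = f.reshape(-1, 1)                      # one row per time index i
    return (X * np.exp(2j * np.pi * i * f / n)).sum(axis=1) / np.sqrt(n)

x = np.array([1.0, 2.0, 4.0, 3.0, 0.0, -1.0])   # arbitrary real signal
X = dft(x)
x_back = inverse_dft(X)                          # should reproduce x
```

With this scaling the result should agree with `np.fft.fft(x, norm="ortho")`, and `X[0]` has zero imaginary part because the signal is real.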


Fourier Analysis

Main idea of DFT: decompose a signal into sine and cosine functions of several frequencies, multiples of the basic frequency 1/n

DFT as a matrix operation: $\vec{X} = A \vec{x}$, where $A = [a_{i,f}]$ is an n x n matrix with

$$a_{i,f} = \frac{1}{\sqrt{n}} \exp(-j 2\pi f i / n), \quad i, f = 0, \ldots, n-1$$


Fourier Analysis

The matrix A is column-orthonormal, i.e., its column vectors are unit vectors, mutually orthogonal (also row-orthonormal since it is a square matrix):

$$A A^{*} = A^{*} A = I$$

where I is the (n x n) identity matrix and A* is the conjugate-transpose (‘hermitian’) of A, that is, $A^{*} = [a^{*}_{f,i}]$

DFT corresponds to a matrix multiplication with A and, since A is orthonormal, the matrix A performs a rotation (no scaling) of the vector x in n-d complex space. As a rotation, it does not affect the length of the original vector nor the Euclidean distance between any pair of points.
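The rotation claim can be checked with a small numerical sketch. It assumes the matrix entries exp(-j 2 pi f i / n) / sqrt(n); the size n = 8 and the random test vectors are arbitrary choices for illustration.

```python
import numpy as np

n = 8
idx = np.arange(n)
# DFT matrix with entries exp(-j 2 pi f i / n) / sqrt(n); it is symmetric in (i, f)
A = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)
A_star = A.conj().T                           # conjugate transpose ("hermitian")

rng = np.random.default_rng(0)
x = rng.normal(size=n)
y = rng.normal(size=n)
X, Y = A @ x, A @ y                           # DFT as a matrix multiplication

# A A* = A* A = I
is_unitary = bool(np.allclose(A @ A_star, np.eye(n)) and np.allclose(A_star @ A, np.eye(n)))
# Lengths and pairwise Euclidean distances are unchanged by the transform
length_kept = bool(np.isclose(np.linalg.norm(X), np.linalg.norm(x)))
distance_kept = bool(np.isclose(np.linalg.norm(X - Y), np.linalg.norm(x - y)))
```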


Properties of DFT

Parseval’s Theorem: Let $\vec{X}$ be the Discrete Fourier Transform of the sequence $\vec{x}$. Then we have

$$\sum_{i=0}^{n-1} |x_i|^2 = \sum_{f=0}^{n-1} |X_f|^2$$

The DFT also preserves the Euclidean distance (proof?)

Any transformation that corresponds to an orthonormal matrix A also enjoys a theorem similar to Parseval’s theorem for the DFT. Examples: DCT, DWT
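One possible answer to the "(proof?)" prompt, sketched under the assumption that the DFT is linear and that Parseval's theorem holds for every sequence. The second sequence $\vec{y}$ with transform $\vec{Y}$ and the difference $\vec{z}$ are names introduced only for this derivation.

$$\vec{z} = \vec{x} - \vec{y} \;\Leftrightarrow\; \vec{Z} = \vec{X} - \vec{Y} \quad \text{(linearity of the DFT)}$$

$$\sum_{i=0}^{n-1} |x_i - y_i|^2 = \sum_{i=0}^{n-1} |z_i|^2 = \sum_{f=0}^{n-1} |Z_f|^2 = \sum_{f=0}^{n-1} |X_f - Y_f|^2 \quad \text{(Parseval applied to } \vec{z}\text{)}$$

Taking square roots on both sides gives equal Euclidean distances in the time and frequency domains.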


Properties of DFT

A shift in the time domain changes only the phase of the DFT coefficients, but not the amplitude:

$$[x_{i - i_0}] \Leftrightarrow [X_f \exp(-j 2\pi f i_0 / n)]$$

For real signals we have

$$X_f = X^{*}_{n-f} \quad \text{for } f = 1, 2, \ldots, n-1$$

so we only need to plot the amplitudes up to the middle, q, if n=2q+1 or q+1 if the duration is n=2q

The resulting plot of $|X_f|$ vs f is called the amplitude spectrum (or spectrum) of the given time sequence; its square is the energy spectrum (or power spectrum)

The DFT can be computed in O(n log n) time. Straightforward computation requires O(n^2) time; however, the Fast Fourier Transform (FFT) exploits regularities of the function $e^{-j 2\pi f i / n}$, achieving O(n log n)
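A short numerical sketch of the shift and symmetry properties, assuming a circular shift (the sequence wraps around) and the minus-sign convention in the forward transform; the signal length and the shift `i0` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, i0 = 16, 3
x = rng.normal(size=n)                        # arbitrary real signal
X = np.fft.fft(x, norm="ortho")

# Circular shift: x_shifted[i] = x[(i - i0) mod n]
x_shifted = np.roll(x, i0)
X_shifted = np.fft.fft(x_shifted, norm="ortho")

f = np.arange(n)
phase = np.exp(-2j * np.pi * f * i0 / n)      # phase factor introduced by the shift

same_amplitude = bool(np.allclose(np.abs(X_shifted), np.abs(X)))
phase_only = bool(np.allclose(X_shifted, X * phase))

# Real signal: X_f equals the complex conjugate of X_{n-f} for f = 1..n-1
conj_symmetric = bool(np.allclose(X[1:], np.conj(X[1:][::-1])))

# Because of that symmetry, half of the amplitudes carry all the information
amplitude_spectrum = np.abs(X[: n // 2 + 1])
energy_spectrum = amplitude_spectrum ** 2
```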


Examples


Discrete Cosine Transform (DCT)

Objective: to concentrate the energy into as few coefficients as possible

DFT is helpful to highlight periodicities in the signal through its amplitude spectrum

When successive values are correlated DCT is better than DFT

DCT avoids the ‘frequency leak’ that DFT has when the signal has a ‘trend’

DCT’s coefficients are always real (as opposed to complex)

DCT reflects the original sequence in the time axis around the last point and takes the DFT of the twice-as-long (symmetric) sequence -> all the coefficients are real and their amplitude is symmetric about the middle ($X_f = X_{2n-f}$), thus only the first n need to be kept
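The energy-concentration claim can be illustrated on a signal that is nothing but a 'trend'. This is a sketch under assumptions: the ramp signal, the choice k = 3 and the helper `top_k_fraction` are invented for the example, and the DCT is written here as an orthonormal matrix so that its energy is directly comparable with the orthonormal DFT.

```python
import numpy as np

n = 64
x = np.arange(n, dtype=float)                 # a pure linear trend

# Energy per coefficient under the orthonormal DFT
dft_energy = np.abs(np.fft.fft(x, norm="ortho")) ** 2

# Orthonormal DCT-II matrix (assumed scaling): rows are frequencies f, columns are times i
f = np.arange(n).reshape(-1, 1)
i = np.arange(n)
C = np.sqrt(2.0 / n) * np.cos(np.pi * f * (i + 0.5) / n)
C[0, :] = 1.0 / np.sqrt(n)                    # the f = 0 row needs the smaller scale
dct_energy = (C @ x) ** 2

def top_k_fraction(energy, k):
    """Fraction of the total energy held by the k largest coefficients."""
    return np.sort(energy)[::-1][:k].sum() / energy.sum()

k = 3
dft_fraction = top_k_fraction(dft_energy, k)
dct_fraction = top_k_fraction(dct_energy, k)  # expected to be the larger of the two
```

Both transforms keep the same total energy; on this ramp the three largest DCT coefficients hold a larger share of it than the three largest DFT coefficients.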


Discrete Cosine Transform (DCT)

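The formulas for DCT:

$$X_f = \frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} x_i \cos\frac{\pi f (i + 0.5)}{n}, \quad f = 0, \ldots, n-1$$

For the inverse DCT:

$$x_i = \frac{1}{\sqrt{n}} X_0 + \frac{2}{\sqrt{n}} \sum_{f=1}^{n-1} X_f \cos\frac{\pi f (i + 0.5)}{n}, \quad i = 0, \ldots, n-1$$

The complexity of DCT is also O(n log n)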
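A sketch of the DCT pair, assuming the scaling 1/sqrt(n) in the forward transform and X_0/sqrt(n) plus 2/sqrt(n) times the remaining terms in the inverse. The function names are hypothetical. The third function illustrates the mirror construction: reflect the sequence, take an ordinary DFT of the length-2n result, and apply a half-sample phase correction.

```python
import numpy as np

def dct_slides(x):
    """X_f = (1/sqrt(n)) * sum_i x_i cos(pi f (i + 0.5) / n)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    f = np.arange(n).reshape(-1, 1)
    i = np.arange(n)
    return (np.cos(np.pi * f * (i + 0.5) / n) @ x) / np.sqrt(n)

def inverse_dct_slides(X):
    """x_i = X_0/sqrt(n) + (2/sqrt(n)) * sum_{f>=1} X_f cos(pi f (i + 0.5) / n)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    i = np.arange(n).reshape(-1, 1)
    f = np.arange(1, n)
    return X[0] / np.sqrt(n) + (2.0 / np.sqrt(n)) * (np.cos(np.pi * f * (i + 0.5) / n) @ X[1:])

def dct_via_mirrored_dft(x):
    """Same coefficients obtained from a DFT of the reflected, twice-as-long sequence."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mirrored = np.concatenate([x, x[::-1]])   # symmetric sequence of length 2n
    Y = np.fft.fft(mirrored)                  # unscaled DFT
    f = np.arange(n)
    # The half-sample phase correction makes the first n coefficients purely real
    return (0.5 * np.exp(-1j * np.pi * f / (2 * n)) * Y[:n]).real / np.sqrt(n)

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
X = dct_slides(x)
x_back = inverse_dct_slides(X)
X_mirror = dct_via_mirrored_dft(x)
```

Note that this direct evaluation costs O(n^2); the mirrored-DFT route is what allows an FFT-based O(n log n) computation.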


m-Dimensional DFT/DCT (JPEG)

m=2: gray scale images

m=3: MRI brain volumes

We do the transformation along each dimension (DFT on each row, then DFT on each column)

For an n1 x n2 array $[x_{i_1,i_2}]$:

$$X_{f_1,f_2} = \frac{1}{\sqrt{n_1}} \frac{1}{\sqrt{n_2}} \sum_{i_1=0}^{n_1-1} \sum_{i_2=0}^{n_2-1} x_{i_1,i_2} \exp(-2\pi j i_1 f_1 / n_1) \exp(-2\pi j i_2 f_2 / n_2)$$

where $x_{i_1,i_2}$ is the value at position (i1,i2) of the array and f1, f2 are the spatial frequencies, ranging from 0 to (n1-1) and from 0 to (n2-1), respectively

The 2-d DCT is used in the JPEG standard for image compression (and in the MPEG standards for video compression)
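The "one dimension at a time" recipe can be sketched as follows. The 4 x 6 random array stands in for a gray-scale image and is purely hypothetical; the frequency pair (1, 2) used for the explicit double sum is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
image = rng.normal(size=(4, 6))               # hypothetical n1 x n2 gray-scale array
n1, n2 = image.shape

# 1-d DFT on each row, then 1-d DFT on each column
rows_done = np.fft.fft(image, axis=1, norm="ortho")
both_done = np.fft.fft(rows_done, axis=0, norm="ortho")

# NumPy's 2-d transform with the same 1/sqrt(n1 n2) scaling
direct = np.fft.fft2(image, norm="ortho")

# Explicit double sum for one frequency pair (f1, f2) = (1, 2)
i1 = np.arange(n1).reshape(-1, 1)
i2 = np.arange(n2)
X_12 = (image
        * np.exp(-2j * np.pi * i1 * 1 / n1)
        * np.exp(-2j * np.pi * i2 * 2 / n2)).sum() / np.sqrt(n1 * n2)
```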


Wavelets

The wavelet transform is believed to avoid the ‘frequency leak’ problem of the DFT even better than the DCT

Short Window Fourier Transform (SWFT): restricted frequency leak

In the time domain each value gives full information about that instant (no info about f)

The DFT’s coefficients give full info about a given f, but all frequencies are needed to recover the value at a given instant in time

SWFT is in between

SWFT: how to choose the width w of the window?

Discrete Wavelet Transform: let w be variable
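A minimal sketch of a short-window transform, assuming consecutive non-overlapping rectangular windows (a simplification; practical versions usually overlap and taper the windows). The function name, the test signal and the width w = 64 are invented for the example.

```python
import numpy as np

def short_window_dft(x, w):
    """Cut x into consecutive windows of width w and take the DFT of each window."""
    x = np.asarray(x, dtype=float)
    n_windows = len(x) // w
    frames = x[: n_windows * w].reshape(n_windows, w)
    return np.fft.fft(frames, axis=1, norm="ortho")

n = 256
t = np.arange(n)
# The frequency changes halfway: 8 cycles per 64 samples, then 16 cycles per 64 samples
x = np.where(t < n // 2,
             np.sin(2 * np.pi * 8 * t / 64),
             np.sin(2 * np.pi * 16 * t / 64))

w = 64
S = short_window_dft(x, w)
# Strongest frequency bin in each window (only the first half of the bins is needed)
dominant = np.abs(S[:, : w // 2]).argmax(axis=1)
```

Here `dominant` changes from bin 8 to bin 16 between the second and third window, which a single DFT over all 256 samples cannot localize in time. A smaller w gives sharper timing but fewer distinguishable frequencies, which is the trade-off behind the question of how to choose w.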


Continuous Wavelet transform

for each Scale
    for each Position
        Coefficient (S,P) = sum over all time of Signal x Wavelet (S,P)
    end
end
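The double loop can be written out as a rough Python sketch. Several things here are assumptions rather than part of the slides: the Mexican hat is used as the wavelet (written up to a constant factor), each scaled wavelet is divided by sqrt(scale), and the integral over all time is approximated by a plain sum over samples.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat wavelet, up to a constant factor."""
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt(signal, scales):
    """Coefficient(S, P) = sum over all time of signal * wavelet(S, P)."""
    signal = np.asarray(signal, dtype=float)
    t = np.arange(len(signal))
    coeffs = np.zeros((len(scales), len(signal)))
    for s_idx, scale in enumerate(scales):            # for each Scale
        for position in range(len(signal)):           # for each Position
            wavelet = mexican_hat((t - position) / scale) / np.sqrt(scale)
            coeffs[s_idx, position] = np.sum(signal * wavelet)
    return coeffs

scales = [1, 2, 4, 8, 16]
t = np.arange(256)
# Test signal: the wavelet itself at scale 4, centred at sample 128
signal = mexican_hat((t - 128) / 4.0) / np.sqrt(4.0)

C = cwt(signal, scales)
best_scale_idx, best_position = np.unravel_index(np.argmax(C), C.shape)
```

The largest coefficient appears where the scaled and shifted wavelet matches the signal best, here at scale 4 and position 128.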


Fourier versus Wavelets

Fourier
• Loses time (location) coordinate completely
• Analyses the whole signal
• Short pieces lose “frequency” meaning

Wavelets
• Localized time-frequency analysis
• Short signal pieces also have significance
• Scale = Frequency band


Wavelets Defined

“The wavelet transform is a tool that cuts up data, functions or operators into different frequency components, and then studies each component with a resolution matched to its scale”

Dr. Ingrid Daubechies, Lucent, Princeton U


Wavelet Transform

Scale and shift original waveform

Compare to a wavelet

Assign a coefficient of similarity


Some wavelets – different shapes, different properties

(Figure panels: Db3, Mexican hat, Gauss)


Continuous Wavelet transform: shift wavelet and compare, …

(Figure labels: C = 0.0004, C = 0.0034)


…then scale, and shift through positions


Scaling/stretching wavelet

Same wavelet, different scales


Wavelet transform: Scaling – value of “stretch”

f(t) = sin(t): scale factor 1

f(t) = sin(2t): scale factor 2

f(t) = sin(3t): scale factor 3


More on scaling

It lets you either narrow down the frequency band of interest, or determine the frequency content in a narrower time interval

Scaling = frequency band

Good for non-stationary data


Scale is (sort of) like frequency

Small scale: rapidly changing details, like high frequency

Large scale: slowly changing details, like low frequency


Discrete Wavelet Transform

Uses a “subset” of scales and positions based on powers of two, rather than every “possible” scale and position as in the continuous wavelet transform

Behaves like a filter bank: signal in, coefficients out

Down-sampling is necessary (otherwise the two filter outputs together hold twice as much data as the original signal)
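One level of the filter bank can be sketched with the Haar wavelet, the simplest choice. The Haar filters, the function name `haar_analysis` and the sample signal are assumptions made for illustration; other wavelets use longer filters.

```python
import numpy as np

def haar_analysis(signal):
    """One level of the Haar DWT: lowpass/highpass filtering followed by down-sampling."""
    x = np.asarray(signal, dtype=float)
    lowpass = np.array([1.0, 1.0]) / np.sqrt(2.0)     # local averages
    highpass = np.array([-1.0, 1.0]) / np.sqrt(2.0)   # local differences x[k] - x[k+1]
    # Filtering alone produces two nearly full-length outputs (about 2n values in total)
    low_full = np.convolve(x, lowpass, mode="valid")
    high_full = np.convolve(x, highpass, mode="valid")
    # Down-sampling keeps every second value, so the total is n values again
    approximation = low_full[::2]
    details = high_full[::2]
    return approximation, details

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a, d = haar_analysis(signal)
```

For this signal `a` has four values and `d` has four values, the same amount of data as the input, and the total energy of `a` and `d` equals that of the signal.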


Discrete Wavelet transform

(Figure: the signal passes through two filters; the lowpass filter outputs the Approximation (a) and the highpass filter outputs the Details (d))


Results of wavelet transform: approximation and details

Low frequency: approximation (a)

High frequency: details (d)

“Decomposition” can be performed iteratively


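Levels of decomposition

Successively decompose the approximation

Level 5 decomposition = a5 + d5 + d4 + d3 + d2 + d1

In theory there is no limit to the number of decompositions performed; in practice the process stops once the approximation is down to a single sample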


Wavelet synthesis

• Re-creates signal from coefficients
• Up-sampling required


Multi-level Wavelet Analysis

Multi-level wavelet decomposition tree

Reassembling original signal
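A sketch of multi-level decomposition and reassembly, again assuming the Haar wavelet and a signal whose length is a power of two. All function names are hypothetical; `details[0]` plays the role of d1 and the last entry that of d5.

```python
import numpy as np

def haar_step(x):
    """Split x into approximation and detail coefficients, each half as long."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def haar_inverse_step(a, d):
    """Up-sample and recombine one level (wavelet synthesis)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def decompose(signal, levels):
    """Successively decompose the approximation; returns (a_L, [d1, d2, ..., dL])."""
    a = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        a, d = haar_step(a)
        details.append(d)
    return a, details

def reconstruct(a, details):
    """Reassemble the signal from the coarsest approximation and all details."""
    for d in reversed(details):
        a = haar_inverse_step(a, d)
    return a

rng = np.random.default_rng(3)
signal = rng.normal(size=64)                  # arbitrary signal of length 2^6
a5, details = decompose(signal, levels=5)
restored = reconstruct(a5, details)           # should reproduce the signal
```

With 64 samples, five levels leave an approximation of 2 values and details of 32, 16, 8, 4 and 2 values, which together hold exactly as many numbers as the original signal.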


The Wavelet Toolbox (Matlab)

The Wavelet Toolbox contains graphical tools and command-line functions for analysis, synthesis, de-noising, and compression of signals and images. These tools work particularly well on “non-stationary data”

These tools are used for de-noising, compression, feature extraction, enhancement, pattern recognition in MANY types of applications and industries


Applications of wavelets

Pattern recognition
• Biotech: to distinguish the normal from the pathological membranes
• Biometrics: facial/corneal/fingerprint recognition

Feature extraction
• Metallurgy: characterization of rough surfaces

Trend detection
• Finance: exploring variation of stock prices

Perfect reconstruction
• Communications: wireless channel signals

Image and video compression
• JPEG 2000


Wavelet de-noising

• Thresholding for “zeroing” some detail coefficients
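Thresholding can be sketched in a few lines on top of a Haar decomposition. Everything specific here is an assumption made for illustration: the Haar wavelet, four levels, hard thresholding (coefficients below the threshold are set to zero, the rest are kept unchanged), the threshold value, and the synthetic step signal with Gaussian noise.

```python
import numpy as np

def haar_step(x):
    return (x[0::2] + x[1::2]) / np.sqrt(2.0), (x[0::2] - x[1::2]) / np.sqrt(2.0)

def haar_inverse_step(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def decompose(signal, levels):
    a, details = np.asarray(signal, dtype=float), []
    for _ in range(levels):
        a, d = haar_step(a)
        details.append(d)
    return a, details

def reconstruct(a, details):
    for d in reversed(details):
        a = haar_inverse_step(a, d)
    return a

rng = np.random.default_rng(4)
n = 256
clean = np.where(np.arange(n) < n // 2, 4.0, -4.0)    # synthetic step signal
noisy = clean + rng.normal(scale=0.5, size=n)         # additive noise

a, details = decompose(noisy, levels=4)
threshold = 2.0                                       # hypothetical value, chosen by hand
# Hard thresholding: zero the small detail coefficients, keep the large ones
details = [np.where(np.abs(d) > threshold, d, 0.0) for d in details]
denoised = reconstruct(a, details)

err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)       # smaller than err_noisy here
```

The step in this example falls on a block boundary, so the clean signal has no detail coefficients at all and zeroing them removes only noise; real signals need a more careful choice of threshold.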


A demo


Wavelet Toolbox – Example


Wavelets: more information

References
• Wavelets and Filter Banks by Gilbert Strang and Truong Nguyen
• A Friendly Guide to Wavelets by Gerald Kaiser

Web Resources
• Wavelet Digest: http://www.wavelet.org/
• Amara’s Wavelet Page: http://www.amara.com/current/wavelet.html