The Keogh Lab: Data Mining and Structure Retrieval. Presented by Abdullah Mueen

Feb 24, 2016

Transcript
Page 1: The Keogh  Lab

The Keogh Lab

Data Mining and Structure Retrieval

Presented by Abdullah Mueen

Page 2: The Keogh  Lab

Overview of our work

• Our Goal: Extract information from raw, noisy, massive, unstructured data.
• We develop algorithms for
  – Classification
  – Clustering
  – Rule finding
  – Motif discovery
  – Discord discovery
  – Shapelet discovery
  – Linkage discovery
• We work closely with the domain experts.
  – For collecting new data.
  – To verify our results.

Page 3: The Keogh  Lab

Case 1: Motif Discovery

Beet Leafhopper (Circulifer tenellus)

[Figure: recording setup. Labels: plant membrane, stylet, voltage source, input resistor, V, conductive glue (to insect), voltage reading, to soil near plant. Plot axes: 0 to 200 (horizontal), 0 to 20 (vertical).]

Exact Discovery of Time Series Motifs. Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, Brandon Westover. SDM 2009.

MK motif discovery
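
To make the motif discovery task concrete, here is a minimal brute-force sketch. It assumes the common definition of a motif as the closest pair of non-overlapping, z-normalized subsequences under Euclidean distance. It is not the MK algorithm from the cited paper, only a quadratic baseline that returns the same kind of answer. The function names and the example series are hypothetical.

```python
import math

def znorm(seq):
    """Scale a subsequence to zero mean and unit variance (constant input maps to zeros)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in seq]

def brute_force_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping length-m windows."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(windows)):
        # Starting j at i + m excludes overlapping (trivial) matches.
        for j in range(i + m, len(windows)):
            d = math.dist(windows[i], windows[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# Hypothetical series with the same 5-point pattern planted at positions 0 and 11.
series = [0, 1, 3, 1, 0, 5, 2, 7, 4, 9, 6, 0, 1, 3, 1, 0]
distance, first, second = brute_force_motif(series, 5)
```

The baseline compares every pair of windows, so its cost grows quadratically with the series length. An exact method is useful precisely because it returns the same pair without paying that full cost.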

Page 4: The Keogh  Lab

Case 2: Shapelet Discovery

[Figure: false nettles and stinging nettles, with the shapelet marked.]

Time Series Shapelets: A New Primitive for Data Mining. Lexiang Ye and Eamonn Keogh. SIGKDD 2009.
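
As a rough illustration of how a shapelet can be used once it is found, the sketch below takes the distance from a series to a shapelet to be the smallest Euclidean distance over all sliding windows, then classifies by comparing that distance to a threshold. The shapelet values, threshold, and class labels are hypothetical, and the search for a good shapelet is not shown.

```python
import math

def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any same-length window."""
    m = len(shapelet)
    return min(
        math.dist(series[i:i + m], shapelet)
        for i in range(len(series) - m + 1)
    )

def classify(series, shapelet, threshold, near_label, far_label):
    """Assign near_label when the series contains a close match to the shapelet."""
    if subsequence_distance(series, shapelet) <= threshold:
        return near_label
    return far_label

# Hypothetical shapelet and two hypothetical series.
shapelet = [0, 2, 0]
with_shape = [1, 1, 0, 2, 0, 1]     # contains the shapelet exactly
without_shape = [1, 1, 1, 1, 1, 1]  # flat, no matching window

label_a = classify(with_shape, shapelet, 1.0, "class A", "class B")
label_b = classify(without_shape, shapelet, 1.0, "class A", "class B")
```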

Page 5: The Keogh  Lab

Case 3: Linkage Discovery

[Figure: CK-1 distance measure, with example CK-1 values 0.6291 and 0.9033.]

[Figure: single linkage dendrogram. Axis: CK-1 distance, 0.6 to 0.9. Labels: Print House 1, Print House 2.]

[Figure: a hand-press book. Labels: text, character matrix, ornaments.]

A Compression Based Distance Measure for Texture. Bilson Campana and Eamonn Keogh. SDM 2010.
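
The CK-1 measure itself is defined in the cited paper and is not reproduced here. To show the general idea behind a compression-based distance, the sketch below uses a different, simpler measure: the compression-based dissimilarity CDM(x, y) = C(xy) / (C(x) + C(y)), where C is the compressed size. It uses zlib on byte strings as a stand-in compressor, which is an assumption made for illustration. The function names and test data are hypothetical.

```python
import random
import zlib

def compressed_size(data):
    """Number of bytes zlib needs to store the input at maximum compression."""
    return len(zlib.compress(data, 9))

def cdm(x, y):
    """Compression-based dissimilarity: lower when x and y share structure."""
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))

# Hypothetical inputs: a repetitive pattern and seeded pseudo-random bytes.
rng = random.Random(0)
pattern = b"abc" * 200
noise = bytes(rng.randrange(256) for _ in range(600))

same = cdm(pattern, pattern)     # shared structure compresses well together
different = cdm(pattern, noise)  # little shared structure, ratio closer to 1
```

A matrix of such pairwise values is what a single linkage clustering would take as input.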

Page 6: The Keogh  Lab

Lab Members

Dr. Eamonn Keogh
Dr. Gustavo Batista
Abdullah Mueen
Qiang Zhu
Bilson Campana
Thanawin Art R.
Bing Hu
Yuan Hao
Jesin Zakaria

Page 7: The Keogh  Lab

Motif in Online Data

• Maintain motif in streaming data without introducing latency.

Page 8: The Keogh  Lab

Motion Motif

• Find repeated motion in motion capture data, which is a 32-dimensional time series.