Continuous Data Stream Processing

Music Virtual Channel – extensions
Data Stream Monitoring – tree pattern mining
Continuous Query Processing – sequence queries

Date: 2005/10/21
Post-Excellence Project, Subproject 6

Dec 19, 2015

Transcript
Page 1:

Continuous Data Stream Processing

Music Virtual Channel – extensions
Data Stream Monitoring – tree pattern mining
Continuous Query Processing – sequence queries

Date: 2005/10/21
Post-Excellence Project, Subproject 6

Page 2:

Continuous Data Stream Management

Music Virtual Channel Extensions

[Figure: system architecture. Components shown: music collections, Internet, V.C. player, music channel simulator, interface, filtering engine, clustering engine, music metadata, profile monitor, cluster monitor, channel monitor, favorite channel, cluster coordinator, peer search engine, profile database, MusicXML database, XML filtering engine. The labels 1, 2, …, N also appear in the diagram.]

Page 3:

An Extension on Virtual Channel

After a player starts a range (or kNN) search:
• It updates its profile periodically
• The search results are continuously maintained

[Figure: genre-profile bar charts for a V.C. player (query) and V.C. players (peer). Categories: POP, BLUE, ROCK, LATIN, JAZZ, DANCE. Value axis: 0% to 50%.]
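
As a rough illustration of the continuous range search described on this slide, the sketch below treats each profile as a vector of genre shares and re-evaluates a player's result set after profile updates. The class name `PeerSearch`, the Euclidean distance, and the sample numbers are assumptions for illustration and do not come from the slides. A real peer search engine would use an index rather than a linear scan.

```python
import math

GENRES = ["POP", "BLUE", "ROCK", "LATIN", "JAZZ", "DANCE"]

def distance(p, q):
    """Euclidean distance between two genre-share profiles (assumed metric)."""
    return math.sqrt(sum((p[g] - q[g]) ** 2 for g in GENRES))

class PeerSearch:
    """Hypothetical continuous range search: results follow profile updates."""

    def __init__(self):
        self.profiles = {}   # player id -> {genre: share}
        self.queries = {}    # player id -> search radius

    def update_profile(self, player, profile):
        self.profiles[player] = profile

    def start_range_search(self, player, radius):
        self.queries[player] = radius

    def results(self, player):
        # Linear scan over all peers; an index would replace this in practice.
        center = self.profiles[player]
        radius = self.queries[player]
        return sorted(
            other for other, prof in self.profiles.items()
            if other != player and distance(center, prof) <= radius
        )

search = PeerSearch()
search.update_profile("query", dict(zip(GENRES, [0.5, 0.1, 0.2, 0.1, 0.1, 0.0])))
search.update_profile("peer", dict(zip(GENRES, [0.1, 0.1, 0.1, 0.1, 0.5, 0.1])))
search.start_range_search("query", radius=0.2)
before = search.results("query")   # peer is far away at this point

# The peer's taste drifts toward the query player's profile.
search.update_profile("peer", dict(zip(GENRES, [0.4, 0.1, 0.2, 0.1, 0.1, 0.1])))
after = search.results("query")    # peer now falls inside the range
```

Because every player both issues a search and moves through the feature space, each update can change many result sets, which is why the slides stress indexing.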

Page 4:

An Extension on Virtual Channel

Compared with the clustering engine:
• A flexible definition of “clusters”
• Update is more natural than insertion/deletion
• No need for parameter setting and re-clustering
• Indexing can relieve the pain of frequent updates

Compared with the problem of moving objects:
• Movements in a high-dimensional feature space
• In most cases every object is also a query
• Prediction of object movement is possible

Page 5:

An Extension on Favorite Channel

When a music piece is played on a channel:
• The corresponding MusicXML file can be obtained
• A query can be a portion of MusicXML or an XQuery expression
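
To make the idea of querying the MusicXML of a playing piece concrete, here is a minimal sketch. It uses the limited XPath subset in Python's `xml.etree.ElementTree` as a stand-in for XQuery, which the standard library does not provide. The score fragment is a hand-written, simplified MusicXML-like sample, and the helper names `matches` and `steps_in_measure` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified MusicXML-like fragment (assumed sample, not from the slides).
SCORE = """
<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch></note>
      <note><pitch><step>E</step><octave>4</octave></pitch></note>
    </measure>
    <measure number="2">
      <note><pitch><step>G</step><octave>4</octave></pitch></note>
    </measure>
  </part>
</score-partwise>
"""

def matches(score_xml, path):
    """Return True if the path expression selects at least one element."""
    root = ET.fromstring(score_xml)
    return root.find(path) is not None

def steps_in_measure(score_xml, number):
    """List the pitch steps of the notes in one measure, in document order."""
    root = ET.fromstring(score_xml)
    path = f".//measure[@number='{number}']/note/pitch/step"
    return [step.text for step in root.findall(path)]

has_g = matches(SCORE, ".//pitch[step='G']")   # a structural query on pitch
first_measure = steps_in_measure(SCORE, 1)
```

A favorite-channel profile could then be a set of such path expressions, each evaluated against the MusicXML of the piece currently on air.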

Page 6:

An Extension on Favorite Channel

Compared with query segments:
• More musical semantics in a query
• Does not interfere with the music playback
• Matching on complex tree structures
  • Common subqueries are still useful

Page 7:

Research Issues

Peer Search Engine
• An indexing method to support continuous query processing for high-dimensional moving objects
• A prediction-based bounding mechanism to reduce the frequency of profile update

XML Filtering Engine
• An online method to enable tree pattern mining over a data stream
• An indexing mechanism to support XML filtering
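
The slides do not describe the prediction-based bounding mechanism, so the following is only one possible reading, sketched for a single genre share. The player and the server are assumed to share a linear prediction of the profile value, and the player sends an update only when the true value leaves a bound around that prediction. The class `BoundedProfileReporter` and the linear trend model are hypothetical.

```python
class BoundedProfileReporter:
    """Hypothetical client-side filter: report a value only when it leaves
    a bound around what the server would predict."""

    def __init__(self, bound):
        self.bound = bound
        self.last_value = None   # last reported value
        self.last_time = None    # time of the last report
        self.velocity = 0.0      # assumed linear trend per time unit

    def predict(self, t):
        """Value the server extrapolates for time t from the last report."""
        return self.last_value + self.velocity * (t - self.last_time)

    def observe(self, t, value):
        """Return True if an update must be sent to the server."""
        if self.last_value is None:
            self.last_value, self.last_time = value, t
            return True
        if abs(value - self.predict(t)) <= self.bound:
            return False         # the prediction is still within the bound
        self.velocity = (value - self.last_value) / (t - self.last_time)
        self.last_value, self.last_time = value, t
        return True

reporter = BoundedProfileReporter(bound=0.05)
samples = [(0, 0.10), (1, 0.12), (2, 0.20), (3, 0.25), (4, 0.31)]
sent = [reporter.observe(t, v) for t, v in samples]
```

In this run only two of the five observations trigger an update, which is the kind of reduction in update frequency the research issue refers to.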

Page 8:

Discovering Frequent Tree Patterns over Data Streams

Submitted for publication

Page 9:

Problem Definition

As the query trees stream in, find the subtrees that occur more than θ·N times, where N is the number of trees received so far and 0 ≤ θ ≤ 1.

[Figure: query trees T1, T2, T3 stream into STMer, which outputs the frequent tree patterns.]

Page 10:

Problem Definition (Cont.)

Labeled ordered tree
Induced subtree

[Figure: B with children (D, C) differs from B with children (C, D). A tree pattern is shown next to a query tree that is rooted at A, with children B and E and leaves C and D.]
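
The two notions on this slide can be checked with a short sketch. Trees are written as `(label, [children])` tuples. A pattern is taken to be an induced ordered subtree if its parent-child edges are edges of the tree and the left-to-right order of siblings is preserved. The sample query tree below places C and D under B, which is an assumption about the figure. The function names are hypothetical.

```python
def matches_at(pattern, node):
    """True if pattern embeds with its root mapped onto node."""
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    # Pattern children must map, in order, onto a subsequence of node's children.
    i = 0
    for p_child in p_children:
        while i < len(n_children) and not matches_at(p_child, n_children[i]):
            i += 1
        if i == len(n_children):
            return False
        i += 1
    return True

def is_induced_subtree(pattern, tree):
    """True if pattern is an induced ordered subtree anywhere in tree."""
    if matches_at(pattern, tree):
        return True
    return any(is_induced_subtree(pattern, child) for child in tree[1])

leaf = lambda label: (label, [])
query_tree = ("A", [("B", [leaf("C"), leaf("D")]), leaf("E")])  # assumed shape

in_order = is_induced_subtree(("B", [leaf("C"), leaf("D")]), query_tree)
swapped = is_induced_subtree(("B", [leaf("D"), leaf("C")]), query_tree)
skips_level = is_induced_subtree(("A", [leaf("C")]), query_tree)
```

Here `in_order` holds, `swapped` fails because sibling order matters, and `skips_level` fails because A is not the parent of C.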

Page 11:

An Example

Given θ = 0.6

[Figure: three query trees stream into STMer one at a time. Read from the slide layout, they are A(B, C), B(D, E), and A(B, F), where X(Y, Z) denotes a node X with children Y and Z.]

Frequent tree patterns after 1 tree (occurrence > 0.6*1): A, B, C, A(B), A(C), A(B, C)
Frequent tree patterns after 2 trees (occurrence > 0.6*2): B
Frequent tree patterns after 3 trees (occurrence > 0.6*3): A, B, A(B)
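
The thresholds in this example can be checked with a small brute-force sketch. It assumes the three query trees are A(B, C), B(D, E), and A(B, F), which is a reading of the slide's figure rather than something stated in text, and it counts each pattern at most once per tree. It enumerates every induced ordered subtree exhaustively, so it shows the problem definition only, not the STMer method.

```python
from collections import Counter
from itertools import product

def rooted_patterns(tree):
    """All induced ordered subtrees rooted at the root, as strings."""
    label, children = tree
    # Each child is either skipped (None) or contributes one of its patterns.
    options = [[None] + rooted_patterns(child) for child in children]
    patterns = []
    for choice in product(*options):
        kept = [c for c in choice if c is not None]
        patterns.append(label + ("(" + " ".join(kept) + ")" if kept else ""))
    return patterns

def all_patterns(tree):
    """Distinct induced ordered subtrees rooted at any node of the tree."""
    found = set(rooted_patterns(tree))
    for child in tree[1]:
        found |= all_patterns(child)
    return found

def frequent_after_each_tree(trees, theta):
    """Patterns with count > theta * N after each of the N = 1, 2, ... trees."""
    counts = Counter()
    history = []
    for n, tree in enumerate(trees, start=1):
        counts.update(all_patterns(tree))   # one occurrence per tree at most
        history.append({p for p, c in counts.items() if c > theta * n})
    return history

leaf = lambda label: (label, [])
trees = [
    ("A", [leaf("B"), leaf("C")]),
    ("B", [leaf("D"), leaf("E")]),
    ("A", [leaf("B"), leaf("F")]),
]
history = frequent_after_each_tree(trees, theta=0.6)
```

With these assumed trees, `history` holds six patterns after the first tree, only B after the second, and A, B, and A(B) after the third.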

Page 12:

Main Difficulties

The properties of data streams:
• One pass: traditional tree mining methods fail
• Fast input rate: the efficiency issue is critical
• Incremental: an incremental algorithm is required
• Unbounded: approximate counting is needed

Page 13:

An Overview of Our Method

[Figure: a query tree T1 enters STMer, which performs subtree generation and subtree maintenance over a candidate pool and serves requests on demand.]

Page 14:

String Representation

Traversing a tree T in DFS order gives a sequence S of (label, level) nodes.
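
A minimal sketch of this representation, assuming trees are written as `(label, [children])` tuples and the root has level 1. The string form such as `A1B2C2` follows the buffer notation used on the later slides. The function names are hypothetical.

```python
def to_sequence(tree, level=1):
    """Preorder (DFS) list of (label, level) pairs for a tree."""
    label, children = tree
    sequence = [(label, level)]
    for child in children:
        sequence.extend(to_sequence(child, level + 1))
    return sequence

def to_string(tree):
    """Concatenate the (label, level) pairs, e.g. A1B2C2."""
    return "".join(f"{label}{level}" for label, level in to_sequence(tree))

def from_sequence(sequence):
    """Rebuild the tree, showing that the sequence keeps the full structure."""
    root = None
    path = []   # path[k] is the most recent node at level k + 1
    for label, level in sequence:
        node = (label, [])
        if level == 1:
            root = node
        else:
            path[level - 2][1].append(node)   # attach to the current parent
        del path[level - 1:]
        path.append(node)
    return root

tree = ("A", [("B", [("D", [])]), ("C", [])])
sequence = to_sequence(tree)
text = to_string(tree)
```

The level numbers are what make the flat sequence unambiguous: the parent of a node at level k is the most recent earlier node at level k - 1.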

Page 15:

Subtree Generation

[Figure: nodes arrive on the data stream as (label, level) pairs.
At t1, node (A,1) arrives: the buffer holds A1, the panel labeled TD shows the tree A, and the subtree A1 is generated.
At t2, node (B,2) arrives: the buffer holds A1B2, the panel labeled TD shows A with child B, and the subtrees B1 and A1B2 are generated.]

Page 16:

Subtree Generation (Cont.)

[Figure: continuation of the data stream. The subtrees from t1 and t2 (A1, B1, A1B2) are shown again.
At t3, node (C,2) arrives: the buffer holds A1B2C2, the panel labeled TD shows A with children B and C, and the subtrees C1, A1C2, and A1B2C2 are generated.]
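
The three steps t1 to t3 suggest that each arriving node produces exactly the subtrees that contain it. The sketch below reproduces that behaviour for a single tree, under the assumption that a new subtree is either the new node alone or an earlier subtree that contains the new node's parent, extended by the new node. This is inferred from the example and is not the authors' data structure. It keeps every subtree explicitly, so it is only suitable for tiny inputs.

```python
def encode(buffer, subtree):
    """String form of a subtree, with levels renumbered from its own root."""
    base = buffer[subtree[0]][1]
    return "".join(f"{buffer[i][0]}{buffer[i][1] - base + 1}" for i in subtree)

def generate_subtrees(stream):
    """For each arriving (label, level) node, list the subtrees it creates."""
    buffer = []     # nodes received so far, in DFS order
    latest = {}     # level -> buffer index of the most recent node there
    subtrees = []   # every subtree so far, as a tuple of buffer indices
    steps = []
    for label, level in stream:
        index = len(buffer)
        buffer.append((label, level))
        latest[level] = index
        parent = latest.get(level - 1)   # None when the node is the root
        new = [(index,)]
        if parent is not None:
            # Extend every earlier subtree that contains the parent.
            new += [s + (index,) for s in subtrees if parent in s]
        subtrees += new
        steps.append([encode(buffer, s) for s in new])
    return steps

steps = generate_subtrees([("A", 1), ("B", 2), ("C", 2)])
```

For the stream (A,1), (B,2), (C,2) this yields A1, then B1 and A1B2, then C1, A1C2, and A1B2C2, matching the subtrees listed for t1, t2, and t3.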

Page 17:

Subtree Generation (Cont.)

[Figure: a prefix tree labeled APT, rooted at Φ, whose nodes include A1, B1, B2, C1, C2, D1, D2, D3, E1, E2, E3, and E4. The buffer starts with A1B2 and is shown with C2 D3 E4 and a node F2. The panel labeled TD shows a tree with nodes A, B, C, D, and E.]

Page 18:

Subtree Maintenance

#query trees received = 321

[Figure: the buffer holds A1B2E2. A prefix tree labeled APT, rooted at Φ, contains A1, B1, E1, B2, and E2 nodes. A prefix tree labeled GPT, rooted at Φ, contains the entries (A1, 5, 0), (B2, 4, 1), and (C3, 2, 1). Three entries are marked +1, and the entry (E2, 1, 3) is shown next to the buffer.]
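
The triples on this slide resemble the (item, count, maximum error) entries of Lossy Counting, a standard approximate counting scheme for streams. The slides do not name the scheme, so treating them this way is an assumption. The sketch below applies Lossy Counting to pattern strings, counting one occurrence per tree. The class name and the bucket width of 100 (an error parameter of 0.01) are hypothetical choices.

```python
class LossyPatternCounter:
    """Lossy Counting over patterns, with one count per query tree."""

    def __init__(self, width):
        self.width = width   # bucket width, i.e. ceil(1 / error parameter)
        self.n = 0           # query trees received so far
        self.entries = {}    # pattern -> [count, maximum undercount]

    def add_tree(self, patterns):
        self.n += 1
        bucket = (self.n + self.width - 1) // self.width   # ceil(n / width)
        for pattern in set(patterns):
            if pattern in self.entries:
                self.entries[pattern][0] += 1
            else:
                # A new pattern may have been missed in all earlier buckets.
                self.entries[pattern] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            self.entries = {
                p: e for p, e in self.entries.items() if e[0] + e[1] > bucket
            }

    def frequent(self, theta):
        """Patterns whose count is at least (theta - 1/width) * n."""
        bound = (theta - 1 / self.width) * self.n
        return {p for p, (count, _) in self.entries.items() if count >= bound}

counter = LossyPatternCounter(width=100)
for _ in range(320):
    counter.add_tree(["A1"])
counter.add_tree(["E2"])          # the 321st tree introduces a new pattern
new_entry = counter.entries["E2"]
```

Under these assumptions a pattern first seen at the 321st tree is stored with count 1 and error 3, which is consistent with the entry (E2, 1, 3) on the slide.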

Page 19:

Experiments on Sensitivity

[Figure: two plots, one for the minimum support and one for the error parameter.]

Page 20:

Experiments on Comparison

[Figure: comparison with StreamT (ICDM’02).]

Page 21:

Conclusion

Contribution
• A novel technique is proposed for efficient subtree generation
• A compact structure is employed to reduce the memory requirement of the candidate pool

Current work
• Mining closed frequent subtrees over data streams

[Figure: subtrees annotated with counts: A(B, C) with 2, A(B) with 5, A(C) with 2, and A with 5.]