Continuous Query Language: From CQL to CAPE Algebra Plans
Lee ChuChe Wai Kwan
MQP 2004/2005
Continuous Query Processing
• Emerging applications:
– Traffic management
– Network monitoring
• Require:
– Online processing of data streams
• But:
– Traditional databases handle persistent data
Database Systems vs. Data Stream Systems
• Database System
– One-time queries
– Random access
• Data Stream System
– Continuous queries
– Sequential access
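To make the contrast concrete, here is a minimal Python sketch (an illustration only, not part of CAPE or of any database system). A one-time query scans stored data once and returns; a continuous query stays registered and emits results as tuples arrive in order. The function names and the sample values are hypothetical.

```python
from typing import Iterable, Iterator, List

def one_time_query(table: List[int], threshold: int) -> List[int]:
    """Database style: the data is persistent and the query runs once."""
    return [x for x in table if x > threshold]

def continuous_query(stream: Iterable[int], threshold: int) -> Iterator[int]:
    """Stream style: the query is persistent; each tuple is seen once, in arrival order."""
    for x in stream:
        if x > threshold:
            yield x  # emitted as soon as the tuple arrives

stored = [3, 9, 4, 12]
one_time_result = one_time_query(stored, 5)

# An iterator stands in for an unbounded stream; here it ends after four tuples.
continuous_result = list(continuous_query(iter([3, 9, 4, 12]), 5))
```

Both calls select the same values here; the difference is that the generator never needs the whole input in storage.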
CAPE: Constraint-exploiting Adaptive Processing Engine
• An on-going project at WPI
CAPE’s limitation
• Desire:
– High-level query language, such as SQL
• Instead:
– Queries must be entered as low-level execution plans
• Problems:
– Tedious to enter
– Error-prone
Algebra Plan vs. SQL
select S.A from R, S, Q where R.A = S.A
<queryplan>
  <operator root="true" id="1" className=" …">
    <classVariables>
      <variable name="group_pos" value="0"/>
      <variable name="function" value="null"/>
      <variable name="function_pos" value="0"/>
      <variable name="function" value="count"/>
      <variable name="function_pos" value="0"/>
      <variable name="propagate" value="false"/>
      <variable name="debug" value="true"/>
    </classVariables>
    <properties></properties>
    <parents> </parents>
    <children>
      <child id="2"/>
    </children>
    <streams> </streams>
  </operator>
  <operator root…..>...</operator>
  ...
</queryplan>
Plan diagram: Group By (ID = 1), with child operator ID = 2
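As a rough illustration of what such a plan file encodes, the Python sketch below reads a small plan and lists each operator with its children. The element and attribute names (`queryplan`, `operator`, `id`, `className`, `children`, `child`) follow the XML shown on this slide; the `className` values and the second operator are hypothetical, since the slide elides them.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-operator plan; className values are placeholders, not CAPE class names.
PLAN = """
<queryplan>
  <operator root="true" id="1" className="GroupBy">
    <classVariables>
      <variable name="function" value="count"/>
    </classVariables>
    <children><child id="2"/></children>
  </operator>
  <operator root="false" id="2" className="StreamSource">
    <children></children>
  </operator>
</queryplan>
"""

def read_plan(xml_text):
    """Return {operator id: (className, [child ids])} for a plan document."""
    root = ET.fromstring(xml_text)
    plan = {}
    for op in root.findall("operator"):
        children = [c.get("id") for c in op.findall("children/child")]
        plan[op.get("id")] = (op.get("className"), children)
    return plan

plan = read_plan(PLAN)
```

Even this two-operator plan needs about a dozen lines of XML, which is the tedium the slide refers to.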
Objective
• Define and implement a high-level query language for CAPE
Methodology
• Study existing Continuous Processing Language proposals
• Identify one; adopt and adapt it if appropriate
• Implement it for CAPE
Requirements on Language
• SQL-like
• Data Streams
• Windows on streams
Continuous Processing Languages
• UDA – UCLA
• TelegraphCQ – Berkeley
• STREAM-CQL – Stanford
STREAM-CQL
• Well-defined semantics
• Open-source code available
• Query example:
query : rstream (select S.A from R, Q, S [range 1 minute] where R.A = S.A);
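The `[range 1 minute]` clause restricts stream S to its most recent minute of tuples. The Python sketch below is a simplified illustration of that idea, not STREAM's or CAPE's implementation; the class name, the use of integer seconds as timestamps, and the choice to keep a tuple that is exactly 60 seconds old are all assumptions.

```python
from collections import deque

class RangeWindow:
    """Keeps the tuples whose timestamp lies within the last `range_seconds`."""

    def __init__(self, range_seconds):
        self.range_seconds = range_seconds
        self.buffer = deque()  # (timestamp, value) pairs, oldest first

    def insert(self, timestamp, value):
        self.buffer.append((timestamp, value))
        # Expire tuples older than the window (boundary handling is an assumption).
        while self.buffer and self.buffer[0][0] < timestamp - self.range_seconds:
            self.buffer.popleft()

    def contents(self):
        return [value for _, value in self.buffer]

window = RangeWindow(60)  # corresponds to S [range 1 minute]
for ts, a in [(0, "a1"), (30, "a2"), (61, "a3")]:
    window.insert(ts, a)
```

After the tuple at time 61 arrives, the tuple from time 0 has expired, so only the last two remain available to the join.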
Our Query Plan Generator: Big Picture
CQL → STREAM Parser → STREAM Plan Generator → CAPE Plan Rewriter → CAPE XML Plan Writer → CAPE Engine
Step 1: STREAM Parser
CQL → STREAM Parser → STREAM Plan Generator
• Generates a parse tree
• Yacc and Lex
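The division of labor between a lexer and a parser can be sketched in a few lines of Python. This is a hand-written illustration for a small subset of the language (`select <attr> from <list> [where <attr> = <operand>]`); it is not the STREAM grammar, it does not handle window clauses, and all names in it are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_][\w.]*)|(.))")
KEYWORDS = {"select", "from", "where"}

def tokenize(text):
    """Lexer role (Lex): split the query into (kind, value) tokens."""
    tokens = []
    for number, word, symbol in TOKEN.findall(text):
        if number:
            tokens.append(("NUM", number))
        elif word:
            tokens.append(("KW" if word.lower() in KEYWORDS else "ID", word))
        elif symbol.strip():
            tokens.append(("SYM", symbol))
    return tokens

def parse(tokens):
    """Parser role (Yacc): build a small parse tree from the token list."""
    pos = 0

    def expect(kind, value=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of query")
        tok_kind, tok_value = tokens[pos]
        if tok_kind != kind or (value is not None and tok_value.lower() != value):
            raise SyntaxError(f"unexpected token {tok_value!r}")
        pos += 1
        return tok_value

    expect("KW", "select")
    tree = {"select": expect("ID"), "from": [], "where": None}
    expect("KW", "from")
    tree["from"].append(expect("ID"))
    while pos < len(tokens) and tokens[pos] == ("SYM", ","):
        expect("SYM", ",")
        tree["from"].append(expect("ID"))
    if pos < len(tokens):
        expect("KW", "where")
        left = expect("ID")
        expect("SYM", "=")
        if pos >= len(tokens) or tokens[pos][0] not in ("ID", "NUM"):
            raise SyntaxError("expected an operand after '='")
        # Tokens after the condition are not checked in this sketch.
        tree["where"] = (left, "=", tokens[pos][1])
    return tree

tree = parse(tokenize("select S.A from R, S, Q where R.A = S.A"))
```

A generated Yacc/Lex parser does the same two jobs from a declarative grammar rather than hand-written code.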
Step 2: STREAM Plan Generator
Parse Tree → STREAM Plan Generator → Modified Plan → CAPE Plan Rewriter
Transformations:
– t_rstreamNow
– t_removeIstream
– t_streamCross
– t_removeProject
– t_makeCrossBinary
– t_makeStreamCrossBinary
– t_pushSelect
STREAM Plan Generator: Default Query Plan
query : rstream (select S.A from R, S [range 1 minute], Q where R.A = S.A);
Plan operators:
RStream (ID = 7)
Project [1, 0] (ID = 6)
Cross (1, 3, 4) (ID = 0)
Select [0,0]==[1,0] (ID = 5)
Stream Source [0] (ID = 1)
Stream Source [1] (ID = 2)
Range Window [60] (ID = 3)
Stream Source [2] (ID = 4)
STREAM Plan Generator: Cleaned Query Plan
query : rstream (select S.A from R, S [range 1 minute], Q where R.A = S.A);
Plan operators:
RStream (ID = 7)
Project [1, 0] (ID = 6)
Cross (10, 4) (ID = 9)
Select [0,0]==[1,1] (ID = 10)
Cross (1, 3) (ID = 8)
Stream Source [0] (ID = 1)
Stream Source [1] (ID = 2)
Range Window [60] (ID = 3)
Stream Source [2] (ID = 4)
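The default plan contains one three-way Cross (1, 3, 4), while the cleaned plan contains two binary crosses, Cross (1, 3) and Cross (10, 4). The Python sketch below shows one way an n-ary cross product can be turned into a left-deep chain of binary ones. It only illustrates the idea suggested by the name t_makeCrossBinary; the actual STREAM transformation may work differently, and the tuple-based tree encoding and leaf names are hypothetical.

```python
def make_cross_binary(inputs):
    """Rewrite an n-ary cross product over `inputs` as a left-deep chain of binary crosses."""
    if len(inputs) < 2:
        raise ValueError("a cross product needs at least two inputs")
    tree = ("Cross", inputs[0], inputs[1])
    for nxt in inputs[2:]:
        tree = ("Cross", tree, nxt)  # the previous cross becomes the left input
    return tree

# Leaves are named after the operator IDs on the slide.
binary = make_cross_binary(["Source1", "Window3", "Source4"])
```

With binary crosses in place, a selection that mentions only two inputs can sit directly above the cross that combines them.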
Step 3: CAPE Plan Rewriter
Cleaned Tree → CAPE Plan Rewriter → Optimized Tree
Rules:
– ThetaJoin rule
– WindowPushUp rule
ThetaJoin Rule
Before: Select [0,0]==[1,1] (ID = 10) over Cross (1, 3) (ID = 8), under Cross (10, 4) (ID = 9)
After: ThetaJoin [0,0]==[1,1] (ID = 11), under Cross (11, 4) (ID = 9)
Other operators: RStream (ID = 7), Project [1, 0] (ID = 6), Stream Source [0] (ID = 1), Stream Source [1] (ID = 2), Range Window [60] (ID = 3), Stream Source [2] (ID = 4)
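The rewrite shown here, a Select directly above a Cross becoming a single ThetaJoin, can be sketched as a small recursive tree transformation in Python. This is an illustration under assumptions, not the CAPE Plan Rewriter's code: the tuple encoding of operators, the leaf names, and the exact shape of the input tree are one reading of the diagram.

```python
def apply_theta_join(node):
    """Replace Select(pred, Cross(l, r)) with ThetaJoin(pred, l, r), bottom-up."""
    if not isinstance(node, tuple):
        return node  # leaf: a source name or an operator parameter
    op, *args = node
    args = [apply_theta_join(a) for a in args]
    if op == "Select":
        pred, child = args
        if isinstance(child, tuple) and child[0] == "Cross":
            _, left, right = child
            return ("ThetaJoin", pred, left, right)
    return (op, *args)

# Assumed shape of the cleaned plan; leaves are named after the operator IDs.
cleaned = ("RStream",
           ("Project", "[1, 0]",
            ("Cross",
             ("Select", "[0,0]==[1,1]",
              ("Cross", "Source1", ("RangeWindow", 60, "Source2"))),
             "Source4")))
optimized = apply_theta_join(cleaned)
```

The outer cross is untouched because no Select sits directly above it; only the Select/Cross pair is merged.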
WindowPushUp Rule
[Diagram: successive frames of the plan with operators RStream (ID = 7), Project [1, 0] (ID = 6), Cross (11, 4) (ID = 9), ThetaJoin [0,0]==[1,1] (ID = 11), Range Window [60] (ID = 3), Stream Source [0] (ID = 1), Stream Source [1] (ID = 2), Stream Source [2] (ID = 4); Range Window [60] also appears without an ID at further positions in the tree.]
Step 4: CAPE XML Plan Writer
Optimized Tree → CAPE XML Plan Writer → XML Plan → CAPE Engine
<queryplan>
  <operator root>
    <classVariables> </classVariables>
    <properties></properties>
    <parents></parents>
    <children></children>
    <streams> </streams>
  </operator>
</queryplan>
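A minimal sketch of this last step, in Python: walk a table of operators and emit one `operator` element per entry. The element and attribute names are taken from the XML plan example earlier in the deck; the `className` values, the `root="false"` value for non-root operators, and the input format are assumptions made for illustration, not CAPE's actual writer.

```python
import xml.etree.ElementTree as ET

def write_plan(operators, root_id):
    """operators: {id: (class_name, [child ids])}; returns the plan as an XML string."""
    plan = ET.Element("queryplan")
    for op_id, (class_name, child_ids) in operators.items():
        op = ET.SubElement(plan, "operator", {
            "root": "true" if op_id == root_id else "false",  # "false" is an assumed value
            "id": str(op_id),
            "className": class_name,  # placeholder names, not CAPE classes
        })
        children = ET.SubElement(op, "children")
        for child_id in child_ids:
            ET.SubElement(children, "child", {"id": str(child_id)})
    return ET.tostring(plan, encoding="unicode")

xml_text = write_plan(
    {11: ("ThetaJoin", [1, 3]), 1: ("StreamSource", []),
     3: ("RangeWindow", [2]), 2: ("StreamSource", [])},
    root_id=11,
)
```

Generating the file from the rewritten tree removes the manual XML entry step that motivated the project.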
Evaluation Methodology
• Query test bed:
– Test individual operators
– Test complex query plans
• Evaluation:
– Manual inspection of generated XML plan
– Test XML file on CAPE
Evaluation of Individual Operators
• Regular Project
• Function Project
• Select
• Stream Source
• Range Window
• Partition
• Distinct
• CQL:
rstream (select A from S where A = 5);
• CQL:
rstream (select A + B from S);
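The two test queries exercise the Select operator and the function Project operator. As a rough illustration of what each operator computes, here is a Python sketch using generators over a small, made-up stream of tuples; it says nothing about how CAPE implements these operators, and the data and function names are hypothetical.

```python
def select(stream, predicate):
    """Pass through only the tuples that satisfy the predicate."""
    for t in stream:
        if predicate(t):
            yield t

def project(stream, function):
    """Map each tuple to an output value (a column or a computed expression)."""
    for t in stream:
        yield function(t)

S = [{"A": 5, "B": 1}, {"A": 2, "B": 7}, {"A": 5, "B": 4}]

# select A from S where A = 5
query_select = list(project(select(S, lambda t: t["A"] == 5), lambda t: t["A"]))

# select A + B from S
query_function_project = list(project(S, lambda t: t["A"] + t["B"]))
```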
Conclusion
• Identified a query language for CAPE
• Designed a loosely coupled translation framework from CQL to CAPE:
– Rewrite algebra tree
– Generate CAPE XML plans
• Evaluated the generated query plans
Future Work
• Implement relations
– This would maximize CAPE's capability
• Research window sizes
– Support different time ranges
• Implement a graphical user interface
– Drag-and-drop feature for entering CQL
Acknowledgements
• Prof. Rundensteiner
• Yali Zhu
• Luping Ding
Questions or Comments?