Querying Sensor Networks
Sam Madden, UC Berkeley
November 18th, 2002 @ Madison
2
Introduction
• What are sensor networks?
• Programming Sensor Networks Is Hard
  – Especially if you want to build a “real” application
    » Example: Vehicle tracking application took 2 grad students 2 weeks to build and hundreds of lines of code.
• Declarative Queries Are Easy
  – And can be faster and more robust than most applications!
    » Vehicle tracking query: took 2 minutes to write, worked just as well!

  SELECT MAX(mag)
  FROM sensors
  WHERE mag > thresh
  SAMPLE INTERVAL 64ms
3
Overview
• Sensor Networks
• Why Queries in Sensor Nets
• TinyDB
  – Features
  – Demo
• Focus: Tiny Aggregation
• The Next Step
5
Device Capabilities
• “Mica Motes”
  – 8-bit, 4 MHz processor
    » Roughly a PC AT
  – 40 kbit radio
    » Time to send 1 bit = 800 instrs
    » Reducing communication is good
  – 4 KB RAM, 128 KB flash, 512 KB EEPROM
  – Sensor board expansion slot
    » Standard board has light & temperature sensors, accelerometer, magnetometer, microphone, & buzzer
• Other more powerful platforms exist
  – E.g. Sensoria WINS nodes
• Trend towards smaller devices
  – “Smart Dust” – Kris Pister, et al.
6
Sensor Net Sample Apps
Habitat Monitoring. Storm petrels on Great Duck Island, microclimates on James Reserve.
Traditional monitoring apparatus.
Earthquake monitoring in shake-test sites.
Vehicle detection: sensors dropped from UAV along a road, collect data about passing vehicles, relay data back to UAV.
7
Key Constraint: Power
• Lifetime from one pair of AA batteries
  – 2-3 days at full power
  – 6 months at 2% duty cycle
• Communication dominates cost
  – Because it takes so long (~30ms) to send / receive a message
8
TinyOS
• Operating system from David Culler’s group at Berkeley
• C-like programming environment
• Provides messaging layer, abstractions for major hardware components
  – Split-phase, highly asynchronous, interrupt-driven programming model

Hill, Szewczyk, Woo, Culler, & Pister. “System Architecture Directions for Networked Sensors.” ASPLOS 2000. See http://webs.cs.berkeley.edu/tos
9
Communication In Sensor Nets
• Radio communication has high link-level losses
  – Typically about 20% @ 5m
• Newest versions of TinyOS provide link-level acknowledgments
• No end-to-end acknowledgements
• Ad-hoc neighbor discovery
• Two major routing techniques: tree-based hierarchy and geographic
[Figure: tree-based routing over nodes A–F, and geographic routing over a grid of node coordinates]
11
Declarative Queries for Sensor Networks
• Examples:
  1. SELECT nodeid, light
     FROM sensors
     WHERE light > 400
     SAMPLE PERIOD 1s
  2. SELECT roomNo, AVG(volume)
     FROM sensors
     GROUP BY roomNo
     HAVING AVG(volume) > 200
     (Rooms w/ volume > 200)

[Figure: the sensors table as a stream of tuples, one row per “epoch” (T-2, T-1, T)]
  Light  Temp  Accel  …
  453    245   512    …
  442    278   513    …
  406    335   511    …
12
Declarative Benefits In Sensor Networks
• Vastly simplifies execution for large networks
  – Since locations are described by predicates
  – Operations are over groups
• Enables tolerance to faults
  – Since system is free to choose where and when operations happen
• Data independence
  – System is free to choose where data lives, how it is represented
13
Computing In Sensor Nets Is Hard
• Why?
  – Limited power (must optimize for it!)
  – Lossy communication
  – Zero administration
  – Limited processing capabilities, storage, bandwidth
• In power-based optimization, we choose:
  » Where data is processed
  » How data is routed
    • Exploit operator semantics!
    • Avoid dead nodes
  » How to order operators, sampling, etc.
  » What kinds of indices to apply, which data to prioritize …
15
TinyDB
• A distributed query processor for networks of Mica motes
  – Available today!
• Goal: Eliminate the need to write C code for most TinyOS users
• Features
  – Declarative queries
  – Temporal + spatial operations
  – Multihop routing
  – In-network storage
16
TinyDB @ 10000 Ft
[Figure: a query is sent into a routing tree over nodes A–F; the result sets shown are {D,E,F}, {B,D,E,F}, and {A,B,C,D,E,F}]
Written in SQL-like language with extensions for:
• Sample rate
• Offline delivery
• Temporal aggregation
(Almost) all queries are continuous and periodic
17
TinyDB Demo
18
Applications + Early Adopters
• Some demo apps:
  – Network monitoring
  – Vehicle tracking
• “Real” future deployments:
  – Environmental monitoring @ GDI (and James Reserve?)
  – Generic Sensor Kit
  – Building Monitoring
Demo!
19
TinyDB Architecture (Per node)
[Figure: per-node components: Radio Stack, Schema, TinyAlloc, TupleRouter, AggOperator, SelOperator, Network]

TupleRouter:
• Fetches readings (for ready queries)
• Builds tuples
• Applies operators
• Delivers results (up tree)
AggOperator:
• Combines local & neighbor readings
SelOperator:
• Filters readings
Schema:
• “Catalog” of commands & attributes (more later)
TinyAlloc:
• Reusable memory allocator!

~10,000 lines C code
~5,000 lines Java
~3200 bytes RAM (w/ 768 byte heap)
~58 kB compiled code (3x larger than 2nd largest TinyOS program)
21
TAG
• In-network processing of aggregates
  – Aggregates are common operation
  – Reduces costs depending on type of aggregates
  – Focus on “spatial aggregation” (versus “temporal aggregation”)
• Exploitation of operator, functional semantics

Tiny AGgregation (TAG), Madden, Franklin, Hellerstein, Hong. OSDI 2002 (to appear).
22
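Aggregation Framework
• As in extensible databases, we support any aggregation function conforming to:

$\mathrm{Agg}_n = \{f_{merge}, f_{init}, f_{evaluate}\}$
$f_{merge}\{\langle a_1 \rangle, \langle a_2 \rangle\} \rightarrow \langle a_{12} \rangle$
$f_{init}\{a_0\} \rightarrow \langle a_0 \rangle$
$f_{evaluate}\{\langle a_1 \rangle\} \rightarrow$ aggregate value
(Merge must be associative and commutative!)

Example: Average
$AVG_{merge}\{\langle S_1, C_1 \rangle, \langle S_2, C_2 \rangle\} \rightarrow \langle S_1 + S_2, C_1 + C_2 \rangle$
$AVG_{init}\{v\} \rightarrow \langle v, 1 \rangle$
$AVG_{evaluate}\{\langle S_1, C_1 \rangle\} \rightarrow S_1 / C_1$

The bracketed tuples are Partial State Records (PSRs).
Just like parallel database systems – e.g. Bubba!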
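The three-function contract above can be sketched in a few lines of Python. This is only an illustration of the AVERAGE example from the slide; the function names and the sample readings are assumptions, not TinyDB code.

```python
from functools import reduce

def avg_init(v):
    """Turn one raw reading into a partial state record <sum, count>."""
    return (v, 1)

def avg_merge(a, b):
    """Combine two partial state records; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])

def avg_evaluate(psr):
    """Compute the final aggregate value from a partial state record."""
    s, c = psr
    return s / c

# Hypothetical readings from four sensors.
readings = [10, 20, 30, 40]
psrs = [avg_init(v) for v in readings]
total = reduce(avg_merge, psrs)
print(total, avg_evaluate(total))
```

Because the merge is associative and commutative, the records can be combined in whatever order the routing tree delivers them.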
23
Query Propagation Review
[Figure: the query SELECT AVG(light)… propagating through the routing tree over nodes A–F]
24
Pipelined Aggregates
• After query propagates, during each epoch:
  – Each sensor samples local sensors once
  – Combines them with PSRs from children
  – Outputs PSR representing aggregate state in the previous epoch
• After (d-1) epochs, PSR for the whole tree output at root
  – d = depth of the routing tree
  – If desired, partial state from top k levels could be output in kth epoch
• To avoid combining PSRs from different epochs, sensors must cache values from children

[Figure: routing tree with root 1, children 2 and 3, node 4 below 3, and node 5 below 4]
• Value from 5 produced at time t arrives at 1 at time (t+3)
• Value from 2 produced at time t arrives at 1 at time (t+1)
25
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors

[Figure: routing tree of depth d with root 1, children 2 and 3, node 4 below 3, and node 5 below 4; each node is labeled with the COUNT it outputs in the current epoch]

COUNT output by each sensor, built up one epoch per slide:

  Epoch #   Sensor 1   Sensor 2   Sensor 3   Sensor 4   Sensor 5
  1         1          1          1          1          1
  2         3          1          2          2          1
  3         4          1          3          2          1
  4         5          1          3          2          1
  5         5          1          3          2          1
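The epoch-by-epoch counts in this illustration can be reproduced with a small simulation. The sketch below assumes the tree is 1 → {2, 3}, 3 → 4, 4 → 5 (inferred from the figure and the counts), and that in each epoch a node outputs 1 plus whatever its children output in the previous epoch; the names are illustrative only.

```python
# Assumed routing tree: parent -> list of children.
CHILDREN = {1: [2, 3], 2: [], 3: [4], 4: [5], 5: []}

def simulate_count(children, epochs):
    """Return one dict per epoch mapping node id -> COUNT value it outputs."""
    previous = {n: 0 for n in children}   # nothing heard before epoch 1
    history = []
    for _ in range(epochs):
        # Each node counts itself plus the PSRs its children sent last epoch.
        current = {n: 1 + sum(previous[c] for c in kids)
                   for n, kids in children.items()}
        history.append(current)
        previous = current
    return history

history = simulate_count(CHILDREN, 5)
for epoch, row in enumerate(history, start=1):
    print(epoch, [row[n] for n in sorted(row)])
```

The root first reports the full count of 5 in epoch 4, three epochs after the first sample, which matches the (d-1) delay for a tree of depth 4.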
31
Grouping
• If query is grouped, sensors apply predicate on each epoch
• PSRs tagged with group
• When a PSR (with group) is received:
  – If it belongs to a stored group, merge with existing PSR
  – If not, just store it
• At the end of each epoch, transmit one PSR per group
32
Group Eviction
• Problem: Number of groups in any one iteration may exceed available storage on sensor
• Solution: Evict! (Partial Preaggregation*)
  – Choose one or more groups to forward up tree
  – Rely on nodes further up tree, or root, to recombine groups properly
  – What policy to choose?
    » Intuitively: least popular group, since we don’t want to evict a group that will receive more values this epoch
    » Experiments suggest:
      • Policy matters very little
      • Evicting as many groups as will fit into a single message is good

* Per-Åke Larson. Data Reduction by Partial Preaggregation. ICDE 2002.
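A minimal sketch of a per-node group table with eviction follows. It assumes a fixed capacity counted in groups and uses the "least popular group" policy mentioned above; the class and method names are hypothetical, and the slide itself notes that the choice of policy matters little.

```python
class GroupTable:
    """Holds one PSR per group, evicting a group when storage runs out."""

    def __init__(self, capacity, merge):
        self.capacity = capacity
        self.merge = merge
        self.psrs = {}      # group id -> partial state record
        self.hits = {}      # group id -> number of PSRs merged so far
        self.evicted = []   # (group, psr) pairs forwarded up the tree early

    def receive(self, group, psr):
        if group in self.psrs:
            # Known group: merge with the stored PSR.
            self.psrs[group] = self.merge(self.psrs[group], psr)
            self.hits[group] += 1
            return
        if len(self.psrs) >= self.capacity:
            # Table full: evict the least popular group (assumed policy).
            victim = min(self.hits, key=self.hits.get)
            self.evicted.append((victim, self.psrs.pop(victim)))
            del self.hits[victim]
        self.psrs[group] = psr
        self.hits[group] = 1

    def end_of_epoch(self):
        """Return everything to transmit this epoch and reset the table."""
        out = self.evicted + sorted(self.psrs.items())
        self.psrs, self.hits, self.evicted = {}, {}, []
        return out

table = GroupTable(capacity=2, merge=max)   # MAX aggregate: PSR is one value
for group, value in [(1, 5), (1, 7), (2, 3), (3, 9)]:
    table.receive(group, value)
sent = table.end_of_epoch()
print(sent)
```

An evicted group may be sent more than once per epoch, so nodes further up the tree (or the root) have to merge the pieces again.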
33
TAG Advantages
• In-network processing reduces communication
  – Important for power and contention
• Continuous stream of results
  – In the absence of faults, will converge to right answer
• Lots of optimizations
  – Based on shared radio channel
  – Semantics of operators
34
Simulation Environment
• Chose to simulate to allow 1000’s of nodes and control of topology, connectivity, loss
• Java-based simulation & visualization for validating algorithms, collecting data
• Coarse-grained event-based simulation
  – Sensors arranged on a grid, radio connectivity by Euclidean distance
  – Communication model
    » Lossless: All neighbors hear all messages
    » Lossy: Messages lost with probability that increases with distance
    » Symmetric links
    » No collisions, hidden terminals, etc.
35
Simulation Result
[Figure: chart “Total Bytes Xmitted vs. Aggregation Function”; x-axis: Aggregation Function (EXTERNAL, MAX, AVERAGE, COUNT, MEDIAN); y-axis: Total Bytes Xmitted (0 to 100,000)]
Simulation setup: 2500 nodes, 50x50 grid, depth = ~10, neighbors = ~20
Some aggregates require dramatically more state!
36
Taxonomy of Aggregates
• TAG insight: classify aggregates according to various functional properties
  – Yields a general set of optimizations that can automatically be applied

  Property               Examples                                     Affects
  Partial State          MEDIAN: unbounded, MAX: 1 record             Effectiveness of TAG
  Duplicate Sensitivity  MIN: dup. insensitive, AVG: dup. sensitive   Routing Redundancy
  Exemplary vs. Summary  MAX: exemplary, COUNT: summary               Applicability of Sampling, Effect of Loss
  Monotonic              COUNT: monotonic, AVG: non-monotonic         Hypothesis Testing, Snooping
37
Optimization: Channel Sharing (“Snooping”)
• Insight: Shared channel enables optimizations
• Suppress messages that won’t affect aggregate
  – E.g., in a MAX query, sensor with value v hears a neighbor with value ≥ v, so it doesn’t report
  – Applies to all exemplary, monotonic aggregates
    » Can be applied to summary aggregates also if imprecision is allowed
• Learn about query advertisements it missed
  – If a sensor shows up in a new environment, it can learn about queries by looking at neighbors’ messages
    » Root doesn’t have to explicitly rebroadcast query!
38
Optimization: Hypothesis Testing
• Insight: Root can provide information that will suppress readings that cannot affect the final aggregate value
  – E.g. tell all the nodes that the MIN is definitely < 50; nodes with value ≥ 50 need not participate
  – Depends on monotonicity
• How is hypothesis computed?
  – Blind guess
  – Statistically informed guess
  – Observation over first few levels of tree / rounds of aggregate
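The suppression rule can be sketched as a simple per-node test. The sketch uses the slide's MIN example with a hypothesis of 50; the node names and readings are invented for illustration.

```python
def should_report_min(value, hypothesis):
    """For a MIN query: the root claims MIN < hypothesis, so only
    readings below the hypothesis can still affect the answer."""
    return value < hypothesis

# Hypothetical readings at four nodes.
readings = {"A": 72, "B": 49, "C": 50, "D": 12}
participants = sorted(n for n, v in readings.items()
                      if should_report_min(v, 50))
print(participants)
```

Nodes A and C stay silent, since a value of 50 or more cannot lower a minimum already known to be below 50.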
39
Experiment: Hypothesis Testing
Uniform Value Distribution, Dense Packing, Ideal Communication
[Figure: chart “Messages / Epoch vs. Network Diameter” (SELECT MAX(attr), R(attr) = [0,100]); x-axis: Network Diameter (10 to 50); y-axis: Messages / Epoch (0 to 3000); series: No Guess, Guess = 50, Guess = 90, Snooping]
40
Optimization: Use Multiple Parents
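• For duplicate insensitive aggregates
• Or aggregates that can be expressed as a linear combination of parts
  – Send (part of) aggregate to all parents
  – Decreases variance
    » Dramatically, when there are lots of parents

[Figure: nodes A, B, C; without splitting a single message carries the whole value (1), with splitting two messages carry 1/2 each]

No splitting:
$E(\mathrm{count}) = c \cdot p$
$\mathrm{Var}(\mathrm{count}) = c^2 \cdot p \cdot (1-p)$

With splitting:
$E(\mathrm{count}) = 2 \cdot \frac{c}{2} \cdot p$
$\mathrm{Var}(\mathrm{count}) = 2 \cdot \left(\frac{c}{2}\right)^2 \cdot p \cdot (1-p)$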
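The two variance formulas can be checked by enumerating delivery outcomes exactly. The sketch below assumes each message is delivered independently with probability p, the same assumption the formulas make; the values c = 10 and p = 4/5 are arbitrary.

```python
from fractions import Fraction
from itertools import product

def moments(parts, p):
    """Exact mean and variance of the delivered total when each part is
    delivered independently with probability p."""
    mean = Fraction(0)
    second = Fraction(0)
    for outcome in product([0, 1], repeat=len(parts)):
        prob = Fraction(1)
        total = Fraction(0)
        for delivered, part in zip(outcome, parts):
            prob *= p if delivered else 1 - p
            total += part * delivered
        mean += prob * total
        second += prob * total * total
    return mean, second - mean * mean

c, p = Fraction(10), Fraction(4, 5)
no_split = moments([c], p)             # one message carrying c
split = moments([c / 2, c / 2], p)     # two messages carrying c/2 each
print(no_split, split)
```

Both cases have the same expected count, and splitting across two parents halves the variance, in line with 2 * (c/2)^2 * p * (1-p) = c^2 * p * (1-p) / 2.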
41
Multiple Parents Results
• Interestingly, this technique is much better than previous analysis predicted!
• Losses aren’t independent!
• Instead of focusing data on a few critical links, splitting spreads data over many links

[Figure: chart “Benefit of Result Splitting (COUNT query)”; y-axis: Avg. COUNT (0 to 1400); bars: Splitting, No Splitting; 2500 nodes, lossy radio model, 6 parents per node]
[Figure: routing diagrams labeled “No Splitting” and “With Splitting”, with the annotation “Critical Link!”]
Fun Stuff
• Sophisticated, sensor network specific aggregates
• Temporal aggregates
43
Temporal Aggregates
• TAG was about “spatial” aggregates
  – Inter-node, at the same time
• Want to be able to aggregate across time as well
• Two types:
  – Windowed: AGG(size,slide,attr)
  – Decaying: AGG(comb_func, attr)
  – Demo!

[Figure: readings … R1 R2 R3 R4 R5 R6 … with a window of size = 4 and slide = 2]
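The windowed form can be illustrated with a short sketch. It assumes that size and slide are measured in readings and that only full windows produce output; the function name and the six sample values standing in for R1..R6 are made up.

```python
def windowed(readings, size, slide, agg):
    """Apply agg to each full window of `size` readings, advancing by `slide`."""
    last_start = len(readings) - size
    return [agg(readings[i:i + size]) for i in range(0, last_start + 1, slide)]

r = [3, 9, 4, 7, 1, 8]          # stand-ins for R1..R6
window_max = windowed(r, size=4, slide=2, agg=max)
print(window_max)
```

With size = 4 and slide = 2 the windows are R1..R4 and R3..R6, so consecutive windows share two readings.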
44
Isobar Finding
45
TAG Summary
• In-network query processing a big win for many aggregate functions
• By exploiting general functional properties of operators, optimizations are possible
  – Requires new aggregates to be tagged with their properties
• Up next: non-aggregate query processing optimizations – a flavor of things to come!
47
Acquisitional Query Processing
• Cynical question: what’s really different about sensor networks?
  – Low Power? Laptops!
  – Lots of Nodes? Distributed DBs!
  – Limited Processing Capabilities? Moore’s Law!
48
Answer
• Long running queries on physically embedded devices that control when and with what frequency data is collected!
• Versus traditional systems where data is provided a priori
  – Next: an acquisitional teaser…
49
ACQP: What’s Different?
• How does the user control acquisition?
  – Specify rates or lifetimes
  – Trigger queries in response to events
• Which nodes have relevant data?
  – Need a node index
  – Construct topology such that nodes that are queried together route together
• What sensors should be sampled?
  – Treat sampling as an operator
  – Sample cheapest sensors first
• Which samples should be transmitted?
  – Not all of them, if bandwidth or power is limited
  – Those that are most “valuable”?
50
Operator Ordering: Interleave Sampling + Selection
  SELECT light, mag
  FROM sensors
  WHERE pred1(mag)
  AND pred2(light)
  SAMPLE INTERVAL 1s

• Energy cost of sampling mag >> cost of sampling light
  • 1500 uJ vs. 90 uJ
• Candidate orderings:
  1. Sample light, sample mag, apply pred1, apply pred2
  2. Sample light, apply pred2, sample mag, apply pred1
  3. Sample mag, apply pred1, sample light, apply pred2
• Correct ordering (unless pred1 is very selective): ordering 2, which samples the cheap light sensor first and samples mag only when pred2 passes

At 1 sample / sec, total power savings could be as much as 4mW, same as the processor!
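A rough way to compare the three orderings is to compute the expected sampling energy per epoch. The sketch below uses the 1500 uJ and 90 uJ figures from the slide and treats the fraction of readings passing each predicate as a parameter; those fractions and the function name are assumptions, and the cost of evaluating the predicates themselves is ignored.

```python
COST_LIGHT_UJ = 90
COST_MAG_UJ = 1500

def expected_cost(order, pass_pred1, pass_pred2):
    """Expected sampling energy (uJ) per epoch for orderings 1-3.
    pass_pred1 / pass_pred2: fraction of readings that satisfy each predicate."""
    if order == 1:   # sample both sensors, then filter
        return COST_LIGHT_UJ + COST_MAG_UJ
    if order == 2:   # light first; mag only if pred2(light) passes
        return COST_LIGHT_UJ + pass_pred2 * COST_MAG_UJ
    if order == 3:   # mag first; light only if pred1(mag) passes
        return COST_MAG_UJ + pass_pred1 * COST_LIGHT_UJ
    raise ValueError(f"unknown ordering: {order}")

costs = {o: expected_cost(o, pass_pred1=0.5, pass_pred2=0.5) for o in (1, 2, 3)}
print(costs)
```

With both predicates passing half the time, ordering 2 needs about half the energy of the other two, because the expensive magnetometer sample is skipped whenever pred2 fails.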
51
Optimizing in ACQP
• Model sampling as an “expensive predicate”
• Some subtleties:
  – Attributes referenced in multiple predicates; which to “charge”?
  – Attributes must be fetched before operators that use them can be applied
• Solution:
  – Treat sampling as a separate task
  – Build a partial order on sampling and predicates
  – Solve for cheapest schedule using series-parallel scheduling algorithm (Monma & Sidney, 1979), as in other optimization work (e.g. Ibaraki & Kameda, TODS, 1984, or Hellerstein, TODS, 1998)
52
Exemplary Aggregate Pushdown
  SELECT WINMAX(light,8s,8s)
  FROM sensors
  WHERE mag > x
  SAMPLE INTERVAL 1s

Unless > x is very selective, correct ordering is:
  Sample light
  Check if it’s the maximum
  If it is:
    Sample mag
    Check predicate
    If satisfied, update maximum
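The ordering above can be sketched for a single window as follows. The sketch assumes one 8-reading window, a callable that stands in for taking a magnetometer sample, and invented sensor values; it also counts how many mag samples are actually taken.

```python
def winmax_pushdown(light_samples, sample_mag, x):
    """Return (max light among readings with mag > x, mag samples taken)."""
    best = None
    mag_samples = 0
    for i, light in enumerate(light_samples):
        if best is not None and light <= best:
            continue                  # cannot raise the maximum: skip costly mag sample
        mag_samples += 1
        if sample_mag(i) > x:         # check the WHERE predicate
            best = light              # satisfied: update the maximum
    return best, mag_samples

# Hypothetical readings for one window of 8 samples.
lights = [5, 3, 9, 7, 9, 2, 4, 8]
mags = [10, 50, 1, 60, 70, 80, 90, 20]
result = winmax_pushdown(lights, lambda i: mags[i], x=15)
print(result)
```

Here the magnetometer is sampled 5 times instead of 8, and the answer is the same as filtering first and taking the maximum afterwards.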
53
Summary
• Declarative queries are the right interface for data collection in sensor nets!
• Aggregation is a fundamental operation for which there are many possible optimizations
  – Network-aware techniques
• Current research: Acquisitional Query Processing
  – Framework that addresses many of the new issues that arise in sensor networks, e.g.
    » Order of sampling and selection
    » Languages, indices, approximations that give user control over which data enters the system

TinyDB Release Available - http://telegraph.cs.berkeley.edu/tinydb
54
Questions?
55
Simulation Screenshot
56
TinyAlloc
• Handle-based compacting memory allocator
• For catalog, queries

User program:

  Handle h;
  call MemAlloc.alloc(&h, 10);    // allocate a 10-byte block; h is a handle to it
  …
  strcpy(*h, "Sam");              // copy a string into the block through the handle
  call MemAlloc.lock(h);          // lock the block while a raw pointer is in use
  tweakString(*h);
  call MemAlloc.unlock(h);
  call MemAlloc.free(h);

[Figure: free bitmap, heap, and master pointer table, shown before and after compaction]
57
Schema
• Attribute & Command IF
  – At INIT(), components register attributes and commands they support
    » Commands implemented via wiring
    » Attributes fetched via accessor command
  – Catalog API allows local and remote queries over known attributes / commands
• Demo of adding an attribute, executing a command
58
Q1: Expressiveness
• Simple data collection satisfies most users
• How much of what people want to do is just simple aggregates?
  – Anecdotally, most of it
  – EE people want filters + simple statistics (unless they can have signal processing)
• However, we’d like to satisfy everyone!
59
Query Language
• New Features:
  – Joins
  – Event-based triggers
    » Via extensible catalog
  – In-network & nested queries
  – Split-phase (offline) delivery
    » Via buffers
60
Sample Query 1
Bird counter:

  CREATE BUFFER birds(uint16 cnt)
    SIZE 1

  ON EVENT bird-enter(…)
    SELECT b.cnt+1
    FROM birds AS b
    OUTPUT INTO b
    ONCE
61
Sample Query 2
Birds that entered and left within time t of each other:

  ON EVENT bird-leave AND bird-enter WITHIN t
    SELECT bird-leave.time, bird-leave.nest
    WHERE bird-leave.nest = bird-enter.nest
    ONCE
62
Sample Query 3
Delta compression:

  SELECT s.light
  FROM buf, sensors AS s
  WHERE |s.light - buf.light| > t
  OUTPUT INTO buf
  SAMPLE PERIOD 1s
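The effect of the delta-compression query can be sketched outside the query language. The sketch assumes the buffer holds the last reported light value and that the very first reading is always reported; the function name and the sample values are illustrative.

```python
def delta_compress(readings, t):
    """Return only the readings that differ from the last reported
    reading by more than t."""
    reported = []
    last = None                      # plays the role of the one-tuple buffer
    for value in readings:
        if last is None or abs(value - last) > t:
            reported.append(value)
            last = value             # OUTPUT INTO buf
    return reported

compressed = delta_compress([100, 101, 103, 110, 111, 90], t=5)
print(compressed)
```

Only three of the six readings are reported, and each suppressed reading lies within t of the last value that was sent.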
63
Sample Query 4
Offline Delivery + Event Chaining

  CREATE BUFFER equake_data(uint16 loc, uint16 xAccel, uint16 yAccel)
    SIZE 1000
    PARTITION BY NODE

  SELECT xAccel, yAccel
  FROM SENSORS
  WHERE xAccel > t OR yAccel > t
  SIGNAL shake_start(…)
  SAMPLE PERIOD 1s

  ON EVENT shake_start(…)
    SELECT loc, xAccel, yAccel
    FROM sensors
    OUTPUT INTO BUFFER equake_data(loc, xAccel, yAccel)
    SAMPLE PERIOD 10ms
64
Event Based Processing
• Enables internal and chained actions
• Language semantics
  – Events are inter-node
  – Buffers can be global
• Implementation plan
  – Events and buffers must be local
  – Since n-to-n communication not (well) supported
• Next: operator expressiveness
65
Attribute Driven Topology Selection
• Observation: internal queries often over local area*
  – Or some other subset of the network
    » E.g. regions with light value in [10,20]
• Idea: build topology for those queries based on values of range-selected attributes
  – Requires range attributes, connectivity to be relatively static

* Heidemann et al. Building Efficient Wireless Sensor Networks with Low-Level Naming. SOSP, 2001.
66
Attribute Driven Query Propagation
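[Figure: nodes 1, 2, 3 annotated with precomputed intervals [1,10], [7,15], [20,40], plus node 4; the query below is disseminated over this tree]

  SELECT …
  WHERE a > 5 AND a < 12

Precomputed intervals == “Query Dissemination Index”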
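How the precomputed intervals prune dissemination can be sketched with an overlap test. The sketch assumes each interval is closed and belongs to the node with the same position in the figure (1, 2, 3); the dictionary and function names are hypothetical.

```python
def overlaps(interval, lo, hi):
    """True if closed interval [a, b] can contain a value v with lo < v < hi."""
    a, b = interval
    return a < hi and b > lo

# Assumed mapping of nodes to their precomputed attribute intervals.
index = {1: (1, 10), 2: (7, 15), 3: (20, 40)}

# Query: WHERE a > 5 AND a < 12
targets = [node for node, iv in sorted(index.items()) if overlaps(iv, 5, 12)]
print(targets)
```

Node 3 is skipped because its interval [20,40] cannot contain any value between 5 and 12.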
67
Attribute Driven Parent Selection
[Figure: node 4, with interval [3,6], choosing among nodes 1, 2, 3 with intervals [1,10], [7,15], [20,40]]

[3,6] ∩ [1,10] = [3,6]
[3,6] ∩ [7,15] = ø
[3,6] ∩ [20,40] = ø

Even without intervals, expect that sending to parent with closest value will help
68
Hot off the press…
[Figure: chart “Nodes Visited vs. Range Query Size for Different Index Policies”; x-axis: Query Size as % of Value Range (0.001, 0.05, 0.1, 0.2, 0.5, 1); y-axis: Number of Nodes Visited (0 to 450; 400 = Max); series: Best Case (Expected), Closest Parent, Nearest Value, Snooping. Random value distribution, 20x20 grid, ideal connectivity to (8) neighbors]
69
Grouping
• GROUP BY expr
  – expr is an expression over one or more attributes
    » Evaluation of expr yields a group number
    » Each reading is a member of exactly one group

Example:
  SELECT max(light)
  FROM sensors
  GROUP BY TRUNC(temp/10)

  Sensor ID   Light   Temp   Group
  1           45      25     2
  2           27      28     2
  3           66      34     3
  4           68      37     3

Result:
  Group   max(light)
  2       45
  3       68
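The example result can be reproduced with a few lines of Python. The readings are the four rows from the slide; the function name is illustrative, and integer division stands in for TRUNC(temp/10) on non-negative temperatures.

```python
# (sensor id, light, temp) rows from the example table.
readings = [(1, 45, 25), (2, 27, 28), (3, 66, 34), (4, 68, 37)]

def group_max_light(rows):
    """Compute max(light) per group, where group = TRUNC(temp / 10)."""
    result = {}
    for _sensor_id, light, temp in rows:
        group = temp // 10           # TRUNC(temp/10) for non-negative temps
        result[group] = max(light, result.get(group, light))
    return result

groups = group_max_light(readings)
print(groups)
```

Each reading lands in exactly one group, so the per-group maxima can be merged in any order.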
70
Having
• HAVING preds
  – preds filters out groups that do not satisfy predicate
  – versus WHERE, which filters out tuples that do not satisfy predicate
  – Example:
      SELECT max(temp)
      FROM sensors
      GROUP BY light
      HAVING max(temp) < 100
    Yields all groups whose maximum temperature is under 100
72
Experiment: Basic TAG
Dense Packing, Ideal Communication
[Figure: chart “Bytes / Epoch vs. Network Diameter”; x-axis: Network Diameter (10 to 50); y-axis: Avg. Bytes / Epoch (0 to 100,000); series: COUNT, MAX, AVERAGE, MEDIAN, EXTERNAL, DISTINCT]
74
Experiment: Effects of Loss
[Figure: chart “Percent Error From Single Loss vs. Network Diameter”; x-axis: Network Diameter (10 to 50); y-axis: Percent Error From Single Loss (0 to 3.5); series: AVERAGE, COUNT, MAX, MEDIAN]
75
Experiment: Benefit of Cache
[Figure: chart “Percentage of Network Involved vs. Network Diameter”; x-axis: Network Diameter (10 to 50); y-axis: % Network (0 to 1.2); series: No Cache, 5 Rounds Cache, 9 Rounds Cache, 15 Rounds Cache]
77
Pipelining Example

[Figure: routing tree with root 1; node 2 below 1; nodes 3 and 4 below 2; node 5 below 3. Each interior node keeps a table of (SID, Epoch, Agg.) records, and messages are <SID, Epoch, Agg.> triples. The slides step through epochs 0 to 4.]

Messages shown in each epoch:
  Epoch 0: <5,0,1>, <4,0,1>
  Epoch 1: <5,1,1>, <4,1,1>, <3,0,2>, <2,0,2>
  Epoch 2: <5,2,1>, <4,2,1>, <3,1,2>, <2,0,4>, <1,0,3>
  Epoch 3: <5,3,1>, <4,3,1>, <3,2,2>, <2,1,4>, <1,0,5>
  Epoch 4: <5,4,1>, <4,4,1>, <3,3,2>, <2,2,4>, <1,1,5>

Table contents (SID, Epoch, Agg.) as of epoch 2:
  Node 1: (1,0,1), (1,1,1), (2,0,2), (1,2,1), (2,0,4)
  Node 2: (2,0,1), (4,0,1), (2,1,1), (4,1,1), (3,0,2), (2,2,1), (4,2,1), (3,1,2)
  Node 3: (3,0,1), (5,0,1), (3,1,1), (5,1,1), (3,2,1), (5,2,1)
83
Our Stream Semantics
• One stream, ‘sensors’
• We control data rates
• Joins between that stream and buffers are allowed
• Joins are always landmark, forward in time, one tuple at a time
  – Result of queries over ‘sensors’ either a single tuple (at time of query) or a stream
• Easy to interface to more sophisticated systems
• Temporal aggregates enable fancy window operations
84
Formal Spec.

  ON EVENT <event> [<boolop> <event>... WITHIN <window>]
  [SELECT {<expr>|agg(<expr>)|temporalagg(<expr>)} FROM [sensors | <buffer> | events]]
  [WHERE {<pred>}]
  [GROUP BY {<expr>}]
  [HAVING {<pred>}]
  [ACTION [<command> [WHERE <pred>] |
           BUFFER <bufname> SIGNAL <event>({<params>}) |
           (SELECT ... ) [INTO BUFFER <bufname>]]]
  [SAMPLE PERIOD <seconds>
     [FOR <nrounds>]
     [INTERPOLATE <expr>]
     [COMBINE {temporal_agg(<expr>)}] |
   ONCE]
85
Buffer Commands
  [AT <pred>:]
  CREATE [<type>] BUFFER <name> ({<type>})
  PARTITION BY [<expr>]
  SIZE [<ntuples>,<nseconds>]
  [AS SELECT ...
     [SAMPLE PERIOD <seconds>]]

  DROP BUFFER <name>