ICEDB: Intermittently Connected Query Processing
Yang Zhang, Bret Hull, Hari Balakrishnan, and Samuel Madden
MIT Computer Science and AI Lab
April 17, 2007
Motivation: CarTel Project
Challenges
What data management architecture is best suited for mobile, wide-area sensing?
Higher data rates
• Cannot send all data back
• Must differentiate the data
Intermittent, variable connectivity
• Wi-Fi hotspots
• EVDO cellular
Solution: ICEDB
Intermittently Connected Embedded Database
• In-network query processing
• Buffering and prioritization
Roadmap
• Introduction and Motivation
• ICEDB Design
• Result Prioritization
• Experimental Evaluation
• Conclusion
ICEDB Design: Overview
Example query: "Show me photos of traffic jams. No duplicates."
[Figure: data sources and sensors, ICEDB Remote, wireless connection, ICEDB Server (portal); queries and results travel over the wireless connection]
#!/usr/bin/perl
# Adapter loop (pseudocode): read raw sensor data, convert it to a tuple, send it to ICEDB.
while (true) {
    raw = read(serial);
    tuple = convert(raw);
    send(icedb, tuple);
}
Adapter schema (data source)
Name        Type
latitude    double precision
longitude   double precision
altitude    double precision
…
time        time
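The adapter loop and schema above can be sketched in Python. This is only an illustration under stated assumptions: the raw input format (a comma-separated `lat,lon,alt,time` line) is hypothetical, `GpsTuple`, `convert`, and `run_adapter` are names invented here rather than ICEDB APIs, and the schema fields elided by "…" are left out.

```python
from dataclasses import dataclass


@dataclass
class GpsTuple:
    """One row matching the named columns of the adapter schema."""
    latitude: float
    longitude: float
    altitude: float
    time: str


def convert(raw: str) -> GpsTuple:
    """Parse one raw line in the assumed 'lat,lon,alt,time' format."""
    lat, lon, alt, time = raw.strip().split(",")
    return GpsTuple(float(lat), float(lon), float(alt), time)


def run_adapter(lines, sink):
    """Read raw lines, convert each to a tuple, and hand it to the sink.

    `sink` stands in for the local database; here it is just a list.
    """
    for raw in lines:
        sink.append(convert(raw))
```

In a real deployment the `lines` iterable would be a serial port or similar sensor feed, and the sink would be the node's local database.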
ICEDB Design: Remote Node
[Figure: remote node components: sensor, adapter, DB, CQ and ad-hoc query processor, output buffers, CAFNET]
ICEDB Design: Queries
SELECT ...
EVERY n [SECONDS]
BUFFER IN buffername
[Figure labels: DB, query, buffer, tuples]
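The EVERY and BUFFER IN clauses can be read as "re-run this query periodically and append its results to a named output buffer". The Python sketch below illustrates that reading only; `ContinuousQuery`, `RemoteNode`, and `tick` are hypothetical names, and the clock is passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict


class ContinuousQuery:
    """A query re-run every `every` seconds, writing into one named buffer."""

    def __init__(self, select, every, buffer_name):
        self.select = select          # callable: db -> list of result tuples
        self.every = every            # period in seconds (EVERY n SECONDS)
        self.buffer_name = buffer_name  # target buffer (BUFFER IN buffername)
        self.next_run = 0.0           # first run happens at time 0


class RemoteNode:
    """Holds registered queries and their output buffers."""

    def __init__(self):
        self.buffers = defaultdict(list)
        self.queries = []

    def register(self, query):
        self.queries.append(query)

    def tick(self, now, db):
        """Run every query whose period has elapsed and buffer its results."""
        for q in self.queries:
            if now >= q.next_run:
                self.buffers[q.buffer_name].extend(q.select(db))
                q.next_run = now + q.every
```

Results accumulate in the buffer whether or not the node is connected; what gets sent first once connectivity appears is a separate decision.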
Result Prioritization
• PRIORITY rank, weight: inter-query (local)
• DELIVERY ORDER BY: intra-query (local)
• SUMMARIZE AS: global
[Figure labels: remote nodes, buffer, tuples]
Local Prioritization: DELIVERY ORDER BY
• Background process
• Dynamic orderings
• UDF API: direct access to buffers
Example:
SELECT photo FROM camera
BUFFER IN camera_buf
DELIVERY ORDER BY bisect
[Figure: FIFO vs. bisect delivery order]
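The slides do not spell out the bisect ordering. One plausible reading, sketched below in Python, is to send the two endpoints of the buffer first and then repeatedly send the midpoint of each remaining gap, so that a short connection still yields results spread across the whole trip. `bisect_order` is a hypothetical helper written for this illustration, not the ICEDB UDF.

```python
from collections import deque


def bisect_order(n):
    """Return indices 0..n-1 in a bisecting delivery order.

    Endpoints come first, then the midpoint of each interval, breadth-first,
    so coverage is refined evenly as more results are delivered.
    """
    if n == 0:
        return []
    if n == 1:
        return [0]
    order = [0, n - 1]
    pending = deque([(0, n - 1)])
    while pending:
        lo, hi = pending.popleft()
        if hi - lo < 2:
            continue  # no index strictly between lo and hi
        mid = (lo + hi) // 2
        order.append(mid)
        pending.append((lo, mid))
        pending.append((mid, hi))
    return order


def deliver_bisect(buffer):
    """Reorder a buffer (oldest first) into bisect delivery order."""
    return [buffer[i] for i in bisect_order(len(buffer))]
```

For a buffer of five photos, `bisect_order(5)` gives `[0, 4, 2, 1, 3]`, whereas FIFO would send indices 0, 1, 2, 3, 4 and cover only the start of the trip if the connection drops early.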
Global Prioritization
Local prioritization is limited
• E.g. users interested in different prioritization
• E.g. different nodes carrying redundant data
SUMMARIZE AS
[Figure: ICEDB Remote and ICEDB Server exchanging summary, prioritization, and prioritized results]
Global Prioritization
Example: get speeds, maximizing coverage
SELECT lat, lon, ins_time, speed
FROM gps
BUFFER IN gps_buf
SUMMARIZE AS
    SELECT floor(lat/.001), floor(lon/.001), floor(ins_time/300)
    FROM gps_buf
    GROUP BY floor(lat/.001), floor(lon/.001), floor(ins_time/300)
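The SUMMARIZE AS subquery groups buffered readings into coarse cells of 0.001 degrees of latitude and longitude and 300 units of `ins_time`, yielding one summary row per occupied cell. A minimal Python sketch of that grouping follows; it assumes each buffered row is a dict with numeric `lat`, `lon`, and `ins_time` fields, and `summarize` is a name chosen for this example.

```python
import math


def summarize(gps_buf):
    """Return the distinct (lat cell, lon cell, time cell) groups in the buffer.

    Mirrors GROUP BY floor(lat/.001), floor(lon/.001), floor(ins_time/300).
    """
    cells = set()
    for row in gps_buf:
        cells.add((
            math.floor(row["lat"] / 0.001),
            math.floor(row["lon"] / 0.001),
            math.floor(row["ins_time"] / 300),
        ))
    return sorted(cells)
```

The summary is much smaller than the buffer when many readings fall in the same cell, which is what makes it cheap to send before the full results.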
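Global Prioritization
[Figure: six numbered steps (1 to 6) between the remote node and the ICEDB Server, with arrows labeled "to central server" and "from central server"]

Summary (to central server):
lat      lon      ins_time
31.415   27.182   7:30pm
31.423   27.179   7:35pm
…        …        …

Ranked summary (from central server):
lat      lon      ins_time   rank
31.423   27.179   7:35pm     1
31.415   27.182   7:30pm     2
…        …        …          …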
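Once the server returns ranks for the summary rows, the remote node can reorder its buffer so that results in higher-ranked groups are sent first. The Python sketch below shows one way to do that, assuming the grouping from the SUMMARIZE AS example query and numeric `ins_time` values; `cell_of` and `apply_ranks` are hypothetical helpers, and placing unranked rows last is a choice made for this illustration.

```python
import math


def cell_of(row):
    """Summary group of a buffered row (0.001 degree cells, 300-unit time bins)."""
    return (
        math.floor(row["lat"] / 0.001),
        math.floor(row["lon"] / 0.001),
        math.floor(row["ins_time"] / 300),
    )


def apply_ranks(buffer, ranks):
    """Order buffered rows by the server-assigned rank of their summary group.

    `ranks` maps a summary group to its rank (1 = send first). Rows whose
    group has no rank are sent last; ties keep their original order because
    Python's sort is stable.
    """
    return sorted(buffer, key=lambda row: ranks.get(cell_of(row), math.inf))
```

Because the server sees summaries from many nodes, it can give a low rank to groups another car has already covered, which a purely local ordering cannot do.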
Trace-Driven Simulation
• 232 days of normal driving (07/05 – 07/06)
• Boston and Seattle areas
• 260 distinct km of roads, 50% from 15 km
• 32,000 APs discovered, 2,000 open
• Mean time between APs: 23 seconds
• Mean association duration: 24 seconds
• Median TCP upload: ~200 kbytes
• Connectivity is equi-probable in [0,60] km/h
Experimental Evaluation: Setup
• Query workloads: uniform, hotspot
• Camera data: 50 KB
• Metric: fraction of query points satisfied
• Prioritization schemes: FIFO, bisect, random, global
• Cars: one, many
[Figure labels: query point, query point]
Experimental Evaluation: Results
• FIFO: zero success
• Random/bisect: ~0.25x success of global
• Bottleneck: not query count, but total network capacity
• Global: remote nodes and central server share data
Conclusion
• Challenges: data management in an intermittently connected, constrained-bandwidth environment
• ICEDB: distributed, delay-tolerant query processing
• Central declarative interface simplifies complicated network data prioritization problems
http://cartel.csail.mit.edu/