Distributed Cyber-Physical Systems
Tarek Abdelzaher
University of Illinois at Urbana-Champaign
5/28/2010

Where is Computer Science Research Going?
Core
The beginning: Centralized machines
Cyber-Physical Computing
Cyber-Physical (Distributed) Computing
Where is Computer Science Research Going?
Distributed Cyber-Physical Systems
For example: In the US, the President's Council of Advisors on Science and Technology named systems that interact with the physical world the #1 research priority in the US.
Distributed Cyber-Physical Systems: The Next Frontier
Today: Clusters, Farms, Grids, WWW
[Figure labels: Situation-Aware Distributed Embedded Systems (Emergency Response); Embedded Everywhere: Transparent, Context-aware, Mobile, Miniature, Ubiquitous (smart attire, smart spaces, …); Autonomic Computing; Privacy, Security; Processors; Cyber-Physical Systems]
Wearable and Ambient Sensors: Opportunistic Context Measurement
[Figure labels: Implantable Medical Devices, Biosensors; Biological System; Biometric Sensing; Context and Activity Sensing, NEAT-factor, etc.; Logging; Community/Social Network; Direct Control; Context Factors, Bio-feedback; Sanitized Community Data; Personal Data Services; Social Factors; Bio-loops; Medical care services; Care Providers, Physicians; Comparative Analysis]
Personalized Healthcare
[Figure labels: Point of Care Devices; Bio-feedback Sensors; Implanted Sensors (insulin pumps, pacemakers, glucose monitors, …); Smart Spaces; Wearable Activity and Biometric Monitoring; Human-Centric Sensing; Micro- and Nano-sensors, Biochips; Transparent Testing]
Browsing the Physical World
Feng Zhao, http://atom.research.microsoft.com/sensormap/

Military Applications
[Figure labels: Sensors, witnesses, sources, … (signal data, visual data, human data); Extract objects and linkages; Information Network; Social networks; Optimize resource allocation; Information with high QoI and quantifiable uncertainty; Prioritized situation-awareness; Feedback to sensor and communication networks]
Confluence of Trends: The Overarching Challenge
Trend 1: Device/Data Proliferation (by Moore's Law)
Trend 2: Integration at Scale (Isolation has cost)
Trend 3: Autonomy (Humans are not getting faster)
Distributed Cyber-Physical Systems
Interaction Challenges

Interaction Domains
In distributed cyber-physical systems, computation, communication and sensing interact in several domains:
Part I: Temporal interactions (Real-time systems)
Part II: Spatial interactions (Sensor networks)
Part III: Social interactions (Human-centric CPS)
Part I: Temporal Interactions (Distributed Real-time CPS)
Applications: Mission-critical Systems
Building Timely, Predictable, Reliable Systems
Temporal Analysis
Fundamental question: How to determine in a cyber-physical system that timing and throughput requirements are met?

Past (focus of the real-time community):
• Periodic tasks
• Known workload
• Small systems
• Microscopic execution models
• Complex analysis for non-trivial cases
(Optimality, accuracy, moderate scale)

Future (needed solutions):
• Aperiodically arriving tasks
• Largely unpredictable workload
• Large distributed systems
• Aggregate execution models
• Simple back-of-the-envelope analysis
(Simplicity, sufficiency, large scale)
Motivating Application (US Navy)
Courtesy of Patrick Lardieri, ATL, Lockheed Martin (2002)
Moving from custom stovepipe infrastructures to a common, COTS-based infrastructure

TODAY: Adjunct Processing (Aegis Baselines 5P3, 6)
• UYK-43s w/ COTS processors
• Point-to-point interconnection
• Display LAN
• Limited growth capability
• Vulnerable to damage

UNDER DEVELOPMENT: Networked Processing (Aegis Baseline 7)
• Open HW + operating system (COTS/industry standards)
• Distributed LAN interconnects
• Redundancy plus reconfigurability
• Significant growth capability
• Software distribution possible
• Vulnerable to large scale damage
• Highly constrained by legacy stovepipe systems

FUTURE: Total Ship Computing (TSCE)
• 1,000s of computer nodes connected by standard/COTS middleware on distributed switched backplane
• N-version redundancy (no single failure point)
• Virtually unlimited growth capability
• Software replicated on many CPUs/nodes
• Essentially invulnerable to battle damage
Total Ship Computing Environment (TSCE) Vision
Courtesy of Patrick Lardieri, ATL, Lockheed Martin (2002)

TSCE Design Goals
• Use COTS Infrastructure Technology
• Enable Plug-n-Play Component Architecture

TSCE Benefits
• Improve Performance
• Increase Extensibility
• Break Apart Application Stovepipes

TSCE Design Challenges
• Using shared computing and networking resources
• Satisfying a mix of hard and soft real-time performance requirements for periodic and aperiodic tasks
• Assuring mission and safety critical processing will always meet their deadlines

Radar Scheduling: Middleware and Applications
[Figure labels: Communication, Alarm Handling in Networked Cooperative Engagement; Threat Discovery; Target Tracking; Weapon Control]
• Unpredictable environment
• Time/QoS constraints
• Limited system capacity
Target Scenario
• A distributed system of multiple battleships
• A multifunction radar on each ship (node)
• A set of targets to be jointly tracked
Goal: Each target should be tracked by at least one node and at most K nodes.
[Figure labels: Target negotiation; Target Assignment; Schedulability Analysis]

The First Prototype
Zumwalt Class Destroyer Total Ship Computing Environment
Challenge
Establish a new analytic foundation for robust timing guarantees in highly dynamic, largely unpredictable, time-critical software systems

Methodology
Develop a more robust schedulability theory:
• A schedulability theory for aperiodic tasks (fewer rigid assumptions)
• Extend the theory to distributed systems
• Analyze end-to-end behavior of complex task graphs
Goal: Analyze Distributed Highly Coupled Systems
Complex timing behavior: Everything depends on everything

Historical Perspective: Utilization Bounds
• Consider a set of periodic tasks
• Each task invocation must finish by the end of its period (Period = Deadline)
• How to tell if all invocations will meet their deadlines?
Liu and Layland Utilization Bound
Well-known result (1973):
• Assume that each task T_i executes for C_i every period P_i.
• Processor utilization needed for this task is: $U_i = C_i / P_i$
• The task set is schedulable by an optimal fixed-priority scheduling policy if $\sum_i C_i / P_i < 0.69$
• The optimal fixed-priority policy is rate-monotonic (higher rate = higher priority)
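The utilization test above can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the n-dependent form n(2^(1/n) - 1) is the standard Liu and Layland expression whose limit for large n is ln 2, roughly the 0.69 quoted above.

```python
import math


def liu_layland_bound(n: int) -> float:
    """Utilization bound n * (2**(1/n) - 1) for n periodic tasks."""
    if n < 1:
        raise ValueError("need at least one task")
    return n * (2 ** (1.0 / n) - 1)


def passes_utilization_test(tasks) -> bool:
    """tasks is a list of (C_i, P_i) pairs; True means the sufficient test passes."""
    utilization = sum(c / p for c, p in tasks)
    return utilization <= liu_layland_bound(len(tasks))


# Hypothetical task set with total utilization 0.25 + 0.2 + 0.2 = 0.65
print(passes_utilization_test([(1, 4), (1, 5), (2, 10)]))
```

The test is only sufficient: a task set that fails it may still be schedulable, which is one reason the later slides favor simple, safe bounds over exact analysis.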
Real-Time Overview
[Figure: taxonomy of real-time task models and analyses, shown in three builds.
Periodic tasks, Deadline = Period: Rate Monotonic, EDF; bounds (classical, hyperbolic); optimality results.
Periodic tasks, Deadline < Period: Deadline Monotonic, EDF; bound (poor); per-task tests (simple, recursive); processor demand.
Aperiodic tasks, fixed-priority: Polling, Deferrable, Sporadic, Priority Exchange, Slack Stealing.
Aperiodic tasks, dynamic-priority: DPE, Sporadic, Total Bandwidth, TBS+, IPE, EDL, CBS.
Final build: treat periodic tasks as aperiodic.]
Why An Aperiodic Theory for Real-Time Systems?
Reason #1: Aperiodic tasks are an increasing proportion of workload. They are no longer the "exception".
• Consider a server serving randomly arriving requests
• Each request has a desired response time
• Can one invent an aggregate measurable utilization-like metric (we call it instantaneous utilization, U), such that all deadlines are met as long as U is below some threshold, U_max?
• Feasible region: $0 < U < U_{max}$

Reason #2: Even systems where tasks arrive periodically suffer aperiodic artifacts if there are multiple execution stages.
[Figure: periodic arrivals enter Stage 1; the flow is no longer periodic after Stage 1, even less periodic after Stage 2, and looks completely aperiodic after Stage 3]
Aperiodic Task and Instantaneous Utilization
• Instantaneous utilization U(t) is a function of time, t
• U(t) is defined over the current invocations (those that have arrived but whose deadline has not expired):
$U(t) = \sum_i C_i / D_i$
[Figure: three invocations on a timeline with deadlines D_1, D_2, D_3]
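As a rough illustration of the definition, the sketch below sums C_i/D_i over invocations that have arrived and whose deadline has not yet expired. The `Invocation` record and its field names are hypothetical, introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Invocation:
    arrival: float   # arrival time
    compute: float   # C_i, processing time
    deadline: float  # D_i, relative deadline


def instantaneous_utilization(invocations, t: float) -> float:
    """Sum C_i / D_i over invocations that are current at time t."""
    return sum(
        inv.compute / inv.deadline
        for inv in invocations
        if inv.arrival <= t < inv.arrival + inv.deadline
    )


# Hypothetical workload: three invocations with different deadlines
workload = [Invocation(0, 1, 4), Invocation(1, 1, 2), Invocation(5, 2, 4)]
print(instantaneous_utilization(workload, 1.5))
```

Note that an invocation keeps contributing until its deadline expires, even if it finishes executing earlier.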
The Single Queue Problem: Relaxing the Periodicity Assumption
Consider processing requests that arrive at a resource queue in some aperiodic pattern:
• Each request needs a processing time C_i
• Each request must meet a latency bound D_i
Is there a bound on $U(t) = \sum_i C_i / D_i$ (the instantaneous utilization of eligible requests) such that they are schedulable if this bound is not exceeded?
Note: The average utilization of the resource (percentage of non-idle time) is equal to the average of the synthetic utilization U(t) (of accepted tasks).
Solution: A set of aperiodic tasks is schedulable using an optimal fixed-priority policy if:
$U(t) \le 2 - \sqrt{2} \approx 0.586$
This allows reasoning about temporal correctness and real-time throughput of applications with irregular workloads.
[Figure: cumulative utilization versus time, with an arrival curve, an expiration curve, U(t) between them, and steps of height C_i/D_i]
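One way to use such a bound is as an admission test: accept a request only if the instantaneous utilization stays at or below the aperiodic bound 2 - √2. The sketch below is an assumption-laden illustration, not the authors' implementation; the class and method names are hypothetical, and it ignores refinements such as resetting utilization when the resource idles.

```python
import math

APERIODIC_BOUND = 2 - math.sqrt(2)  # about 0.586


class AdmissionController:
    """Tracks C/D contributions of admitted requests until their deadlines expire."""

    def __init__(self, bound: float = APERIODIC_BOUND):
        self.bound = bound
        self.active = []  # list of (absolute expiry time, C/D contribution)

    def utilization(self, now: float) -> float:
        # Drop contributions whose deadline has expired, then sum the rest.
        self.active = [(exp, u) for exp, u in self.active if exp > now]
        return sum(u for _, u in self.active)

    def try_admit(self, now: float, compute: float, deadline: float) -> bool:
        contribution = compute / deadline
        if self.utilization(now) + contribution <= self.bound:
            self.active.append((now + deadline, contribution))
            return True
        return False


controller = AdmissionController()
print(controller.try_admit(now=0, compute=1, deadline=4))
```

Rejecting a request at arrival keeps every admitted request inside the feasible region, which trades some throughput for a guarantee that is cheap to check.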
Observation
The utilization bound for aperiodic tasks is smaller than the Liu and Layland bound for periodic tasks. Why?
Example
Consider two tasks.
[Figure: two tasks on a timeline with intervals labeled D, D, and D/2; U(t) = 0.75]
Main Idea of Derivation
Minimize, over all arrival patterns ζ, the maximum U_ζ(t) that precedes a latency violation, where $U_\zeta(t) = \sum_i C_i / D_i$ (of eligible requests).
[Figure: U_ζ(t) versus t, marking the maximum U_ζ(t) and the latency violation]
Derivation
Observe that each request i contributes C_i to the area under the U_ζ(t) curve: it adds a rectangle of height C_i/D_i and width D_i.

Corollary
The total area under the U_ζ(t) curve is $\sum_i C_i$ over all arrived tasks.*
*Note: that is why the average U(t) is equal to the average resource utilization.
[Figure: U_ζ(t) versus t, built from rectangles of height C_i/D_i and width D_i, marking the maximum U_ζ(t) and the latency violation]
A Geometric Interpretation
• The sum $\sum_i C_i$ (area) must be at least equal to some deadline D_n if latency violations occur
• Minimize curve height given area = D_n
[Figure: U_ζ(t) versus t with rectangles of height C_i/D_i and width D_i, the maximum U_ζ(t), the latency violation, and intervals labeled D_n and < D_n]
A Realizable Worst-Case Arrival Pattern
[Figure, shown in several builds: U_ζ(t) curve with area = D_n, built from steps of height C_i/D_n and falling with slope 1/D_n; labels D_n and Dmax=Dn-]
How big are the individual tasks when the bound is minimized?
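Basic Result
• Area cannot be rectangular because task departures gradually decrease utilization
• Proved that the minimum-height curve is a trapezoid with slope 1/D_n
[Figure: trapezoid of height U and slope 1/D_n over intervals D_n and < D_n]
Setting the trapezoid area equal to D_n:
$U (2 D_n + 2 D_n - U D_n) / 2 = D_n$
$U^2 - 4U + 2 = 0$
$U = 2 - \sqrt{2}$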
Extensions
211)( tU
Dn
Slope = 1/DnU
α is the urgency inversion (the minimum ratio of the deadline of a low-priority task to the deadline of a high-priority task that can preempt it).
γ = max(Blocking/D) is the maximum time a task can be blocked (by a lower-priority task) relative to its deadline.
U t( ) 1 1 2 2
Dn <Dn/a
Dn-g
Slope = 1/DnU
Dn <Dn/a
Other scheduling policies:
Multi-stage Resource Pipelines
Can one derive a utilization-based expression for schedulability of a pipeline?
[Figure: Machine 1 and Machine 2 in series with utilizations U_1 and U_2; plot with axes U_1 and U_2]
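Multi-stage Resource Pipelines
[Figure: pipeline Stage 1, …, Stage j-1, Stage j, Stage j+1, …, Stage N; each stage is a queue feeding a resource]
• Assume the delay of task n on stage j does not exceed L
• The sum $\sum_i C_i$ under the utilization curve is at least L
[Figure: utilization curve U_j of stage j with slope 1/D_n; intervals labeled D_n and L]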
The Stage Delay Theorem
Consider the worst case pattern:
[Figure: trapezoid of height U_j and slope 1/D_n with area = L, spanning intervals L and D_n]
Setting the trapezoid area equal to L:
$U (2L + 2 D_n - U D_n) / 2 = L$
$U D_n (1 - U/2) = (1 - U) L$
$\frac{L}{D_n} = \frac{U (1 - U/2)}{1 - U}$
Constructing Feasible Regions: The Stage Delay Theorem
The Stage Delay Theorem: If the instantaneous utilization of resource i does not exceed U_i, then no task is queued on resource i under deadline monotonic scheduling for more than a fraction β of its end-to-end deadline, where:
$\beta = U_i (1 - U_i/2) / (1 - U_i)$
[Figure: resource i with utilization U_i and delay fraction β; instantaneous utilization $U(t) = \sum_i C_i / D_i$ over invocations with deadlines D_1, D_2, D_3]
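The relation β = U(1 - U/2)/(1 - U) can be evaluated in either direction. The sketch below computes β from a utilization and, as a derived convenience that is not on the slides, inverts the relation by solving the quadratic U² - 2(1 + β)U + 2β = 0 for its root in [0, 1). Function names are hypothetical.

```python
import math


def stage_delay_fraction(u: float) -> float:
    """Fraction of the end-to-end deadline a task may spend queued at a stage with utilization u."""
    if not 0 <= u < 1:
        raise ValueError("utilization must be in [0, 1)")
    return u * (1 - u / 2) / (1 - u)


def max_utilization_for_fraction(beta: float) -> float:
    """Largest utilization whose stage delay fraction does not exceed beta."""
    if beta < 0:
        raise ValueError("beta must be non-negative")
    return (1 + beta) - math.sqrt(1 + beta * beta)


# A single stage allowed the whole deadline (beta = 1) recovers the 2 - sqrt(2) bound.
print(max_utilization_for_fraction(1.0))
```

The inverse form is handy for budgeting: splitting a deadline evenly over N stages gives each stage β = 1/N and hence a per-stage utilization cap.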
Main Result: Schedulability of Resource Pipelines
• Let U_1, U_2, …, U_n be the instantaneous utilization values of n stages in a pipeline
• All end-to-end deadlines are met if:
$\sum_{i=1}^{n} \frac{U_i (1 - U_i/2)}{1 - U_i} \le 1$
[Figure: Stage 1, Stage 2, …, Stage n with utilizations U_1, U_2, …, U_n]
Schedulability Regions of Arbitrary Task Graphs
• Let U_1, U_2, …, U_n be the synthetic utilization values of n servers S_1, S_2, …, S_n in a distributed system
• Client requests traverse a task graph with multiple flows
• Client requests meet their end-to-end deadlines if:
$\max_{path} \sum_{S_i \in path} \frac{U_i (1 - U_i/2)}{1 - U_i} \le 1$
[Figure: task graph with Path 1, Path 2, Path 3]
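The task-graph condition takes the worst path through the graph. A minimal sketch follows, assuming the paths are enumerated explicitly as lists of server names; the server names and utilization values are hypothetical.

```python
def path_load(utilization, path) -> float:
    """Sum of U(1 - U/2)/(1 - U) over the servers on one path."""
    return sum(
        utilization[s] * (1 - utilization[s] / 2) / (1 - utilization[s])
        for s in path
    )


def graph_feasible(utilization, paths) -> bool:
    """True if the most loaded path still satisfies the end-to-end condition."""
    return max(path_load(utilization, p) for p in paths) <= 1


# Hypothetical servers and flows
utilization = {"S1": 0.5, "S2": 0.2, "S3": 0.3}
paths = [["S1", "S2"], ["S1", "S3"]]
print(graph_feasible(utilization, paths))
```

Because the condition is a maximum over paths, a single heavily loaded server shared by many flows can make the whole graph fail the test.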
Example
Is every task T schedulable in the pipeline below?

Task     Stage 1 Comp. Time   Stage 2 Comp. Time   Deadline
Task 1   0.1                  0.1                  1
Task 2   0.2                  0.1                  2
Task 3   1                    0.125                5
Task 4   2                    0.5                  20

Synthetic utilization: U = 0.5 at Stage 1 and U = 0.2 at Stage 2.
0.5(1 - 0.25)/0.5 + 0.2(1 - 0.1)/0.8 = 0.75 + 0.225 = 0.975 < 1, so OK!
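The example can be checked numerically. The sketch below recomputes the per-stage synthetic utilization from the task table (sum of C/D per stage) and then evaluates the pipeline condition; the data layout and function names are hypothetical.

```python
# (stage 1 computation time, stage 2 computation time, end-to-end deadline)
TASKS = [
    (0.1, 0.1, 1),
    (0.2, 0.1, 2),
    (1.0, 0.125, 5),
    (2.0, 0.5, 20),
]


def synthetic_utilization(tasks, stage: int) -> float:
    """Sum of C/D over all tasks for the given stage index (0-based)."""
    return sum(task[stage] / task[2] for task in tasks)


def pipeline_load(utilizations) -> float:
    """Left-hand side of the pipeline schedulability condition."""
    return sum(u * (1 - u / 2) / (1 - u) for u in utilizations)


utils = [synthetic_utilization(TASKS, s) for s in range(2)]
load = pipeline_load(utils)
print(utils, load, load <= 1)
```

The load stays below 1 with little margin, so adding even a small fifth task could push the pipeline outside the feasible region.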
References
Utilization bound for aperiodic tasks:
Tarek Abdelzaher, Vivek Sharma, Chenyang Lu, "A Utilization Bound for Aperiodic Tasks and Priority Driven Scheduling," IEEE Transactions on Computers, Vol. 53, No. 3, March 2004. Earlier version appeared in: Tarek Abdelzaher, Chenyang Lu, "Schedulability Analysis and Utilization Bounds for Highly Scalable Real-Time Services," IEEE Real-Time Technology and Applications Symposium, Taipei, Taiwan, June 2001.
Extension to resource requirements:
Tarek Abdelzaher and Vivek Sharma, "A Synthetic Utilization Bound for Aperiodic Tasks with Resource Requirements," 15th Euromicro Conference on Real-Time Systems, Porto, Portugal, July 2003.
Extension to pipelines:
Tarek Abdelzaher, Gautam Thaker, Patrick Lardieri, "A Feasible Region for Meeting Aperiodic End-to-end Deadlines in Resource Pipelines," IEEE International Conference on Distributed Computing Systems, Tokyo, Japan, March 2004.
Capacity Planning for Real-Time Wireless Sensor Networks
• Seminal recent work established wireless network capacity bounds
• What if traffic has latency requirements and only bits that make it by the latency constraint are counted towards throughput?
• Problem: express real-time network capacity that quantifies the throughput of timely bits only as a function of network parameters and time constraints
[Figure: n nodes in area S; sampling rate?]
Defining Real-time Network Capacity
• Network bandwidth is the bottleneck
• Intuitively, network ability to meet time constraints decreases with:
  • Increased data size, C
  • Increased distance between source and destination, L
  • Decreased end-to-end latency constraint, D
• Schedulability decreases with CL/D
• Is there a bound Capacity_RT, such that all packets, i, reach their destinations by their deadlines if:
$\sum_i \frac{C_i L_i}{D_i} \le Capacity_{RT}$
The Single Hop Problem
• Only one node in the vicinity of a receiver can send at a time
[Figure: nodes A and B within the receiver's radio range, modeled as an equivalent virtual queue]
Throughput Optimization: Maximizing Real-Time Capacity
• Consider a localized communication pattern where each node communicates with nodes at most N hops away.
• On any path, maximize $\sum_j U_j$ subject to $\sum_{j=1}^{N} \frac{U_j (1 - U_j/2)}{1 - U_j} \le 1$
• From symmetry, U_j = U, so $\frac{U (1 - U/2)}{1 - U} = \frac{1}{N}$
• Hence, $U = 1 + \frac{1}{N} - \sqrt{1 + \left(\frac{1}{N}\right)^2}$
Real-Time Capacity
The total capacity theorem: In a load-balanced network of n nodes, each with a radio of transmission speed W and m neighbors on average, if communication is localized within at most N hops:
$Capacity_{Opt} = \frac{nW}{m} \left( 1 + \frac{1}{N} - \sqrt{1 + \left(\frac{1}{N}\right)^2} \right)$
For large N:
$Capacity_{Opt} \approx \frac{nW}{mN}$
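The per-hop utilization and total capacity expressions can be evaluated directly. This is a sketch under the assumption that the capacity takes the form (nW/m) times the optimal per-hop utilization, which is consistent with the large-N limit nW/(mN) stated above; the function names and the sample parameter values are hypothetical.

```python
import math


def per_hop_utilization(n_hops: int) -> float:
    """Symmetric per-hop utilization U solving U(1 - U/2)/(1 - U) = 1/N."""
    inv = 1.0 / n_hops
    return 1 + inv - math.sqrt(1 + inv * inv)


def total_capacity(n_nodes: int, speed: float, neighbors: float, n_hops: int) -> float:
    """Real-time capacity (n W / m) * U for communication localized within N hops."""
    return n_nodes * speed / neighbors * per_hop_utilization(n_hops)


# Hypothetical network: 100 nodes, 250 kbit/s radios, 5 neighbors on average
for hops in (1, 2, 5, 20):
    print(hops, per_hop_utilization(hops), total_capacity(100, 250.0, 5, hops))
```

For N = 1 the per-hop utilization reduces to 2 - √2, the single-resource aperiodic bound, and for growing N it approaches 1/N.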
Real-time Capacity of Multi-hop Data-Collection in Sensor Networks
In a data collection sensor network with K collection points, maximum path length N, and radio transmission speed W, what is a sufficient bound on real-time capacity?
[Figure: example network with K = 3, N = 2]
With data conservation constraints:
$\sum_{j \in path} \frac{U_j (1 - U_j/2)}{1 - U_j} \le 1, \quad U_j \propto 1/j$
a sufficient bound on real-time capacity is:
$Capacity_{DC} = \frac{KNW}{1 + 0.5 \ln N}$
Extensions
• Contention due to nodes outside the receiver's neighborhood cuts capacity in half (worst case):
$Capacity_{DC} = \frac{KNW}{2 + \ln N}$
• An interference region (area) m_i that is larger than the reception region m_r scales capacity down by m_r/m_i:
$Capacity_{DC} = \frac{m_r}{m_i} \cdot \frac{KNW}{2 + \ln N}$
Other considerations:
• MAC-layer overhead
• Different traffic prioritization policies
Evaluation: Effect of the Number of Sinks on Schedulability
Simulation versus analytic prediction of the onset of deadline misses in a 1600 node network

Evaluation: Effect of the Sensor Radio Radius on Schedulability
Simulation versus analytic prediction of the onset of deadline misses in a 1600 node network

Experimental Evaluation
• Comparing empirically measured and theoretical real-time capacity on a testbed of Mica2 motes
• TDMA-based MAC layer with locally prioritized queues
[Figure: bar chart of theoretical versus empirical real-time capacity for 9, 16, and 25 nodes; vertical axis from 0 to 14]
References
Real-time capacity:
Tarek F. Abdelzaher, Shashi Prabh, Raghu Kiran, "On Real-time Capacity Limits of Multihop Wireless Sensor Networks," IEEE Real-time Systems Symposium, Lisbon, Portugal, December 2004.
Methodology
Develop a more robust schedulability theory:
• A schedulability theory for aperiodic tasks (fewer rigid assumptions)
• Extend the theory to distributed systems
• Analyze end-to-end behavior of complex task graphs

Towards a Theory for Temporal Composition
• Kirchhoff's laws provide simple composition rules to reduce complex circuits to single components
• Composition rules for distributed systems? Delay? Capacity?
• Analogy with circuit or control theory
Distributed Cyber-Physical Systems
Given two points, determine end-to-end schedulability properties: reduce the system to an equivalent single block

Main Results in Schedulability of Multistage Execution
[Figure: Task T traverses Machine 1, Machine 2, …, Machine n with utilizations U_1, U_2, …, U_n]
$\sum_{i=1}^{n} \frac{U_i (1 - U_i/2)}{1 - U_i} \le 1$
Tight bound!
Main Results in Schedulability of Multistage Execution
• The bound $\sum_{i=1}^{n} \frac{U_i (1 - U_i/2)}{1 - U_i} \le 1$ covers the worst case scenario for Task T: cross traffic
• What if all tasks follow the same multistage path (Task T and others traverse Machine 1, Machine 2, …, Machine n)?
• Better schedulability! The adversary has fewer degrees of freedom!
Main Results in Schedulability of Multistage Execution
A fundamental question: How does the delay of a low priority task depend on the execution times of higher priority tasks on each stage?
[Figure: Task T and others traverse Machine 1, Machine 2, …, Machine n with utilizations U_1, U_2, …, U_n]
Delay Composition in Pipelined Execution
• Many jobs, one stage: $Delay = \sum_{i \in jobs} C_{i,1}$
• Many stages, one job: $Delay = \sum_{j \in stages} C_{1,j}$
• Many jobs, many stages: is $Delay = \sum_{i \in jobs} \sum_{j \in stages} C_{i,j}$?
[Figure: timelines of jobs J1, J2, J3 executing on stages S1, S2, S3]
Delay Composition
[Figure, shown in several builds: Tasks T1, T2, T3 traverse Machine 1, Machine 2, …, Machine n; timelines show their execution on Machine 1, Machine 2, and Machine 3]
Equivalent uniprocessor? [Figure labels: 2 C_max of T1; 2 C_max of T2; Σ_j C_max of stage j]
Proof Outline
• Proof by induction on task priority
• Prove the delay composition theorem for a 2-task system; add a task of higher priority and show that the theorem still holds
• No. of terms in delay expression = No. of tasks + (No. of stages - 1)
[Figure: two-task system; labels: sum of J1's execution times, sum of J2's execution times]

Intuition: Two Task System (1/2)
Case 1: J2 arrives together with or before J1
N+1 stage computation times: one from each stage, except stage L, which has two
Intuition: Two Task System (2/2)
Case 2: J2 arrives after J1
Let J2 preempt J1 at stage j; let stage L be the last stage where J1 waits for J2

Delay Composition in Pipelined Execution
$Delay \le \sum_{i \in \text{higher priority jobs}} 2 \max_{\text{all stages } j} C_{i,j} + \sum_{\text{all stages } j} \max_{i \in \text{higher priority jobs}} C_{i,j}$
[Figure: Task T_m and others traverse Machine 1, Machine 2, …, Machine n; matrix of execution times C_{i,j} for jobs i = 1..4 on stages j = 1..n]
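The delay composition bound has two parts: a job-additive part (twice the largest stage execution time of each higher-priority job) and a stage-additive part (the largest execution time of any such job at each stage). The sketch below evaluates that expression for a matrix of execution times. The function name and matrix layout are hypothetical, and whether the job under analysis is included among the rows is left to the caller as an assumption.

```python
def delay_composition_bound(exec_times) -> float:
    """exec_times[i][j] is C_{i,j}: execution time of job i at stage j.

    Returns sum_i 2 * max_j C_{i,j} + sum_j max_i C_{i,j}.
    """
    if not exec_times or not exec_times[0]:
        raise ValueError("need at least one job and one stage")
    n_stages = len(exec_times[0])
    job_additive = sum(2 * max(row) for row in exec_times)
    stage_additive = sum(
        max(row[j] for row in exec_times) for j in range(n_stages)
    )
    return job_additive + stage_additive


# Hypothetical two-job, three-stage pipeline
print(delay_composition_bound([[1, 1, 1], [2, 1, 3]]))
```

Each job contributes to the bound through only one of its stage execution times in the first term, which is why the bound grows more slowly than the sum of all C_{i,j}.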
Properties
● Worst case delay of a job is sub-additive in the execution times of higher priority jobs that share its path
● Results applicable to both periodic and aperiodic tasks
● Results applicable under any scheduling policy where job prioritization is consistent across stages
[Figure: Task T and others traverse Machine 1, Machine 2, …, Machine n; the delay expression combines a sum over jobs of the maximum of C_{i,j} over all stages with a sum over all stages of the maximum of C_{i,j} over jobs]

An Idea: Reduce pipeline to equivalent uniprocessor!
● Can we create an artificial uniprocessor task set for which Task T has the same delay as that above?
● Hint: Express the computation time of each "uniprocessor" task in terms of the pipeline tasks
A Recent Result: Delay Composition Algebra
A compositional algebraic framework to analyze timing issues in distributed systems:
• Operands represent workload on composed sub-systems
• Set of operators to systematically transform distributed real-time task systems to uniprocessor task systems
• Traditional uniprocessor analysis can be applied to infer schedulability of distributed tasks
Delay Composition Algebra
Two operators that apply to node workloads while simplifying the resource graph:
• W = PIPE(W1, W2)
• W = SPLIT(W1)
[Figure sequence: PIPE and SPLIT applied step by step to reduce a resource graph]
The Workload Matrix
• Each cell (i, j) holds the delay that job i imposes on job j in the subsystem that the matrix represents.
• Example: System consisting of four jobs (J1-J4), of which only J1, J2, and J4 execute on the stage
Example
[Figure: resource graph with stages S1 to S8]
• T1: P1 = D1 = 10 units
• T2: P2 = D2 = 20 units
• T3: P3 = D3 = 20 units
• Every job requires one unit at each stage

Example contd.
Step 1: A1 PIPE A3 = A3' (stage S3')
Step 2: A2 PIPE A3' = A3'' (stage S3'')
Step 3: A7 PIPE A8 = A7' (stage S7')
Step 4: A4 PIPE A5 = A4' (stage S4')
Step 5: A6 PIPE A7' = A6' (stage S6')
Step 6: SPLIT(A3'') => A31, A32 (stages S31, S32)
SPLIT operator:
- Replicate the matrix and zero out columns corresponding to jobs that don't follow the arc in question
- Replace (q_{i,k}, r_{i,k}) with (0, q_{i,k} + r_{i,k}) if J_i and J_k follow different arcs (in the output matrix containing J_k)
Step 7: A31 PIPE A4' = A4'' (stage S4'')
Step 8: A32 PIPE A6' = A6'' (stage S6'')
Step 9: A4'' PIPE A6'' = A_final (stage S_final)
[Figures: the PIPE operator applied to the workload matrices at each step]
Schedulability test for T3
• Create T1* with computation time 4 (two times the value in T1's row and T3's column), period 10
• T2*: computation time 2 units, period 20
• T3*: computation time 1 + 5 = 6 units (adding the stage-additive component), period 20
• Apply a uniprocessor test: response time analysis
• T3*'s delay is computed as 16 units < period; schedulable
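The final uniprocessor step can be reproduced with the classical response-time iteration R = C + Σ ⌈R/P_h⌉ C_h over higher-priority tasks. The sketch below assumes T1* and T2* both have higher priority than T3*, which the example implies but does not state; the function name is hypothetical.

```python
import math


def response_time(compute, higher_priority, deadline):
    """Fixed-point response-time analysis on a uniprocessor.

    higher_priority is a list of (C_h, P_h) pairs.
    Returns the response time, or None if it exceeds the deadline.
    """
    r = compute
    while r <= deadline:
        nxt = compute + sum(
            math.ceil(r / period) * c for c, period in higher_priority
        )
        if nxt == r:
            return r
        r = nxt
    return None


# T3* (C = 6, period 20) with T1* (C = 4, period 10) and T2* (C = 2, period 20)
print(response_time(6, [(4, 10), (2, 20)], deadline=20))
```

The iteration visits 6, 12, and 16 and then stops at 16, matching the delay quoted for T3*.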
Handling Non-Acyclic Graphs
• Idea: Modify the non-acyclic graph into an acyclic graph
• CUT operator: 'cut' one of the arcs forming the cycle; each job crossing the arc that is cut is replaced by two independent jobs, one before the cut and one after
• CUT only relaxes precedence constraints on arrival times of jobs, allowing jobs to arrive in a manner that can cause worst case delay. This reduces schedulability, so the transformation errs on the safe side
Conclusions
• Presented a delay composition rule for pipelined systems
  • Leads to a reduction of the pipeline to single-stage systems
  • Can now apply well known uniprocessor schedulability analysis to analyze pipelines
• Presented Delay Composition Algebra, for composing together resource stages in real-time distributed systems, under both preemptive and non-preemptive scheduling
  • Operators and operands to successively reduce a distributed system to a single equivalent uniprocessor for the purpose of schedulability analysis
• Future work: develop a complete calculus for modeling and analyzing real-time distributed systems
Placement with Previous Theoretic Foundations

Network Calculus
• deterministic and stochastic versions
• network-traffic oriented
• assumptions on traffic arrivals
• generally accurate analysis

Real-Time Scheduling Theory
• deterministic analysis
• mostly for periodic tasks
• hard to analyze aperiodic distributed systems
• detailed low-level models
• accurate analysis

Feasible Region Analysis
• deterministic analysis
• aperiodic tasks
• very simple analysis
• high-level models
• only sufficient conditions

Queueing Theory
• probabilistic analysis
• known distribution assumptions
• hard to analyze (e.g., correlated traffic)
• accurate analysis

Real-Time Queueing Theory
• probabilistic analysis of miss ratio
• liquid task model only
• accurate analysis
Towards Sufficient Simplicity
Current analysis techniques are: exact, complex, resource optimal
Envisioned analysis techniques should be: less error prone (simplicity), more scalable (simplicity), and safe (sufficiency)

Feasible Region Analysis
• deterministic analysis
• aperiodic tasks
• very simple analysis
• high-level models
• only sufficient conditions
• the safe "quick-and-dirty" solution
References
Delay Composition Algebra
Praveen Jayachandran and Tarek Abdelzaher, "A Delay Composition Theorem for Real-Time Pipelines," Euromicro Conference on Real-Time Systems, Pisa, Italy, July 2007.
Praveen Jayachandran and Tarek Abdelzaher, "Delay Composition in Preemptive and Non-preemptive Real-Time Pipelines," Journal of Real-Time Systems, Volume 40, Number 3, 2008.
Praveen Jayachandran and Tarek Abdelzaher, "Transforming Distributed Acyclic Systems into Equivalent Uniprocessors Under Preemptive and Non-Preemptive Scheduling," ECRTS, Prague, Czech Republic, July 2008.
Praveen Jayachandran and Tarek Abdelzaher, "Delay Composition Algebra: A Reduction-based Schedulability Algebra for Distributed Real-Time Systems," IEEE Real-time Systems Symposium, Barcelona, Spain, December 2008.
Praveen Jayachandran and Tarek Abdelzaher, "End-to-End Delay Analysis of Distributed Systems with Cycles in the Task Graph," Euromicro Conference on Real-Time Systems, Dublin, Ireland, July 2009.
Part II: Spatial Interactions (Distributed Sensing Systems)

Sensor Networks
Applications
• Precision Agriculture
• Habitat Monitoring
• Infrastructure Protection
• Disaster Response
• Target Tracking
• Border Control
Features
• Ad hoc deployment
• Massive distribution
• Interaction with a physical environment
• Unattended operation
A Fundamental Challenge: Sensor Network Application Development
• Cost of sensor network software will dominate total system cost
• Example: U. Virginia's VigilNet
  • Cost of hardware: $10K (100 nodes * $100/node at scale)
  • Cost of software development/debugging/testing: $120K (5 graduate students * 20 weeks * 40 hrs * $30/hr)
  • Hardware cost is decreasing but programmers' cost isn't
• Reducing software cost
  • Development cost: reusable components, high-level abstractions
  • Debugging cost: automated checking and analysis tools
  • Testing cost: realistic simulation/emulation environments
130
Can youimplement the
protocol onMicaZ?
Sure.It will bedone in 3
days.
Typical Sensor Network Application Developer
5/28/2010
66
131
Can youimplement the
protocol onMicaZ?
Sure.It will bedone in 3
days.
Typical Sensor Network Application Developer
132
Typical Sensor Network Application Developer
Ok!Coding complete
Lets test the
application on 2motes.
5/28/2010
67
133
Typical Sensor Network Application Developer
Hurray! It works for 2 motes! Lights are blinking!
Typical Sensor Network Application Developer
OK! It’s time to show off. Let’s run the application on 30 motes.
Typical Sensor Network Application Developer
What? What happened? Everything is sooo… dead!
Typical Sensor Network Application Developer
Think hard! What could possibly be wrong?
Typical Sensor Network Application Developer
What? Still dead? I thought I fixed all bugs!
“I thought you said 3 days!”
“It works on 3 nodes!”
Debugging in Sensor Networks
Many clever debugging techniques have been proposed to increase visibility into state
They help programmers step through distributed code, narrow down errors to code snippets (e.g., a bad pointer reference), use high-level source, etc.
We would like to complement those with a diagnostic capability
Initial Thoughts
Log state from multiple runs
Label runs as “good” or “bad” depending on whether errors were manifested
Use a classifier to infer a predicate on state that distinguishes “good” from “bad” runs
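As a minimal sketch of this idea (not the classifier used in the cited work), a one-rule classifier can search the logged state for a single threshold predicate that separates good runs from bad runs. The feature names and values below are hypothetical and chosen only for illustration.

```python
# Minimal sketch: infer a one-feature threshold predicate separating runs.
# Feature names and values are hypothetical, for illustration only.

def best_predicate(runs):
    """runs: list of (state, label); state is a dict of numeric features,
    label is "good" or "bad". Returns (feature, threshold, accuracy) for the
    predicate `state[feature] <= threshold  =>  bad` with highest accuracy."""
    best = None
    features = sorted({name for state, _ in runs for name in state})
    for feature in features:
        for threshold in sorted({state[feature] for state, _ in runs}):
            # A run is classified correctly when the predicate fires
            # exactly on the runs labeled "bad".
            correct = sum(
                (state[feature] <= threshold) == (label == "bad")
                for state, label in runs
            )
            accuracy = correct / len(runs)
            if best is None or accuracy > best[2]:
                best = (feature, threshold, accuracy)
    return best


runs = [
    ({"member_to_leader_msgs": 12, "leader_handoffs": 3}, "good"),
    ({"member_to_leader_msgs": 9, "leader_handoffs": 5}, "good"),
    ({"member_to_leader_msgs": 0, "leader_handoffs": 4}, "bad"),
    ({"member_to_leader_msgs": 0, "leader_handoffs": 2}, "bad"),
]
feature, threshold, accuracy = best_predicate(runs)
print(f"bad if {feature} <= {threshold} (accuracy {accuracy:.2f})")
```

A real diagnostic tool would likely use a richer classifier; the point here is only that the output is a human-readable predicate over logged state.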
Case Study: Envirotrack
Envirotrack
• Distributed target tracking protocol
• Assigns unique leader
• Members send data to leader
Target movement causes leader handoff
Sometimes the protocol resulted in more than one “detected target” for one physical target.
• Why?
Case Study: Envirotrack
Good run = no spurious targets
Bad run = spurious targets
Logged state: all message headers
Primary cause of failure: no member-to-leader messages (singleton group)
Uncovered bug: singleton groups are not addressed
However… Success was Limited
Cause of failure is often distributed across multiple components rather than local to a single one
No single component is to blame
No predicate over current state explains failure
Unexpected sequences of events lead to problems
Troubleshooting requires correlation of distributed sequences of events with manifestations of anomalies
A “Kitchen” Analogy
Normal run: prepare the chicken, onions…; add salt; add seasoning; marinate the chicken; put in the oven (3:00 pm); it’s time to stop the oven (4:00 pm). Result: delicious chicken.
Failed run: the same steps, plus one unexpected event: it’s daylight saving time, so the clock is set one hour behind. Who burnt the chicken?
Target: Intermittent error manifestations due to interactive complexity
Main Idea
Run the system multiple times and log key runtime events
Sometimes it works, sometimes it fails
Find the frequent sequences of events when the system worked
Find the frequent sequences of events when the system failed
Contrast the two to identify the “culprit” sequences of events (correlated with failure)
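The contrast step can be sketched as a set difference between the patterns that are frequent in failing runs and those that are frequent in working runs. The sketch below makes two simplifying assumptions that are not taken from the talk: patterns are contiguous pairs of events, and "frequent" means "appears in at least `min_logs` logs". The event names are hypothetical.

```python
# Minimal sketch of the contrast step. Patterns here are contiguous event
# pairs, and "frequent" means "appears in at least min_logs logs"; both are
# simplifying assumptions made for this illustration.
from collections import Counter

def frequent_pairs(logs, min_logs):
    """Return the set of adjacent event pairs that occur in >= min_logs logs."""
    counts = Counter()
    for log in logs:
        counts.update(set(zip(log, log[1:])))  # count each pair once per log
    return {pair for pair, n in counts.items() if n >= min_logs}

def culprit_pairs(good_logs, bad_logs, min_logs=2):
    """Pairs frequent in failing runs but not frequent in working runs."""
    return frequent_pairs(bad_logs, min_logs) - frequent_pairs(good_logs, min_logs)

good_logs = [
    ["recv", "write_buffer", "read_buffer", "process"],
    ["recv", "write_buffer", "read_buffer", "process"],
]
bad_logs = [
    ["recv", "read_buffer", "write_buffer", "process"],
    ["recv", "read_buffer", "write_buffer", "process"],
]
print(sorted(culprit_pairs(good_logs, bad_logs)))
```

Here the pairs that survive the difference all involve the buffer being read before it is written, which is the kind of "culprit" ordering the contrast is meant to surface.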
How do we find frequent sequences of events?
Main Idea of Sequence Mining
Logged sequence of events: X,A,B,C,A,B,D,A,C,B,D,A,C,B
Assume “frequent” means a pattern occurs at least 3 times
Candidate patterns of length 1: A=4, B=4, C=3, D=2, X=1
Frequent patterns (L1): A=4, B=4, C=3
Candidate patterns of length 2 (L1 x L1): AB=4, AC=3, BA=3, BC=3, CA=2, CB=3, CC=1, AA=2, BB=2
Frequent patterns (L2): AB=4, AC=3, BA=3, BC=3, CB=3
Candidate patterns of length 3 (L2 x L1): ABC, BAC
Frequent patterns of length 3: empty
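The first two levels of this example can be reproduced with a short script. One assumption is needed, since the slides do not state how occurrences are counted: support is taken to be the number of non-overlapping matches found by a greedy left-to-right scan that allows gaps between pattern events. That rule reproduces the length-1 and length-2 counts above; how length-3 candidates are generated is not specified here, so the sketch stops at L2.

```python
# Sketch of level-wise frequent-pattern counting on the example log.
# Assumption: support = number of non-overlapping, gap-tolerant matches found
# by a greedy left-to-right scan (this matches the length-1 and length-2
# counts in the example above).
from itertools import product

def support(log, pattern):
    """Count greedy non-overlapping occurrences of pattern as a subsequence."""
    count, i = 0, 0          # i = index of the next pattern event to match
    for event in log:
        if event == pattern[i]:
            i += 1
            if i == len(pattern):   # full match: count it and start over
                count += 1
                i = 0
    return count

log = "X,A,B,C,A,B,D,A,C,B,D,A,C,B".split(",")
MIN_SUPPORT = 3

# Level 1: every distinct event is a candidate.
candidates_1 = sorted(set(log))
L1 = {e: support(log, [e]) for e in candidates_1 if support(log, [e]) >= MIN_SUPPORT}

# Level 2: candidates are all ordered pairs of frequent events (L1 x L1).
candidates_2 = ["".join(p) for p in product(L1, repeat=2)]
L2 = {p: support(log, p) for p in candidates_2 if support(log, p) >= MIN_SUPPORT}

print(L1)
print(L2)
```

Pruning at each level is what keeps the search tractable: D and X are dropped after level 1, so no length-2 candidate containing them is ever counted.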
How It Works
1. Execute the application n times; the log collection front end produces N log files
2. Data preprocessing middleware applies a high-level “bad” behavior metric (e.g., delay > 5 ms => bad) to label the logs: good logs 1..K and bad logs 1..L
3. The frequent sequence generation algorithm finds the frequent sequences in the set of good logs and the frequent sequences in the set of bad logs
4. Remove common sequences, leaving frequent sequences only in the set of good logs and frequent sequences only in the set of bad logs
Set of Data Collection Front-Ends
Front-End I: Passive Listener
Front-End II: Runtime Logging
Front-End III: Diagnostic Simulation
Extension I: Preventing generation of out-of-order patterns in loops
Extension I contd.
Good Log:
Radio Receive pkt, Signal new data, Write data to kernel buffer, App read pkt from buffer, App Process Pkt, …
Radio Receive pkt, Signal new data, Write data to kernel buffer, App read pkt from buffer, App Process Pkt, …
Bad Log:
Radio Receive pkt, Signal new data, Write data to kernel buffer, App read pkt from buffer, App Process Pkt, …
Radio Receive pkt, Signal new data, App read pkt from buffer, Write data to kernel buffer, App Process Pkt, ** CRASH **
Extension II: A subsequence of a good thing is not always good
Extension II contd.
Good Log:
Enable Radio, Message Sent, Ack Received, Disable Radio, …
Enable Radio, Message Sent, Ack Received, Disable Radio, …
Bad Log:
Enable Radio, Message Sent, Ack Received, Disable Radio, …
Enable Radio, Message Sent, Disable Radio, … ** Crash **
Rule for subsequence elimination:
If a subsequence has the same support as a larger sequence, “eliminate” the subsequence
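The elimination rule can be sketched as a filter over a table of pattern supports. The support numbers below are hypothetical, chosen to mirror the radio example: in good logs the shorter pattern never occurs without the ack, while in bad logs it occurs more often than the full pattern.

```python
# Sketch of the elimination rule: drop a pattern when a longer pattern that
# contains it (as a subsequence) has the same support. Supports below are
# hypothetical numbers chosen to mirror the radio example.

def is_subsequence(short, long):
    """True if `short` can be obtained from `long` by deleting events."""
    it = iter(long)
    return all(event in it for event in short)

def eliminate_subsequences(supports):
    """supports: dict mapping pattern (tuple of events) -> support count."""
    kept = {}
    for pattern, count in supports.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_count == count
            and is_subsequence(pattern, other)
            for other, other_count in supports.items()
        )
        if not absorbed:
            kept[pattern] = count
    return kept

FULL = ("Enable Radio", "Message Sent", "Ack Received", "Disable Radio")
NO_ACK = ("Enable Radio", "Message Sent", "Disable Radio")

good = eliminate_subsequences({FULL: 10, NO_ACK: 10})
bad = eliminate_subsequences({FULL: 6, NO_ACK: 10})
print(sorted(set(bad) - set(good)))
```

Without the rule, the no-ack pattern would be frequent in both sets and would be discarded as common; with it, the pattern survives only in the bad set and is reported.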
Finding infrequent events with frequent side effects?
Extension III: Two-stage mining
How Does Two-stage Mining Work?
Bad Log:
… msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, flashWriteDone, msgSent, ackReceived, msgReceived, ackSent, oldDataDeleted, flashWriteDone, temperatureSensed, msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, newDataEntered, newNeighborFound, msgDropped, msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, Reboot, flashWriteDone, msgReceived, msgDropped, newNeighborFound, msgReceived, msgDropped, msgReceived, msgDropped, msgReceived, msgDropped, msgReceived, msgDropped, msgReceived, msgDropped, …
Good Log:
… msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, flashWriteDone, msgSent, ackReceived, msgReceived, ackSent, oldDataDeleted, flashWriteDone, temperatureSensed, msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, newDataEntered, newNeighborFound, msgDropped, msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, flashWriteDone, temperatureSensed, msgSent, ackReceived, msgSent, ackReceived, msgReceived, ackSent, flashWriteDone, temperatureSensed, newDataEntered, newNeighborFound, msgDropped, …
Case Study II: LiteOS Bug
A simple data collection algorithm
Implemented on MicaZ mote testbed
Failure scenario: Some of the nodes would crash occasionally (at higher loads) and nondeterministically. Different nodes would crash, and at different times.
LiteOS Bug contd.
Recorded events (the attribute list is Null for every event):
Higher-level events: Packet_Received, Packet_Sent
Radio operation events: Get_Current_Radio_Info_Address, Get_Current_Radio_Handle_Address, Disable_Radio_State, Get_Current_Radio_Handle, Get_Radio_Mutex, Get_Radio_Send_Function
Thread operation related events: Get_Current_Thread_Address, Mutex_Unlock_Function, Get_Current_Thread_Index, Post_Thread_Task
Serial operation related events: Get_Serial_Mutex, Get_Current_Serial_Info_Address, Get_Serial_Send_Function
LiteOS Bug contd.
Frequent sequences of events found in Bad Log:
<Context_Switch_to_User_Thread>,<Get_Current_Thread_Address>,<Get_Serial_Send_Function>
<Packet_Received>,<Context_Switch_to_User_Thread>,<Get_Serial_Send_Function>
<Packet_Received>,<Post_Thread_Task>,<Get_Serial_Send_Function>
<Packet_Received>,<Get_Current_Thread_Index>,<Get_Serial_Send_Function>
<Packet_Received>,<Get_Current_Thread_Address>,<Get_Serial_Send_Function>
Frequent sequences of events found in Good Log:
<Packet_Received>,<Packet_Sent>,<Get_Current_Radio_Handle>
<Packet_Received>,<Get_Current_Radio_Handle_Address>,<Get_Current_Radio_Handle>
<Packet_Received>,<Mutex_Unlock_Function>,<Get_Current_Radio_Handle>
<Packet_Received>,<Disable_Radio_State>,<Get_Current_Radio_Handle>
<Packet_Received>,<Post_Thread_Task>,<Get_Current_Radio_Handle>
The missing <Get_Current_Radio_Handle> event highlights the missed registration process
Case Study III: Performance problem with multi-channel MAC protocol
MAC Protocol Overview
A multichannel MAC that clusters tightly communicating nodes together and assigns different clusters separate home channels to reduce interference
Nodes can communicate across clusters by temporarily using the receiver’s home channel
If the receiver is not found, the sender scans channels
Acks are sent from the home channel. Acks are overheard to update the home channel of the originator
Recorded Events (event: attribute list)
Ack_Received: Null
Home_Channel_Changed: oldChannel, newChannel
TimeSyncMsg: referenceTime, localTime
Channel_Update_Msg_Sent: homeChannel
Data_Msg_Sent_On_Same_Channel: destId, homeChannel
Data_Msg_Sent_On_Different_Channel: destId, homeChannel, destChannel
Channel_Update_Msg_Received: homeChannel, neighborId, neighborChannel
Try_Next_Channel: oldChannelTried, nextChannelToTry
No_Ack_Received: Null
Extracted Sequence of Events
<No_Ack_Received>,<Try_Next_Channel>
<Try_Next_Channel>,<No_Ack_Received>
<Data_Msg_Sent_On_Same_Channel: homechannel:0>,<No_Ack_Received>,<Try_Next_Channel>,<Try_Next_Channel: nextchanneltotry:1>,<Try_Next_Channel>,<Try_Next_Channel: oldchanneltried:1>,<No_Ack_Received>
<Data_Msg_Sent_On_Same_Channel: homechannel:0>,<No_Ack_Received>,<Try_Next_Channel>,<Try_Next_Channel: nextchanneltotry:1>,<Try_Next_Channel: nextchanneltotry:2>
<Data_Msg_Sent_On_Same_Channel: homechannel:0>,<No_Ack_Received>,<Try_Next_Channel>,<Try_Next_Channel: nextchanneltotry:1>,<Try_Next_Channel: oldchanneltried:2>,<Try_Next_Channel: nextchanneltotry:3>,<No_Ack_Received>,<Try_Next_Channel: oldchanneltried:3>
MAC Protocol Overview - Revisited
A multichannel MAC that clusters tightly communicating nodes together and assigns different clusters separate home channels to reduce interference
Nodes can communicate across clusters by temporarily using the receiver’s home channel
If the receiver is not found, the sender scans channels
Acks are sent from the home channel. Acks are overheard to update the home channel of the originator
Performance Improvement
[Figure: bar chart of the number of messages (axis from 0 to 350,000) for Successful Send and Successful Receive, comparing multichannel MAC performance with the bug and with the bug fix]
Performance improved by up to 50%
Quick fix: Keep ACK enabled all the time
Next: Symbolic Debugging
Finding symbolic patterns in raw event sequences
Motivating Example
[Figure: Sensor, Aggregator and Forwarder nodes; each sensor sends Msg X to an aggregator, which sends Msg Y to a forwarder]
Pattern 1 (support=1)
1. Sensor 3 (A) sends Msg X to Aggregator 8 (B)
2. Aggregator 8 (B) sends Msg Y to Forwarder 15 (C)
Pattern 2 (support=1)
1. Sensor 2 (A) sends Msg X to Aggregator 5 (B)
2. Aggregator 5 (B) sends Msg Y to Forwarder 8 (C)
Pattern 3 (support=1)
1. Sensor 12 (A) sends Msg X to Aggregator 9 (B)
2. Aggregator 9 (B) sends Msg Y to Forwarder 18 (C)
Symbolic Pattern (support=3)
1. Sensor A sends Msg X to Aggregator B
2. Aggregator B sends Msg Y to Forwarder C
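The effect of symbolization in this example can be sketched in a few lines: concrete node IDs are replaced by symbols in order of first appearance, and patterns that collapse to the same symbolic form pool their support. The tuple encoding of an event used below is a hypothetical representation chosen for this sketch, not the log format of the actual tool.

```python
# Sketch: replace concrete node IDs with symbols in order of first
# appearance, then count how many concrete patterns share each symbolic form.
# The tuple encoding (role, node_id, message, role, node_id) is a
# hypothetical representation chosen for this example.
from collections import Counter

def symbolize(pattern):
    """Map each distinct node ID to A, B, C, ... in order of first use."""
    symbols = {}
    def sym(node_id):
        if node_id not in symbols:
            symbols[node_id] = chr(ord("A") + len(symbols))
        return symbols[node_id]
    return tuple(
        (src_role, sym(src), msg, dst_role, sym(dst))
        for src_role, src, msg, dst_role, dst in pattern
    )

patterns = [
    [("Sensor", 3, "X", "Aggregator", 8), ("Aggregator", 8, "Y", "Forwarder", 15)],
    [("Sensor", 2, "X", "Aggregator", 5), ("Aggregator", 5, "Y", "Forwarder", 8)],
    [("Sensor", 12, "X", "Aggregator", 9), ("Aggregator", 9, "Y", "Forwarder", 18)],
]
symbolic_support = Counter(symbolize(p) for p in patterns)
for pattern, count in symbolic_support.items():
    print(count, pattern)
```

Three patterns that each had support 1 become one symbolic pattern with support 3, which is what lets it cross a frequency threshold that none of the concrete patterns would reach.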
Another Example
Pattern 1
<msgSent, senderId = 1, msgType = 0>,
<msgReceived, receiverId = 2, msgType = 0>
Symbolic pattern 1
<msgSent, senderId = X, msgType = 0>,
<msgReceived, receiverId = neighbor(X), msgType = 0>
Pattern 2
<msgSent, senderId = 3, msgType = 0>,
<msgReceived, receiverId = 5, msgType = 0>
Symbolic pattern 2
<msgSent, senderId = X, msgType = 0>,
<msgReceived, receiverId = neighbor(X), msgType = 0>
How to Generate Symbolic Patterns?
1. Initially, log multi-attribute events
<msgSent, nodeId, msgType, TimeStamp>,<msgReceived, nodeId, msgType, TimeStamp>
2. Find frequent single-attribute patterns
<msgSent, msgType=X>,<msgReceived, msgType=X>
3. Reconstruct multi-attribute events
<msgSent, nodeId=*, msgType=X, TimeStamp=*>,<msgReceived, nodeId=*, msgType=X, TimeStamp=*>
4. Generate symbolic patterns (Relation: Neighbor, Hop-distance, etc.)
<msgSent, nodeId=A, msgType=X, TimeStamp=*>,<msgReceived, nodeId=Relation(A), msgType=X, TimeStamp=*>
5. a. Calculate support for the candidate symbolic pattern
   b. Replace the original pattern with the symbolic pattern if support is “similar”
Case Study IV
Directed Diffusion
Observed problem: Occasionally the base station stops receiving packets after a while, although the source node was generating data
Directed Diffusion Protocol Overview
[Figure: nodes 1, 2 and 3 in a chain, with Interest X and Gradient labels on the links between neighboring nodes]
Node 1 interest cache: Interest Id X, forward data to: None
Node 2 interest cache: Interest Id X, forward data to: 1
Node 3 interest cache: Interest Id X, forward data to: 2
Interest cache keeps track of which way to forward received data
Data cache keeps track of last received packet
Received data are dropped if no matching interest is found for the received data
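The forwarding rule in this overview can be sketched as follows. This is an illustration only, not the Directed Diffusion implementation: the class and function names are hypothetical, and treating "forward data to: None" as "this node is the final recipient" is an assumption.

```python
# Sketch of the forwarding rule described above: a node forwards data along
# the gradient stored in its interest cache and drops data for which it has
# no matching interest. Class and method names are hypothetical.

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.interest_cache = {}   # interest id -> next hop (None = deliver here)
        self.data_cache = {}       # interest id -> last received packet

    def add_interest(self, interest_id, next_hop):
        self.interest_cache[interest_id] = next_hop

    def receive(self, interest_id, packet):
        """Return "dropped", "delivered", or the next hop's node id."""
        if interest_id not in self.interest_cache:
            return "dropped"                     # no matching interest
        self.data_cache[interest_id] = packet    # remember last packet
        next_hop = self.interest_cache[interest_id]
        return "delivered" if next_hop is None else next_hop

nodes = {i: Node(i) for i in (1, 2, 3)}
nodes[1].add_interest("X", None)   # node 1: forward data to None
nodes[2].add_interest("X", 1)      # node 2: forward data to 1
nodes[3].add_interest("X", 2)      # node 3: forward data to 2

def route(start, interest_id, packet):
    """Follow gradients from `start`; return (hops taken, final outcome)."""
    hops, current = [], start
    while True:
        hops.append(current)
        result = nodes[current].receive(interest_id, packet)
        if result in ("dropped", "delivered"):
            return hops, result
        current = result

print(route(3, "X", "reading-1"))
print(route(3, "Y", "reading-2"))
```

The second call shows the drop rule: if any node on the path loses its interest cache entry, data for that interest silently stops flowing, which is consistent with the symptom described in the case study.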
Conclusions
We need to analyze distributed sequences of events to identify causes of anomaly
Discriminative sequence mining is a promising choice for debugging interactive bugs and corner-case protocol bugs
Easy to use
References
Classification-based diagnostics:
Maifi Khan, Tarek Abdelzaher, Liqian Luo, "SNTS: Sensor Network Troubleshooting Suite," DCOSS, Santa Fe, New Mexico, June 2007.
Sequence-mining-based diagnostics:
Mohammad Khan, Tarek Abdelzaher and Kamal Gupta, "Towards Diagnostic Simulation in Sensor Networks," DCOSS, Santorini, Greece, June 2008.
Mohammad Khan, Hieu Khac Le, Hossein Ahmadi, Tarek Abdelzaher, Jiawei Han, "DustMiner: Troubleshooting Interactive Complexity Bugs in Sensor Networks," SenSys, Raleigh, NC, November 2008.
Symbolic mining:
Mohammad Khan, Tarek Abdelzaher, Jiawei Han and Hossein Ahmadi, "Finding Symbolic Bug Patterns in Sensor Networks," International Conference on Distributed Computing in Sensor Systems (DCOSS), Marina Del Rey, CA, June 2009.
Part III: Social Interactions (Human-centric CPS)
Tarek Abdelzaher, University of Illinois at Urbana-Champaign
A Future Cyber-Physical System might look something like this …
Human-centric CPS: Empirical Evidence
http://www.sensatex.com
Nike-iPod
Wii
Spot
The master controller
Technology introduced by 2007
Business Case? Health and Wellness
http://www.sensatex.com, Nike-iPod, Wii, Spot, the master controller, HealthVault
Business Case? Sports and Entertainment
http://www.sensatex.com, Nike-iPod, Wii, Spot, the master controller
Business Case? Multiplayer Games
http://www.sensatex.com, Nike-iPod, Wii, Spot, the master controller
Community Sensing Applications: CarTel (Sam Madden)
Reprinted from http://cartel.csail.mit.edu/overview.html
• An ad hoc network of vehicles with sensors
• Can measure road congestion
• Can generate maps of road conditions, etc.
• Given an uncontrolled mobility model, how to perform global data operations (query dissemination, collection, etc.)?
Emergence of a Web of Sensors
WWW: a gathering place around topic of mutual interest
Future Sensor Web: a gathering place around mutually interesting data pools (and derived info)
Feng Zhao: MSR Sensor Map
Dave Clark: The future Internet will link more sensors and embedded devices than traditional hosts
Van Jacobson: Named-data networking paradigm (we use the Internet as an information source, not a communication medium)
Community Sensing
Individuals with sensing devices collect sensor data
Shared data used towards a common goal
Internet
Enabling Community Sensing
The privacy dilemma:
  Individuals do not want to share private data (e.g., GPS trajectories)
  Useful applications can be built if enough data is shared (e.g., real-time traffic maps)
Two types of solutions:
  Anonymity: reveal data, hide owner
  Data alteration: reveal owner, hide data
Data Sharing: A Privacy Problem Formulation
Users have time-series sensor data
  Examples: weight of a user on a diet measured once a day, speed of a user measured periodically on a given route
Goal: Let users “lie” about their data, yet allow computing an accurate distribution over the community at any point
[Figure: data curves of User 1, User 2, User 3, …, User N]
An Example
Dieters want to share weight information to find the efficacy of a given diet, without revealing their true weight, average, trend (loss or gain of weight), etc.
Perturb data? Add Noise?
Weight curve perturbed by adding independent random noise
Estimation using PCA to breach privacy of user
Add Noise and Random Offset?
Weight curve perturbed by adding independent random noise and a random offset
Estimation using PCA to estimate the data of the user
Challenge
Develop perturbation that preserves privacy of individuals
Cannot infer individuals’ data without large error
Reconstruction of community distribution can be achieved within proven accuracy bounds
Perturbation can be applied by non-expert users
Intuitive Approach
Add virtual user curve to real curve
[Figure: real user curve, virtual user curve, and the resulting perturbed data curve]
Traffic Analyzer
Users share perturbed speed data with aggregation server
Server combines perturbed speed data and uses deconvolution with noise model to compute original speed distribution
Garmin GPS used for data collection
Results are from real data collection in Urbana-Champaign in 2008
[Figure: map showing the Dept. of Computer Science and the roads for which we want to estimate average speed]
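The deconvolution step can be illustrated with a small sketch. It assumes speeds are binned into discrete values and that each user adds independent noise drawn from a distribution known to the server, so the distribution of perturbed values is the convolution of the true community distribution with the noise distribution. The function names, bin counts, and probabilities below are hypothetical, and least squares stands in for whatever estimator the actual system uses.

```python
import numpy as np

def convolution_matrix(noise_pmf, n_values):
    """Build A so that A @ f_x is the pmf of (true value + noise)."""
    m = len(noise_pmf)
    A = np.zeros((n_values + m - 1, n_values))
    for j in range(n_values):
        A[j:j + m, j] = noise_pmf
    return A

def reconstruct_distribution(perturbed_pmf, noise_pmf, n_values):
    """Estimate the community pmf from the perturbed pmf by least squares."""
    A = convolution_matrix(noise_pmf, n_values)
    f_x, *_ = np.linalg.lstsq(A, perturbed_pmf, rcond=None)
    f_x = np.clip(f_x, 0.0, None)  # probabilities cannot be negative
    return f_x / f_x.sum()

# Hypothetical community speed distribution over 5 speed bins
true_pmf = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
# Hypothetical noise model shared with the server
noise_pmf = np.array([0.25, 0.5, 0.25])

# What the server would observe with infinitely many users
perturbed_pmf = np.convolve(true_pmf, noise_pmf)
estimate = reconstruct_distribution(perturbed_pmf, noise_pmf, len(true_pmf))
print(np.round(estimate, 3))
```

With a finite number of users the observed histogram is only an approximation of the convolution, so the estimate carries an error that shrinks as more users contribute.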
Perturbing Speed
Reconstruction of Average Speed
Reconstruction of Community Speed Distribution
Real community distribution of speed
Reconstructed community distribution of speed
Perturbing Speed and Location
Clients lie about both location and speed
Reconstruction Accuracy
Real versus reconstructed speed
Real community distribution of speed
Reconstructed community distribution of speed
More on Reconstruction Accuracy
Real versus reconstructed speed on Washington St., Champaign
Real community distribution of speed
Reconstructed community distribution of speed
How Many are Speeding?
Street              Real % Speeding   Estimated % Speeding
University Ave      15.6%             17.8%
Neil Street         21.4%             23.7%
Washington Street   0.5%              0.15%
Elm Street          6.9%              8.6%
Real versus estimated percentage of speeding vehicles on different streets (from data of users who “lie” about both speed and location)
Privacy and Optimal Perturbation
Is there an optimal perturbation scheme?
What is the measure of privacy?
How can we generate the optimal perturbation?
Privacy Measure
We use the mutual information I(X;Y) to measure the information about X contained in Y
Minimal information leak under noise power constraint
Upper Bound on Privacy
Lemma (Ihara, 78)
The noise that minimizes the upper bound on information leak is a Gaussian noise
[Equation labels: covariance of signal, covariance of noise, mutual information (leak)]
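To make the leak measure concrete, the sketch below evaluates the standard closed form for the case where both the signal X and the additive noise N are independent zero-mean Gaussians and Y = X + N, namely I(X;Y) = ½ log( det(Σx + Σn) / det(Σn) ). This is a textbook identity used here for illustration; the exact expression on the original slide was not recoverable, and the covariance values are hypothetical.

```python
import numpy as np

def gaussian_leak_bits(cov_signal, cov_noise):
    """I(X;Y) in bits for Y = X + N with independent Gaussian X and N."""
    _, logdet_y = np.linalg.slogdet(cov_signal + cov_noise)
    _, logdet_n = np.linalg.slogdet(cov_noise)
    return 0.5 * (logdet_y - logdet_n) / np.log(2.0)

# Hypothetical covariance of a two-sample data stream
cov_x = np.array([[4.0, 1.0],
                  [1.0, 2.0]])

# More noise power means less information about X leaks through Y
leak_weak = gaussian_leak_bits(cov_x, 0.5 * np.eye(2))
leak_strong = gaussian_leak_bits(cov_x, 8.0 * np.eye(2))
print(leak_weak, leak_strong)
```

The noise covariance is the design variable: for a fixed noise power, different shapes of the covariance matrix leak different amounts, which is what the search for an optimal noise exploits.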
Finding the Optimal Noise
Solving for the optimal noise’s covariance matrix
Optimal Noise
The noise generation method can be seen as the optimal allocation of noise energy in the frequency domain
Utility vs. Privacy Trade-off
References
Privacy for single data streams:
Raghu Ganti, Nam Pham, Yu-En Tsai, Tarek Abdelzaher, “PoolView: Stream Privacy for Grassroots Participatory Sensing,” Sensys, Raleigh, NC, November 2008.
Privacy for multidimensional data:
Nam Pham, Raghu Ganti, Md. Yusuf Uddin, Suman Nath, Tarek Abdelzaher, “Privacy-Preserving Reconstruction of Multidimensional Data Maps in Vehicular Participatory Sensing,” European Conference on Wireless Sensor Networks (EWSN), Coimbra, Portugal, February 2010.
Optimality and bounds on privacy:
Nam Pham, Tarek Abdelzaher, Suman Nath, “On Bounding Data Stream Privacy in Distributed Cyber-physical Systems,” IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (IEEE SUTC), Newport Beach, CA, June 2010. (Invited)
Community Sensing
Individuals with sensing devices collect sensor data
Shared data used towards a common goal
Internet
Information Extraction
Two types of community sensing:
Sensing to compute aggregate statistics (e.g., traffic speed)
  Results pertain only to the locales where data was collected
  Example: average speed on one street does not necessarily help predict speed on another street
Sensing to compute generalizable models that can be used for prediction
  Results from sparsely sampled data apply to a broader context
The Participant Data Modeling Challenge
A phenomenon is sampled by participants in spatial and temporal dimensions
Sampling is sparse (at least in conditions of partial adoption)
The phenomenon is high-dimensional
Question: how to generalize models obtained from the limited samples to cover the high-dimensional phenomenon space?
Example: Fuel Efficient Routing with GreenGPS
Individuals contribute fuel consumption values of their cars on various streets at different times of the day (sampling of a high-dimensional phenomenon)
Not too many cars are doing this (sparse sampling, partial deployment)
Fuel consumption depends on attributes of cars, streets, drivers, etc. (high-dimensional phenomenon)
Service computes general model for predicting the most fuel efficient route for an arbitrary vehicle between arbitrary source and destination points
Saves 6% over shortest path
Saves 13% over fastest path
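Once a model predicts fuel use per street segment for a given vehicle, finding the most fuel-efficient route becomes a shortest-path problem with fuel as the edge weight. The sketch below uses Dijkstra's algorithm for that step; the slides do not say which routing algorithm GreenGPS uses, and the street graph and fuel figures are hypothetical.

```python
import heapq

def most_fuel_efficient_route(graph, source, destination):
    """Dijkstra's algorithm over edges weighted by predicted fuel use."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        fuel, node = heapq.heappop(queue)
        if node == destination:
            break
        if fuel > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, edge_fuel in graph.get(node, {}).items():
            candidate = fuel + edge_fuel
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if destination not in best:
        return None, float("inf")
    # Walk the predecessor links back from the destination
    path = [destination]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]

# Hypothetical predicted fuel use (litres) per street segment for one car
graph = {
    "home": {"a": 0.30, "b": 0.20},
    "a": {"work": 0.25},
    "b": {"c": 0.20},
    "c": {"work": 0.10},
}
route, fuel = most_fuel_efficient_route(graph, "home", "work")
print(route, fuel)
```

In this toy graph the route with more segments wins because its predicted fuel total is lower, which mirrors the point that the most fuel-efficient route need not be the shortest or the fastest.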
Preliminary Participatory Sensing Deployment
DashDyno: OBD scanner with GPS to collect location-tagged car sensor data
16 different compact and mid-sized sedans (e.g., Ford, Toyota, Honda)
Over 1000 miles of data collected
Users record sensor data and GPS on SD card and upload to our service
[Figures: DashDyno device; coverage map]
Sampling Regression Modeling Framework
Fuel consumption of 16 cars driven on a few roads
Predict fuel consumption of any car on any road
Fuel Consumption Model
Simple model for fuel consumption derived from physics principles
Approximate based on easily measurable parameters (e.g., stop signs, speed limits)
Generalization and Modeling
Regression modeling:
  Problem: one size does not fit all. Who says that Fords and Toyotas have the same regression model?
Regression model per car?
  Problem: Cannot use data collected by some cars to predict fuel consumption of others.
Challenge: Must jointly determine both (i) regression models and (ii) their scope of applicability, to cover the whole data space with acceptable modeling error.
Generalization Hierarchy
Generalize data by some common attribute(s)?
Example of Regression Cubes
Goal: predict fuel consumption
Group by make, model, or year
Regression Cubes
Data cells correspond to:
  Output attributes Yc = {yi}
  Each associated with k input attributes xi1, …, xik, Xc = {xij}
Data cells store the following measures:
  Regression coefficients:
  Regression modeling error:
The Challenge of Regression Cubes
Main challenge: compute cuboid measures, the model and error, recursively (without reprocessing raw data)
Model parameters and estimation error at cell c
Not distributive
Compressed Representation
Compressed representation of a cell c:
  a scalar value
  a vector of size k
  a k by k matrix
  nc: number of samples
These matrices are distributive measures
Compressible Measures
Model coefficients:
Error:
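The idea of distributive, compressible measures can be sketched as follows. The symbols on the slide were lost in extraction, so this sketch assumes the scalar, the size-k vector, and the k by k matrix are the usual least-squares sufficient statistics YᵀY, XᵀY, and XᵀX. Each of them is a sum over samples, so a parent cell's statistics are obtained by adding those of its children, and the model and its error follow without touching raw data. All names and data below are hypothetical.

```python
import numpy as np

def compress(X, y):
    """Summarize a cell's raw samples by statistics that add up across cells."""
    return {"yy": float(y @ y), "xy": X.T @ y, "xx": X.T @ X, "n": len(y)}

def merge(cell_a, cell_b):
    """Roll two cells up into their parent by summing each statistic."""
    return {key: cell_a[key] + cell_b[key] for key in cell_a}

def fit(cell):
    """Least-squares coefficients and mean squared error from the summary alone."""
    theta = np.linalg.solve(cell["xx"], cell["xy"])
    sse = cell["yy"] - theta @ cell["xy"]  # valid at the least-squares solution
    return theta, sse / cell["n"]

# Two hypothetical cells (e.g., two car models) with k = 3 input attributes
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(30, 3)), rng.normal(size=(40, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y1 = X1 @ true_theta + 0.1 * rng.normal(size=30)
y2 = X2 @ true_theta + 0.1 * rng.normal(size=40)

# Parent cell computed from the children's summaries only
theta_merged, mse_merged = fit(merge(compress(X1, y1), compress(X2, y2)))

# Reference: refit on the pooled raw data
theta_raw, *_ = np.linalg.lstsq(
    np.vstack([X1, X2]), np.concatenate([y1, y2]), rcond=None
)
print(theta_merged, theta_raw, mse_merged)
```

The coefficients themselves are not distributive (the parent's coefficients are not a simple function of the children's coefficients), which is why the cube stores the summed statistics instead.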
Example of Regression Cubes
Goal: predict fuel consumption
Group by make, model, or year
Model and modeling error are efficiently computed for each possible generalization.
Model Reduction
Independently find a subset of attributes for each cell, such that:
  The cell is reliable
  Corresponding error is minimized
Exponential number of possible subsets
Our heuristic:
Attributes: Velocity (v), Mass (m), Frontal area (A), Stop signs (S)
Attribute set   Error   Reliable
L = {v}         0.031   yes
L = {m}         0.152   yes
L = {A}         0.043   yes
L = {S}         0.056   yes
Model Reduction
Attribute set   Error   Reliable
L = {v, m}      0.021   no
L = {v, A}      0.030   yes
L = {v, S}      0.028   yes
Previous level: L = {v}, L = {m}, L = {A}, L = {S}
Model Reduction
Attribute set    Error   Reliable
L = {v, S, m}    0.024   no
L = {v, S, A}    0.026   no
Previous levels: L = {v, m}, L = {v, A}, L = {v, S}; L = {v}, L = {m}, L = {A}, L = {S}
Model Reduction
Levels explored: L = {v}, {m}, {A}, {S}; L = {v, m}, {v, A}, {v, S}; L = {v, S, m}, {v, S, A}
Reduced Model: {v, S}
Attributes: Velocity (v), Mass (m), Frontal area (A), Stop signs (S)
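The sequence of slides above reads like a greedy forward selection, which can be sketched as follows. Starting from the empty set, the search keeps the lowest-error reliable attribute set at each level and tries to extend it by one attribute, stopping when no extension is reliable. The requirement that an extension must also lower the error is an assumption of this sketch, as is the function name; the error and reliability figures are the ones shown on the slides.

```python
def reduce_model(attributes, evaluate):
    """Greedy forward selection: grow the attribute set one attribute at a time."""
    selected, best_error = frozenset(), float("inf")
    while True:
        candidates = []
        for attribute in attributes:
            if attribute in selected:
                continue
            subset = selected | {attribute}
            error, reliable = evaluate(subset)
            # Assumption: only reliable subsets that lower the error qualify
            if reliable and error < best_error:
                candidates.append((error, sorted(subset)))
        if not candidates:
            return set(selected), best_error
        best_error, best_subset = min(candidates)
        selected = frozenset(best_subset)

# (error, reliable) per attribute set, as shown on the slides
SLIDE_RESULTS = {
    frozenset({"v"}): (0.031, True),
    frozenset({"m"}): (0.152, True),
    frozenset({"A"}): (0.043, True),
    frozenset({"S"}): (0.056, True),
    frozenset({"v", "m"}): (0.021, False),
    frozenset({"v", "A"}): (0.030, True),
    frozenset({"v", "S"}): (0.028, True),
    frozenset({"v", "S", "m"}): (0.024, False),
    frozenset({"v", "S", "A"}): (0.026, False),
}

reduced, error = reduce_model(
    ["v", "m", "A", "S"], lambda subset: SLIDE_RESULTS[frozenset(subset)]
)
print(reduced, error)
```

The search evaluates 9 subsets here instead of all 15 non-empty ones, which is the point of the heuristic given the exponential number of possible subsets.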
Accuracy Results
The sampling regression cube improves prediction accuracy significantly
Sparse sampling challenge: A regression cube without model reduction is worse than a single “one-size-fits-all” model!
Model Performance
All driven paths are split into smaller segments to capture variations in fuel consumption on individual streets
Segment 1 Segment 2 Segment 3
Long Path Error
Reduction in cumulative error with increasing path length
Fuel Savings Evaluation
Experiment: Given the shortest and fastest routes, GreenGPS predicts the best route.
Driver drives both routes repeatedly and compares average fuel consumption of the two.
Car Details          Landmarks          Route      Savings %
Honda Accord 2001    H1 to Mall         Shortest   31.4
Honda Accord 2001    H1 to Gym          Shortest   19.7
Ford Taurus 2001     H2 to Restaurant   Shortest   26
Toyota Celica 2001   H2 to Work         Fastest    10.1
Nissan Sentra 2009   H3 to CUPHD        Fastest    8.4
Honda Civic 2002     Grad to Work       Fastest    18.7
Conclusions
An emerging sensor network paradigm is one where sensing is done by volunteers interested in the data to provide a service
The talk covered sample challenges:
  Privacy: “Interested in the service but cannot share my data”
  Sparse sampling: In conditions of sparse deployment, must generalize to cover high-dimensional spaces
Interdisciplinary solutions are needed from information theory and data mining
More to Do…
Many future challenges remain
Architecture and foundations for networks as information sources as opposed to networks as data carriers
Understanding the interplay between social, physical and information networks
Closing the loop and the human factor: how to account for humans in the loop (e.g., humans responding to GreenGPS instructions)
More privacy: what can and cannot be inferred from my data?
Usability and trust:
  What technical features inspire human trust in the data collection system?
  How to ensure trustworthiness of data collection results?
Provenance: Where did the data come from?
A repository of data sets?
References
Raghu Ganti, Nam Pham, Hossein Ahmadi, Saurabh Nangia, Tarek Abdelzaher, “GreenGPS: A Participatory Sensing Fuel-Efficient Maps Application,” Mobisys, San Francisco, CA, June 2010.
Stability Challenges
Additional Optional material
Why Does Software Have Dynamics?
Output of component A depends on past output of B (delay)
Output of component A depends on integral of B (e.g., queue level is an integral of difference in request rates)
Filters (averaging, moving average)
Sampling
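Two of these sources of dynamics can be seen in a few lines of simulation. The sketch below models a queue whose level is the running sum (a discrete integral) of arrival rate minus service rate, and a moving-average filter whose output depends on past inputs. The function names and the request rates are hypothetical.

```python
def simulate_queue(arrivals, services):
    """Queue level as the running sum of arrival rate minus service rate."""
    level, levels = 0.0, []
    for arrived, served in zip(arrivals, services):
        level = max(0.0, level + arrived - served)  # a queue cannot go negative
        levels.append(level)
    return levels

def moving_average(samples, window):
    """Average of the last `window` samples, so the output lags the input."""
    averaged = []
    for i in range(len(samples)):
        recent = samples[max(0, i - window + 1): i + 1]
        averaged.append(sum(recent) / len(recent))
    return averaged

# Hypothetical request rates per interval: a burst of 8 between periods of 5
arrivals = [5, 5, 8, 8, 8, 5, 5, 5]
services = [6, 6, 6, 6, 6, 6, 6, 6]

levels = simulate_queue(arrivals, services)
smoothed = moving_average(arrivals, window=4)
print(levels)    # the backlog persists after the burst has ended
print(smoothed)  # the filtered rate is still elevated after the burst
```

In both cases the current output reflects history rather than the current input alone, which is what gives a software component dynamics in the control-theoretic sense.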
Composition of Self-regulatingComponents
Challenge:
Larger systems composed of a larger number of components
Modularity and separation of concerns: components are designed independently
Autonomy: self-regulating behavior in many components
Composition of self-regulating components can have unexpected adverse interactions
How to build well-behaved systems from large numbers of self-regulating components while designing in a modular fashion?
Composition of Self-regulating Components
[Figure: three interacting feedback loops, Subsystem I, Subsystem II, and Subsystem III, containing components A, B, and C]
Example: Energy Management in Data Centers
Energy expended on:
Computing (powering up racks of machines)
  Sensors: Machine utilization, Delay, Throughput, …
  Actuators: DVS, turning machines On/Off
Cooling
  Sensors: Temperature, air flow, …
  Actuators: Air-conditioning units, fans, …
Energy bill is 40-50% of total profit
Energy control:
  Cooling energy optimization and computing energy optimization are done separately
Challenge:
  Coordination/decoupling of multiple “control loops”
  Uncoordinated manipulation of multiple knobs can lead to instability or poor efficiency
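A toy simulation shows how uncoordinated loops can destabilize each other. It assumes several integral controllers that each adjust their own knob to drive the same shared output toward a setpoint, unaware of the others. This is an illustrative model only, not the data-center model from the talk, and the gains are hypothetical.

```python
def simulate(gains, setpoint=1.0, steps=30):
    """Several integral controllers independently drive one shared output."""
    controls = [0.0] * len(gains)
    errors = []
    for _ in range(steps):
        output = sum(controls)  # the knobs add up in their effect
        error = setpoint - output
        errors.append(abs(error))
        # Each controller reacts to the full error as if it were acting alone
        controls = [u + k * error for u, k in zip(controls, gains)]
    return errors

alone = simulate([1.2])          # error shrinks by a factor of 0.2 per step
together = simulate([1.2, 1.2])  # error grows by a factor of 1.4 per step
print(alone[-1], together[-1])
```

Each controller is stable in isolation because the error is multiplied by (1 - 1.2) = -0.2 every step. Running two of them together doubles the effective loop gain, the multiplier becomes (1 - 2.4) = -1.4, and the error oscillates with growing amplitude.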
Example: Energy Management in Data Centers
[Figure: example comparing DVS + On/Off, DVS alone, On/Off alone, and the optimal joint policy]