Power Aware Distributed Systems
PAC/C PI Meeting, November 1 - 3, 2000
USC Information Sciences Institute: Brian Schott, Bob Parker
Rockwell Science Center: Charles Chien
UCLA: Mani Srivastava
University of California, Irvine: Rajesh Gupta
Power Aware Distributed Systems
Impact
  Power-aware algorithms, sensor node RTOS, and middleware will reduce sensor network aggregate energy requirements >1000X.
  This capability will extend sensor network power dynamic range to span from prolonged (months) quiescent operation to “get me the information now at any cost”.
  Power instrumentation of existing low-power sensor node provides baseline by which PAC/C tools and technology will be measured.
Goals
  Algorithms. Develop power-aware algorithms for cooperative signal processing that exploit sensor data locality, multi-resolution processing, sensor fusion, and accumulated intelligence.
  Protocols. Design a distributed sensor network control middleware for power-aware (P-A) task distribution and hardware/software resource utilization migration.
  Compilers/OS. Create sensor node RTOS to manage key resources: processor, radio, sensors.
  Systems. Identify hardware power control knobs and readable parameters and make them available to the sensor node power-aware RTOS.
Milestones [FY/Q]
  P-A RTOS scheduling on research platform [01/Q1].
  Instrumentation board for research platform [01/Q1].
  Compressed image transmission (Laplacian Pyramid) [01/Q1].
  SensorSim simulation tool with P-A extensions [01/Q4].
  Tool for power-aware RTOS kernel synthesis [02/Q4].
  Deployable platform with P-A control “knobs” [02/Q4].
  P-A network resource allocation DP field demo [03/Q2].
  RP w/ sensor-triggered activation & low power sleep [03/Q3].
  High-res multi-look image classification demo [03/Q4].
Extending dynamic power range for distributed sensor networks.
Sensor Node Hardware Control Knobs and Power Aware RTOS
Cooperative Signal Processing
Sensor Network Middleware
Sensor Network Baseline
Instrument a state-of-the-art sensor node to understand and baseline power consumption in current sensor systems.
Rockwell WINS is modular: Power Board, StrongARM Board, Radio Board, Sensor Board.
WINS is representative of other sensor nodes in the community.
We plan to adapt this node to allow module-level power instrumentation and logging both in the lab and in the field.
Power Instrumentation
Insert a power isolation board between each module.
Signals are passed through, power supplies are isolated.
Microcontroller provides power monitoring and power control from a host’s serial port (workstation, laptop or iPAQ).
Event “snooping” may be possible to trigger data acquisition in the field.
[Figure: WINS module stack with a PADS Power Isolator for each of the StrongARM, Radio, and Sensor modules, plus the Battery Pack.]
Application Scenario
SensIT Application Scenario: vehicle detection / tracking using acoustic, seismic, and I/R sensors. SensIT metrics include latency, accuracy, false alarm rate, etc. PAC/C adds sensor net lifetime, coverage area, and energy.
SensIT Experiments
SensIT SITEX00 experiment completed in August 2000 at Twentynine Palms, CA.
RSC/UCLA/ISI/VT experimented with sensor net GUIs and sensor network coverage algorithms.
The next opportunity is the SensIT experiment in March 2001. PADS plans to power-instrument some WINS nodes as a secondary experiment at this exercise.
Use our power data and BBN ground truth to define baseline.
PADS will use SensIT and other Rockwell exercises to field power-instrumented baseline and test best-of-breed PAC/C techniques and technology.
http://www.dsic-web.net/ito/meetings/sensit2000oct/presentations/SITEX2000Review.pdf
PADS Research Platform
Identify hardware knobs that can be provided by modules (radio and processor systems) and that can be altered dynamically.
Identify externally readable parameters (power, BER, signal strength, battery, etc.) that can be provided to a power-aware runtime system.
Simplify integration of advanced PAC/C technology into an open sensor network platform and evaluate this technology against measured baseline.
Examine system-level aspects of existing sensor nodes. Note that most nodes are CPU-centric in that the radio, GPS, and sensors are wired up to serial ports, or to the system bus, of an embedded processor.
The dilemma is that the CPU must wake up on any event in the sensor node.
Is there another approach which allows most of the sensor node to be turned off most of the time?
Distributed Sensor Node Approach
Make each module an independent actor on a multi-master serial bus such as I2C (400Kb, 4Mb*).
87C554 Microcontroller: 16 mA Active, 4 mA Idle, 50 uA Shutdown.
Create common command set for peer to peer communication and control of modules.
Localize specific processing as close to modules as possible (perform energy threshold on seismic board, etc.).
A StrongARM may be used for application control and data processing, but could distribute “event handlers” to local microcontrollers and power down most of the time.
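As a rough illustration of why a mostly-off module controller is attractive, the sketch below computes a time-weighted average current from the three 87C554 figures quoted above. The duty-cycle fractions are hypothetical values chosen only for illustration; they are not measurements from this project.

```python
# Current draw of the 87C554 microcontroller as quoted on the slide (mA).
ACTIVE_MA = 16.0
IDLE_MA = 4.0
SHUTDOWN_MA = 0.050  # 50 uA


def average_current_ma(active_frac: float, idle_frac: float) -> float:
    """Time-weighted average current; the remaining time is spent in shutdown."""
    shutdown_frac = 1.0 - active_frac - idle_frac
    if active_frac < 0.0 or idle_frac < 0.0 or shutdown_frac < -1e-9:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    shutdown_frac = max(shutdown_frac, 0.0)  # absorb floating-point round-off
    return (active_frac * ACTIVE_MA
            + idle_frac * IDLE_MA
            + shutdown_frac * SHUTDOWN_MA)


if __name__ == "__main__":
    # Hypothetical duty cycles (active fraction, idle fraction).
    for active, idle in [(1.0, 0.0), (0.10, 0.0), (0.01, 0.0), (0.01, 0.09)]:
        print(f"active={active:.2f} idle={idle:.2f} -> "
              f"{average_current_ma(active, idle):.4f} mA")
```

Under these assumed duty cycles the average current is dominated by the active fraction, which is the motivation for keeping event handling local and short.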
I2C + Power
Research Platform Technology Integration / Emulation
Distributed node architecture makes it much easier to integrate PAC/C modules that don’t fit.
Most existing sensor modules and systems have a serial port.
Form factor is not an issue for initial laboratory experiments.
Enables simple module emulation and module testing from a workstation or laptop.
Power control and power monitoring can be incorporated into the bridge board. Basically the same design as the WINS power isolator boards!
[Figure: I2C bus with Serial Port Bridges connecting an Experimental V-scaling StrongARM Board, an Emulated Sensor, and a WINS Node.]
Power Analysis of Rockwell WINS Nodes (Measurements)
Processor  Seismic Sensor  Radio            Power (mW)
Active     On              Rx               751.6
Active     On              Idle             727.5
Active     On              Sleep            416.3
Active     On              Removed          383.3
Active     Removed         Removed          360.0
Active     On              Tx (36.3 mW)     1080.5
Active     On              Tx (27.5 mW)     1033.3
Active     On              Tx (19.1 mW)     986.0
Active     On              Tx (13.8 mW)     942.6
Active     On              Tx (10.0 mW)     910.9
Active     On              Tx (3.47 mW)     815.5
Active     On              Tx (2.51 mW)     807.5
Active     On              Tx (1.78 mW)     799.5
Active     On              Tx (1.32 mW)     791.5
Active     On              Tx (0.955 mW)    787.5
Active     On              Tx (0.437 mW)    775.5
Active     On              Tx (0.302 mW)    773.9
Active     On              Tx (0.229 mW)    772.7
Active     On              Tx (0.158 mW)    771.5
Active     On              Tx (0.117 mW)    771.1
Summary
Processor = 360 mW doing repeated transmit/receive
Sensor = 23 mW
Processor : Tx = 1 : 2
Processor : Rx = 1 : 1
Total Tx : Rx = 4 : 3 at maximum range
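The summary figures can be recovered from the measurement table by subtraction. The sketch below is one way to do that arithmetic; the dictionary keys and the `breakdown` helper are names introduced here for illustration, and the only data used are the six table rows it copies.

```python
# Measured total node power in mW, copied from the measurement table.
# Keys are (processor, seismic sensor, radio) states.
POWER_MW = {
    ("active", "on", "rx"): 751.6,
    ("active", "on", "idle"): 727.5,
    ("active", "on", "sleep"): 416.3,
    ("active", "on", "removed"): 383.3,
    ("active", "removed", "removed"): 360.0,
    ("active", "on", "tx_max"): 1080.5,  # Tx at 36.3 mW radiated power
}


def breakdown(power=POWER_MW):
    """Attribute power to each module by differencing table rows."""
    processor = power[("active", "removed", "removed")]
    no_radio = power[("active", "on", "removed")]
    return {
        "processor": processor,
        "sensor": no_radio - processor,
        "radio_rx": power[("active", "on", "rx")] - no_radio,
        "radio_tx_max": power[("active", "on", "tx_max")] - no_radio,
    }


if __name__ == "__main__":
    b = breakdown()
    for name, mw in b.items():
        print(f"{name:13s} {mw:7.1f} mW")
    print(f"Tx : processor = {b['radio_tx_max'] / b['processor']:.2f}")
    print(f"Rx : processor = {b['radio_rx'] / b['processor']:.2f}")
    total_ratio = (POWER_MW[("active", "on", "tx_max")]
                   / POWER_MW[("active", "on", "rx")])
    print(f"total Tx : total Rx = {total_ratio:.2f}")
```

The differences give roughly 23 mW for the sensor, about 368 mW for the radio in receive, and about 697 mW for the radio at maximum transmit power, consistent with the 1 : 1, 1 : 2, and 4 : 3 ratios listed in the summary.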
Power-aware Multihop Packet Forwarding Architecture
Problem: radio often simply relays packets in a multihop network.
Traditional approach: main CPU is woken up and packets are sent to it across the serial bus, which requires power-hungry computing and communication operations.
Our approach: exploit the programmable micro-controller in the Communication Subsystem to handle common cases of packet routing. It can also do operations such as combining packets with redundant information.
Key challenge: how to do it so that every new routing protocol will not require new radio firmware.
Solution: application-defined all-layer packet routing.
[Figure: Traditional Approach vs. Our Approach. Both panels show a Communication Subsystem (Radio Modem, GPS, MicroController) and the Rest of the Node (CPU, Sensor) handling a Multihop Packet. In the Our Approach panel the rest of the node is marked asleep ("…zZZ").]
Application-defined All-layer Packet Routing
Packet-classifier and packet-modifier driven by application defined matching rules and actions
  Matching rules: and/or expressions using =, <, >, range operators on arbitrary packet fields (offset, length)
  Actions: accept, forward, drop, field increment/decrement etc.
Rules and actions operate on arbitrary packet fields (any layer)
  fields specified as (offset, length)
  only simple, common cases handled at the radio
  for complex cases packet sent to the main processor
Expressiveness: implemented the following as test cases
  Node ID-based addressing and routing (IP-like)
  Point-cast (send to a circular area specified as destination)
Current proof-of-concept prototype being done on Rockwell node
[Figure: Communication Subsystem containing the Radio Modem, GPS, and MicroController, with a Packet Classifier and Packet Modifier driven by Application-Defined Matching Rules & Actions.]
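To make the rule/action idea concrete, here is a minimal sketch of a classifier that matches on (offset, length) fields and applies an action. It is not the prototype's firmware: the `Cond`, `Rule`, and `classify` names, the three-byte packet layout (destination ID at byte 0, hop count at byte 1), and the "to_cpu" fallback label are all assumptions made for this example. Conditions within a rule are ANDed; an OR can be expressed by listing several rules with the same action.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Field = Tuple[int, int]  # (offset, length) in bytes


def read_field(packet: bytes, field: Field) -> int:
    offset, length = field
    return int.from_bytes(packet[offset:offset + length], "big")


def write_field(packet: bytearray, field: Field, value: int) -> None:
    offset, length = field
    packet[offset:offset + length] = value.to_bytes(length, "big")


@dataclass
class Cond:
    field: Field
    op: str      # "=", "<", ">", or "range"
    a: int
    b: int = 0   # upper bound, used only by "range"

    def matches(self, packet: bytes) -> bool:
        v = read_field(packet, self.field)
        if self.op == "=":
            return v == self.a
        if self.op == "<":
            return v < self.a
        if self.op == ">":
            return v > self.a
        if self.op == "range":
            return self.a <= v <= self.b
        raise ValueError(f"unknown operator {self.op!r}")


@dataclass
class Rule:
    conds: List[Cond]                           # all must match (AND)
    action: str                                 # "accept", "forward", "drop"
    adjust: Optional[Tuple[Field, int]] = None  # (field, delta) increment


def classify(packet: bytes, rules: List[Rule]) -> Tuple[str, bytes]:
    """Return (action, possibly modified packet); first matching rule wins."""
    for rule in rules:
        if all(c.matches(packet) for c in rule.conds):
            out = bytearray(packet)
            if rule.adjust is not None:
                field, delta = rule.adjust
                write_field(out, field, read_field(out, field) + delta)
            return rule.action, bytes(out)
    return "to_cpu", packet  # complex case: hand off to the main processor


# Hypothetical layout: byte 0 = destination node ID, byte 1 = hops remaining.
DST, HOPS = (0, 1), (1, 1)
MY_ID = 7
RULES = [
    Rule([Cond(DST, "=", MY_ID)], "accept"),
    Rule([Cond(HOPS, ">", 0)], "forward", adjust=(HOPS, -1)),
    Rule([Cond(HOPS, "=", 0)], "drop"),
]

if __name__ == "__main__":
    for pkt in (bytes([7, 3, 0xAA]), bytes([9, 3, 0xAA]), bytes([9, 0, 0xAA])):
        print(pkt.hex(), "->", classify(pkt, RULES))
```

Because the rules refer only to byte offsets and lengths, a new routing protocol would change the rule table rather than the classifier code, which is the property the key challenge above asks for.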
Power-aware RTOS Scheduling Under Deadline Constraints
Consider task set (period, WCET, deadline): {(10, 3, 10), (14, 7, 14)}
CPU utilization = 3/10 + 7/14 = 80%
Obvious power management strategies:
  Shutdown when idle: saves 20% power.
  Can we slow CPU by 20% (& reduce V) for more savings? NO, as deadlines will no longer be met.
  However, can slow by a factor of 14/13 and lower voltage to still meet deadlines, and shutdown during idle time: saves 22.5% in power.
Problem: current approaches use WCET (worst case execution time), and aim at not missing any deadline.
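The 14/13 figure can be checked with standard response-time analysis. The sketch below assumes preemptive fixed-priority (rate-monotonic) scheduling with the shorter-period task at higher priority; the slide does not name the scheduling policy for this example, so that is an assumption, and `meets_deadlines` is a helper written for this illustration. Exact fractions are used so that the boundary case is not disturbed by floating-point rounding.

```python
from fractions import Fraction
from math import ceil

# (period, WCET at full speed, deadline), highest priority first.
TASKS = [(10, 3, 10), (14, 7, 14)]


def meets_deadlines(tasks, slowdown) -> bool:
    """Response-time analysis with every WCET stretched by `slowdown`."""
    for i, (_, wcet, deadline) in enumerate(tasks):
        own = wcet * slowdown
        response = own
        while True:
            # Time taken by higher-priority tasks released within `response`.
            interference = sum(
                ceil(response / tasks[j][0]) * tasks[j][1] * slowdown
                for j in range(i)
            )
            new_response = own + interference
            if new_response > deadline:
                return False
            if new_response == response:
                break  # fixed point reached: worst-case response time
            response = new_response
    return True


if __name__ == "__main__":
    print("utilization:", Fraction(3, 10) + Fraction(7, 14))
    print("full speed:      ", meets_deadlines(TASKS, Fraction(1)))
    print("slowdown 14/13:  ", meets_deadlines(TASKS, Fraction(14, 13)))
    # Running at 80% speed stretches execution times by 1/0.8 = 5/4.
    print("slowdown 5/4:    ", meets_deadlines(TASKS, Fraction(5, 4)))
```

With a slowdown of 14/13 the second task finishes exactly at time 14 after two preemptions by the first task (7 + 3 + 3 = 13 units of full-speed work), so any larger slowdown misses its deadline under this policy.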
Reality #1: Significant Variation in Execution Times
WCET : BCET is typically >> 1, e.g.:
Program  Description                            BCET         WCET          WCET/BCET
Circle   Circle drawing                         431          15,958        37
DES      Data Encryption                        73,912       672,298       9.1
DJPEG    JPEG decompression 128x96 color        12,703,432   122,838,368   9.7
FDCT     JPEG forward DCT                       5,587        16,693        3
FFT      1024-point FFT                         1,589,026    3,974,624     2.5
Matcnt   Summation of 2 100x100 matrices        1,722,105    8,172,149     4.7
Piksrt   Insertion sort of 10 elements          236          5,862         24.8
Sort     Bubble sort of 500 elements            13,965       50,244,928    3598
Stats    Sum, mean, var of 2 1000-size arrays   1,007,815    2,951,746     2.9
But, execution time variations in sensor data are not random
temporal correlation in underlying physical signal
can attempt to predict!
Reality #2: Sensor Applications Tolerant to Deadline Misses
Computation deadline misses lead to data loss
Packet loss common in wireless links
  e.g. a wireless link of 1E-4 BER means packet loss rate of 4% for small 50 byte packets
  radio links in sensor networks often worse
Significant probability of error in sensor signals
  noisy sensor channels
Applications designed to tolerate noisy/bad data by exploiting spatio-temporal redundancy
  high transient losses acceptable if localized in time or space
If the communication is noisy, and applications are loss tolerant, is it worthwhile to strive for perfect noise-free computing?
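The 4% packet loss figure follows from the bit error rate if bit errors are assumed independent and any single bit error loses the packet (no error correction). Both are simplifying assumptions made for this sketch.

```python
def packet_loss_rate(ber: float, packet_bytes: int) -> float:
    """Probability that at least one bit in the packet is in error,
    assuming independent bit errors and no error correction."""
    bits = 8 * packet_bytes
    return 1.0 - (1.0 - ber) ** bits


if __name__ == "__main__":
    for size in (50, 200, 1000):
        print(f"{size:5d} bytes at BER 1e-4: "
              f"{100 * packet_loss_rate(1e-4, size):.1f}% packet loss")
```

For a 50 byte packet (400 bits) at a BER of 1E-4 this gives about 3.9%, matching the rounded 4% quoted above; larger packets fare considerably worse.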
Exploiting Execution-time Variation and Tolerance to Deadlines
Our strategy: predict execution time of task instance and dynamically scale voltage even more aggressively so as to minimize shutdown
Execution time prediction
  learn distribution of execution times (pdf)
  tasks with distinct modes can help the OS by providing a hint after starting, e.g. MPEG decode can tell the OS after learning whether the frame is I, P, or B
But, some deadlines are missed!
  adaptive control loop to keep deadlines missed under control
Typical result: 1.5-3x higher power saving compared to best conventional schemes with dynamic voltage, with < 1% deadlines missed
Provides adaptive power-fidelity trade-off
Power-aware RTOS Scheduler Implementation
RTOS predicts the remaining runtime (at max CPU speed) of a task instance
  calculated whenever the task instance enters the system, or is preempted
  based on run-times of previous instances of the task, and the run-time consumed so far
    e.g. weighted mean
    e.g. a coarse-grained discrete probability distribution of actual run time of each task is calculated, and used to calculate E[remaining_runtime | runtime_so_far]
  adaptively adjusts a multiplicative factor dependent on recent deadline misses
Voltage scheduling strategy
  if only one task remains in the system, and its deadline is earlier than the arrival of a new task, the CPU is slowed down such that the expected end time (based on predicted remaining run time) of the task equals its allowed deadline
  otherwise the CPU runs at maximum speed
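The predictor and the voltage scheduling rule can be sketched as follows. This is an illustrative reading of the bullets above, not the project's RTOS code: the class and function names, the choice to represent each histogram bin by its upper edge, and the way the multiplicative `factor` scales the prediction are all assumptions.

```python
from collections import Counter


class RuntimePredictor:
    """Coarse discrete distribution of past run times (at max CPU speed)."""

    def __init__(self, bin_width: float):
        self.bin_width = bin_width
        self.counts = Counter()  # bin index -> number of past instances

    def record(self, runtime: float) -> None:
        self.counts[int(runtime // self.bin_width)] += 1

    def expected_remaining(self, runtime_so_far: float, wcet: float) -> float:
        """E[remaining_runtime | runtime_so_far], using bin upper edges."""
        total = 0
        weighted = 0.0
        for b, n in self.counts.items():
            upper = (b + 1) * self.bin_width
            if upper > runtime_so_far:  # instance could still end in this bin
                total += n
                weighted += n * upper
        if total == 0:
            return wcet - runtime_so_far  # no usable history: assume WCET
        return weighted / total - runtime_so_far


def cpu_speed(predicted_remaining: float, now: float, deadline: float,
              next_arrival: float, tasks_in_system: int,
              factor: float = 1.0) -> float:
    """Speed as a fraction of maximum, following the strategy above."""
    if tasks_in_system == 1 and deadline < next_arrival:
        slack = deadline - now
        if slack <= 0:
            return 1.0
        # Stretch the predicted work so it is expected to end at the deadline.
        return min(1.0, factor * predicted_remaining / slack)
    return 1.0


if __name__ == "__main__":
    p = RuntimePredictor(bin_width=1.0)
    for observed in (2.5, 2.5, 6.5):  # hypothetical past run times
        p.record(observed)
    remaining = p.expected_remaining(runtime_so_far=4.0, wcet=10.0)
    print("predicted remaining:", remaining)
    print("speed:", cpu_speed(remaining, now=4.0, deadline=10.0,
                              next_arrival=12.0, tasks_in_system=1))
```

A control loop that raises `factor` after recent deadline misses and lowers it otherwise would correspond to the adaptive adjustment mentioned in the bullets; returning `wcet - runtime_so_far` unconditionally would reduce this to a non-predictive scheme.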
Current Status
Simulation tool for RTOS power management evaluation
  PARSEC discrete event simulation language
  Two communicating entities:
    Task Generator: generates task instances with run times according to a trace or a distribution
    RTOS: sets CPU speed by setting voltage and frequency; implements runtime predictor
  Variety of task sets from literature
  Note: non-predictive scheme is obtained by setting the predictor to always return WCET minus run time so far.
Implementation in progress in eCos RTOS
Sample Simulation Results #1 (17 Task Set; EDF Scheduling)
[Figure: Power Reduction in Avionics Task Set Using EDF Scheduling. x-axis: BCET/WCET (10 to 90); y-axis: % Power Reduction (0 to 60). Series: Low Power Scheme without Prediction and Adaption Strategies; Low Power Scheme with Prediction and without Adaption Strategy; Low Power Scheme with Prediction and Adaption Strategies.]
[Figure: Deadlines Missed in Avionics Task Set Using EDF Scheduling. x-axis: BCET/WCET (10 to 90); y-axis: % Deadlines Missed (0 to 0.6). Series: Low Power Scheme with Prediction and without Adaption Strategy; Low Power Scheme with Prediction and Adaption Strategies.]
Sample Simulation Results #2 (17 Task Set; RM Scheduling)
[Figure: Power Reduction in Avionics Task Set Using RM Scheduling. x-axis: BCET/WCET (10 to 100); y-axis: % Power Reduction (0 to 60). Series: Low Power Scheme without Prediction and Adaption Strategies; Low Power Scheme with Prediction and without Adaption Strategy; Low Power Scheme with Prediction and Adaption Strategies.]
[Figure: Deadlines Missed in Avionics Task Set Using RM Scheduling. x-axis: BCET/WCET (10 to 90); y-axis: % Deadlines Missed (0 to 0.6). Series: Low Power Scheme with Prediction and without Adaption Strategy; Low Power Scheme with Prediction and Adaption Strategies.]
Deployable Platform
[Figure: WINS node block diagram with blocks Sensor 1, Sensor 2, Sensor 3, ... Sensor n, Signal Conditioning, Signal Processing, Logic & Control, Embedded Radio, Power, and Memory.]
Current WINS Node: 2.5” X 2.5” X 4”
WINS Node in 2000: 1” x 1” x 1”
Leverage existing Rockwell Wireless Integrated Networked Sensor (WINS) technology based on the StrongARM.
Develop and implement controls and monitors to enable power management by the middleware and RTOS, e.g. cache sleep/on modes, processor sleep/idle/on, and peripheral idle/on modes.
Provide API abstraction to facilitate power management.
Advanced power-aware features will be implemented guided by experimental results obtained from the research platform.
Upgrade to higher-speed, lower power next-generation StrongARM when it becomes available.
JIT Power-aware Communications
AWGN: approximately 10 dB SNR requirement for 0.001% BER.
Rayleigh fading: approximately 45 dB SNR requirement for 0.001% BER.
[Figure: Bit Error Rate (log scale, 10^-8 to 10^0) versus Eb/N0 (0 to 50 dB) for AWGN and Rayleigh Fading channels.]
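The gap between the two channels can be reproduced approximately with the textbook closed-form expressions for coherent BPSK. The slide does not state which modulation its curves assume, so BPSK is an assumption here, as are the function names.

```python
import math


def db_to_linear(db: float) -> float:
    return 10.0 ** (db / 10.0)


def ber_bpsk_awgn(ebn0_db: float) -> float:
    """Coherent BPSK in AWGN: Pb = 0.5 * erfc(sqrt(Eb/N0))."""
    g = db_to_linear(ebn0_db)
    return 0.5 * math.erfc(math.sqrt(g))


def ber_bpsk_rayleigh(ebn0_db: float) -> float:
    """Coherent BPSK in flat Rayleigh fading, with g the average Eb/N0:
    Pb = 0.5 * (1 - sqrt(g / (1 + g)))."""
    g = db_to_linear(ebn0_db)
    return 0.5 * (1.0 - math.sqrt(g / (1.0 + g)))


if __name__ == "__main__":
    for ebn0_db in (10, 20, 30, 40, 45):
        print(f"Eb/N0 = {ebn0_db:2d} dB  "
              f"AWGN: {ber_bpsk_awgn(ebn0_db):.2e}  "
              f"Rayleigh: {ber_bpsk_rayleigh(ebn0_db):.2e}")
```

Under these assumptions the AWGN curve drops below a BER of 1E-5 (0.001%) at roughly 10 dB, while the Rayleigh curve, which falls only inversely with average SNR, needs about 44 to 45 dB.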
Assume 5 dB NF, R^4 path loss, 900 MHz carrier frequency, 100 kbps bitrate, and 10 dB link margin.
Transmit power is 12.5 dBm for AWGN case at 100 m.
Transmit power is 47.5 dBm for Rayleigh fading with no coding.
With coding the transmit power is reduced to 22.5 dBm, but the computation overhead is 100X for a K=9 rate ½ convolutional code.
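The transmit power figures can be approximately reproduced with a simple link budget. The sketch below assumes a 290 K noise temperature and free-space loss up to a 1 m reference distance with fourth-power loss beyond it; the slide gives neither of these, so they are assumptions, and `required_tx_power_dbm` is a name introduced for this example.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K
T0_KELVIN = 290.0          # assumed noise temperature
SPEED_OF_LIGHT = 299792458.0


def required_tx_power_dbm(ebn0_db: float, distance_m: float,
                          bitrate_bps: float = 100e3, nf_db: float = 5.0,
                          margin_db: float = 10.0, freq_hz: float = 900e6,
                          ref_m: float = 1.0, exponent: float = 4.0) -> float:
    # Thermal noise density in dBm/Hz (about -174 dBm/Hz at 290 K).
    n0_dbm_hz = 10.0 * math.log10(BOLTZMANN * T0_KELVIN * 1e3)
    # Received power needed to reach the target Eb/N0 at this bit rate.
    sensitivity_dbm = (n0_dbm_hz + 10.0 * math.log10(bitrate_bps)
                       + nf_db + ebn0_db)
    # Free-space loss to the reference distance, then R^exponent beyond it.
    wavelength_m = SPEED_OF_LIGHT / freq_hz
    ref_loss_db = 20.0 * math.log10(4.0 * math.pi * ref_m / wavelength_m)
    path_loss_db = ref_loss_db + 10.0 * exponent * math.log10(distance_m / ref_m)
    return sensitivity_dbm + margin_db + path_loss_db


if __name__ == "__main__":
    print(f"AWGN (10 dB):     {required_tx_power_dbm(10.0, 100.0):.1f} dBm")
    print(f"Rayleigh (45 dB): {required_tx_power_dbm(45.0, 100.0):.1f} dBm")
```

With these assumptions the 35 dB difference in required Eb/N0 between the AWGN and uncoded Rayleigh cases carries straight through to the 35 dB difference in transmit power (12.5 dBm versus 47.5 dBm).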
Reconfigurable Power-aware Communications Techniques
Traditional approaches
  Point solution, usually designed for the worst-case channel condition.
  Manage power at the link layer only, e.g. power control.
Proposed approach
  Provides adaptation of the physical layer and supports adaptation at higher protocol layers (e.g. routing).
  Utilizes reconfigurable technology (e.g. FPGA).
  Adapts not only digital processing but also analog processing.
Runtime reconfigurable library (100X power dynamic range)
  Direct-sequence spread-spectrum modem (adaptable processing gain).
  FEC encoder/decoder: block codes and convolutional codes.
  Un-equalized QAM, including BPSK and QPSK.
Reconfigurable analog processing (10-20X power dynamic range)
  Adapt the input bandwidth, spanning a range of 10 kHz to 1000 kHz.
  Configures mode of power amplifier (Class A and E/F).
Some Potential Operation Scenarios
Typical operation
  Good channel => uncoded transmission.
  Noisy channel => simple to complex FEC coding and/or interleaving. Decrease BW.
  Interference channel => increase processing gain.
Mission critical operation
  Un-equalized QAM, high BW, and Class A operation. Add FEC and processing gain as needed.
Sentry
  GMSK with Class E operation. Low BW. Add FEC and processing gain as needed.
Reconfigurable Radio Architecture
[Figure: Reconfigurable radio architecture with blocks Battery Lifetime Monitor, Channel Monitor, Reconfiguration Control, Communication Reconfigurable Modules, Storage, Interface, Modem, Analog Interface, and RF, connected through an API Interface to the Application, Middleware, and RTOS.]
Field Demonstrations
Technology Transfer & Commercialization
Collins PLGR LAN provides situation awareness to individual soldiers.
RSC’s Highly Deployable Remote Access (HiDRA) (hidra.rsc.rockwell.com)
Existing as well as future Rockwell products can greatly benefit from the power-aware technology developed under this program.