Adaptive Video Coding to Reduce Energy on General Purpose Processors
Post on 02-Jan-2016
Daniel Grobe Sachs, Sarita Adve, Douglas L. Jones
University of Illinois at Urbana-Champaign
http://www.cs.uiuc.edu/grace
grace@cs.uiuc.edu
Introduction
Wireless multimedia increasingly common
Recent advances reduce constraints:
  – 2GHz+ processors
  – High-speed wireless networks
Systems now energy limited
  – Energy management essential
Adaptation
Adaptation key to energy management
  – Hardware adaptation already common
  – Software adaptation also possible
Challenges
  – How do we control adaptations?
  – How do we coordinate different adaptations?
GRACE Project
Target mobile multimedia devices
Coordinated adaptation of all system layers
  – Hardware, application, network, OS
Complete cross-layer adaptation framework
  – Preserves separation between layers
Goals of this work
Target wireless video transmission
  – Adapt application: Adaptive video encoder
  – Adapt hardware: Adaptive CPU
Implement part of GRACE framework
  – Trade off between CPU and network energy
Contributions
Apply existing adaptive-CPU research
Energy-adaptive video encoder
  – Trades off between network and CPU energy
  – Allows adaptation with fixed QoS
Cross-layer adaptation framework
  – Coordinates application and CPU adaptation
  – Preserves logical separation between layers
20% energy savings over existing systems
Presentation Overview
System model
System architecture and design
Cross-layer adaptation process
Results
System Model
Total Energy = CPU Energy + Network Energy
[Block diagram: Video Capture, Adaptive Video Encoder, Adaptive CPU, Control, Wireless Network]
CPU Hardware Adaptation [Micro]
Reduce performance to save energy
Voltage and frequency scaling
  – Lower frequency → lower voltage → lower energy
Architecture adaptation
  – Issue width
  – Active functional units (ALUs, etc.)
  – Instruction window size
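The link between frequency, voltage, and energy can be illustrated with the standard first-order CMOS dynamic-power model. The slides do not state this model; it is a textbook approximation that ignores leakage, and the symbols (activity factor α, switched capacitance C, supply voltage V, clock frequency f) are notation introduced here for illustration.

```latex
P_{\mathrm{dyn}} \approx \alpha C V^{2} f,
\qquad
E_{\mathrm{cycle}} = \frac{P_{\mathrm{dyn}}}{f} \approx \alpha C V^{2}
```

Under this model, lowering f alone reduces power but not energy per cycle. The saving comes from the lower supply voltage that the reduced frequency permits, since energy per cycle scales roughly with V².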
Adaptive Encoder
Based on TMN H.263 encoder
  – Changed to logarithmic motion search
Encoder adapts for energy
  – Trade off between network and CPU energy
  – More computation → fewer bits
Adapt Motion Search and DCT
  – Computationally expensive
  – Elimination affects primarily rate
Adaptive Encoder Details
Motion Search (MS) and DCT thresholds
  – Terminate MS early when SAD is under threshold
  – Skip DCT if SAD of block is under threshold
Transmit “DCT flag” bit for each 8x8 block
  – Extends H.263 standard
Adaptation effect: setting thresholds at infinity
  – Reduces CPU load by ~50%
  – Increases data rate by 2x or more
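The two threshold rules above can be sketched in a few lines of Python. This is a minimal illustration, not the TMN encoder's code: the function names, the flat list of candidate blocks (standing in for the logarithmic search order), and the test data are all assumptions made for the example. How a block is coded once its DCT is skipped is not described in the slides and is left out.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def motion_search(current, candidates, ms_threshold):
    """Scan candidate reference blocks in order and stop early once the
    best SAD so far drops under ms_threshold (hypothetical interface).

    Returns (best_index, best_sad, candidates_evaluated).
    """
    best_index, best_sad, evaluated = -1, float("inf"), 0
    for index, candidate in enumerate(candidates):
        score = sad(current, candidate)
        evaluated += 1
        if score < best_sad:
            best_index, best_sad = index, score
        if best_sad < ms_threshold:
            break  # good enough: skip the remaining candidates
    return best_index, best_sad, evaluated


def use_dct(block_sad, dct_threshold):
    """Per-block "DCT flag": True means run the DCT, False means skip it."""
    return block_sad >= dct_threshold


def flat(value, size=8):
    """Helper for the example: a size x size block filled with one value."""
    return [[value] * size for _ in range(size)]


current = flat(0)
candidates = [flat(3), flat(1), flat(0)]  # SADs against current: 192, 64, 0
print(motion_search(current, candidates, 100))           # (1, 64, 2)
print(motion_search(current, candidates, float("inf")))  # (0, 192, 1)
print(use_dct(64, 100), use_dct(192, 100))               # False True
```

With both thresholds at infinity the search stops after the first candidate and no block runs the DCT, which is the least-computation setting the slide describes.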
Adaptation Control
When do we adapt?
  – Adapt before every frame
What configurations do we choose?
  – Must minimize total CPU+network energy
  – Must complete frame within its allocated time
How do we find the optimal configurations?
Optimization
Application, CPU reconfiguration linked
  – Application reconfiguration changes workload
  – CPU reconfiguration changes performance
  – App config affects optimal CPU configuration … and vice versa
Two-stage approach
  1. For each app config, find CPU config, energy
  2. Pick lowest-energy application configuration
Optimization Algorithm
1. For each app config, find
   Best CPU config
     – completes in time, with least energy [MICRO’01]
   CPU energy
     = Instruction count x Energy per instruction [MICRO’01]
     (requires instruction count)
   Network energy
     = Byte count x Energy per byte [WaveLAN measured]
     (requires byte count)
   Total energy = CPU energy + network energy
2. Pick app config with lowest total energy
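The two steps above can be sketched as a small Python routine. This is an illustrative sketch under stated assumptions, not the GRACE implementation: the `CpuConfig` fields, the function names, and all numbers are hypothetical, and the CPU model (a fixed throughput and a fixed energy per instruction for each configuration) is a simplification chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuConfig:
    name: str
    instr_per_sec: float     # hypothetical sustained throughput
    joules_per_instr: float  # hypothetical energy per instruction


def best_cpu_config(instr_count, deadline_s, cpu_configs):
    """Lowest-energy CPU config that still finishes the frame in time."""
    feasible = [c for c in cpu_configs
                if instr_count / c.instr_per_sec <= deadline_s]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.joules_per_instr)


def choose_configuration(predictions, cpu_configs, deadline_s, joules_per_byte):
    """predictions: {app_config: (predicted_instr_count, predicted_byte_count)}.

    Stage 1 fills an energy table, one entry per app config.
    Stage 2 picks the entry with the lowest total energy.
    """
    table = {}
    for app_config, (instr_count, byte_count) in predictions.items():
        cpu = best_cpu_config(instr_count, deadline_s, cpu_configs)
        if cpu is None:
            continue  # this app config cannot meet the frame deadline
        cpu_energy = instr_count * cpu.joules_per_instr
        net_energy = byte_count * joules_per_byte
        table[app_config] = (cpu_energy + net_energy, cpu)
    if not table:
        raise ValueError("no configuration meets the frame deadline")
    best_app = min(table, key=lambda app: table[app][0])
    total_energy, cpu = table[best_app]
    return best_app, cpu, total_energy


cpus = [CpuConfig("slow", 1e8, 1e-9), CpuConfig("fast", 4e8, 3e-9)]
predicted = {"full_search": (3e7, 1000), "no_search": (8e6, 3000)}
app, cpu, energy = choose_configuration(predicted, cpus, 0.1, 1e-5)
print(app, cpu.name, energy)
```

In this made-up example the cheaper encoder setting sends more bytes but fits on the slow CPU configuration, so it wins on total energy. With a more expensive network or a cheaper CPU the choice would flip.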
Adaptation Process: Stage 1
[Diagram, built up over several slides. For each application configuration (App. Conf. 1, ...): predict the next instruction count (CPU) and the next byte count (Net). The CPU Optimizer finds the CPU configuration; the CPU Energy Estimator predicts CPU energy and the Network Energy Estimator predicts network energy. The two predictions are summed (+) and entered in the app configuration energy table: Conf 1 Energy, Conf 2 Energy, Conf 3 Energy, ..., Conf n Energy.]
Adaptation Process: Stage 2
[Diagram, built up over several slides. From the energy table (Conf 1 Energy, Conf 2 Energy, Conf 3 Energy, ..., Conf n Energy), pick the lowest energy. The chosen configuration goes to the CPU Adaptor and the Application Adaptor. Then: capture, encode, and transmit frame.]
Predictors
How do we predict instructions and bytes?
  – Fixed software → use previous frame data
  – Adaptive software → no longer works!
Solution: Offline profiling
  – Encode reference sequences offline
  – Transition randomly between app. configs
  – Fit predictors to transitions between configs
    – Map last instruction, bytes to new app. config
    – Linear, 1st-order predictors
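The offline-profiling idea can be sketched as one least-squares line per (old config, new config) transition. This is a minimal sketch, not the authors' code: it reads "linear, 1st-order" as `next = slope * last + intercept` applied to the previous frame's count, which is an assumption, and the function names and profile data are invented for the example. The same routine would be run once for instruction counts and once for byte counts.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Needs at least two samples with distinct x values.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_transition_predictors(samples):
    """samples: iterable of (old_config, new_config, last_count, next_count)
    collected while encoding reference sequences offline.

    Returns {(old_config, new_config): (slope, intercept)}.
    """
    grouped = defaultdict(lambda: ([], []))
    for old_cfg, new_cfg, last_count, next_count in samples:
        xs, ys = grouped[(old_cfg, new_cfg)]
        xs.append(last_count)
        ys.append(next_count)
    return {key: fit_line(xs, ys) for key, (xs, ys) in grouped.items()}


def predict(predictors, old_cfg, new_cfg, last_count):
    """Map the last frame's count under old_cfg to a prediction under new_cfg."""
    slope, intercept = predictors[(old_cfg, new_cfg)]
    return slope * last_count + intercept


# Synthetic profile for one transition, A -> B (made-up numbers).
profile = [("A", "B", 100, 210), ("A", "B", 200, 410), ("A", "B", 300, 610)]
predictors = fit_transition_predictors(profile)
print(predict(predictors, "A", "B", 400))  # about 810 for this synthetic data
```

At run time only the table lookup and one multiply-add are needed per candidate configuration, which keeps the per-frame adaptation cost small.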
Experiments
RSIM CPU simulator
  – State-of-the-art CPU, memory
  – Princeton Wattch energy model
  – Reported energy typical of modern CPUs
Simulation conditions:
  – Fixed and adaptive CPU
  – Fixed and adaptive software
  – Foreman sequence
Fixed vs Adaptive Systems
Adaptive hardware saves 70% over fixed system
Adaptive application saves
  – 30% on fixed hardware
  – 20% on adaptive hardware (total savings of 80%)
[Bar chart: Energy (J), y-axis 0 to 35, each bar split into Net and CPU. Bars: Fixed System, Adaptive S/W, Adaptive H/W, Adaptive Sys. Values shown on the chart, in descending order: 30.49, 21.23, 7.36, 6.25.]
Algorithm Comparison
Baseline: Fixed software, adaptive hardware
Adaptive software:
  – Adaptive DCT/motion thresholds
  – Instruction, byte count for next frame predicted
Oracle
  – Instruction and byte count for next frame exact
Adapt-Once
  – Adapt once at start of encoding
  – Minimize total energy across entire sequence
[Bar chart "Algorithm Comparison": Energy (J), y-axis 0 to 8, each bar split into Net and CPU. Bars: Fixed, Adapt Once, Adaptive, Oracle. Values shown on the chart: 7.36, 6.55, 6.09, 6.25.]
Energy consumption of Adaptive within 3% of Oracle
  – Simple predictors sufficient for energy savings
Adaptive saves 5% over Adapt-Once
  – Frame-by-frame adaptation can save energy
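The two percentages can be checked against the values printed on the chart. The extracted slide does not pair values with bars, so the mapping used here is an assumption: Adapt-Once at 6.55 J, Adaptive at 6.25 J, and Oracle at 6.09 J.

```latex
\frac{E_{\text{Adaptive}} - E_{\text{Oracle}}}{E_{\text{Oracle}}}
  = \frac{6.25 - 6.09}{6.09} \approx 2.6\%,
\qquad
\frac{E_{\text{Adapt-Once}} - E_{\text{Adaptive}}}{E_{\text{Adapt-Once}}}
  = \frac{6.55 - 6.25}{6.55} \approx 4.6\%
```

Under that reading, both results agree with the rounded figures of 3% and 5% on the slide.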
Other test cases
Low Power CPU
  – Network energy dominated
  – Software adaptation did not save energy
Carphone
  – Little inter-frame variation
  – One-shot adaptation was sufficient
    – Adapt-Once, Adaptive, Oracle same energy
  – Adaptive software saved ~15%
Conclusions
A new framework for coordinated CPU/application adaptation
  – Combined benefits of both adaptations
  – Preserves separation between layers
Adaptive applications save energy:
  – Up to 20% on adaptive hardware
  – Up to 30% on fixed hardware