A Run-Time Feedback Based Energy Estimation Model for Embedded Systems
Selim Gürün, Chandra Krintz
Department of Computer Science, U.C. Santa Barbara
International Conference on Hardware/Software Codesign and System Synthesis (CODES-ISSS)
Seoul, Korea, October 22-25, 2006
CODES-ISSS’06 2
Power-Aware Execution: Big Picture
• Power-aware methods divide task execution into operations and prepare an execution plan for each
• Operation: smallest user-visible unit of execution
  • Typical operations: rendering a scene, translating a sentence, calculating a shortest path on a map
• Need to know the energy cost of each plan
Knowing future energy cost of operations requires profiling them at run-time
[Diagram: Identify Operations -> Profile at Runtime -> Predict Future Costs -> Develop Power-Aware Execution Strategy]
Outline
• Extant run-time power profiling techniques
  • Power profiling methodologies for embedded computers
• Proposed model
  • Overview
  • Model construction
  • Capturing system dynamics
• Evaluation
• Summary and Conclusion
Run-Time Energy Profiling: Overview
OS interfaces like ACPI:
+ Provide a simple API to battery voltage sensors
+ OK for different hw. power levels
- Very coarse
- Not precise
Execution time:
+ Simple to measure
+ Fast and precise
- Not correlated to power
- Not suitable when hw. power levels change: DVS, sleep
HPMs (hardware performance monitors):
+ Fast access
+ Quite accurate
- Architecture dependent
- Not designed for power estimation; many events missing
Run Time Energy Profiling: HPMs
• CPU counters provide unparalleled insight into program behavior
  • Cache, TLB misses
  • Instructions executed per cycle (IPC)
• How can we accurately gather program energy consumption by monitoring key parameters?
Use HPMs as pseudo CPU component access counters:
Energy Consumption = I Cache * a0 + D Cache * a1 + ALU * a2 + …
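The weighted sum above can be sketched in a few lines of Python. The counter names, counts, and coefficient values here are made-up placeholders, not measured XScale figures; only the linear form comes from the slide.

```python
# Hypothetical per-event energy coefficients (joules per event); placeholders only.
COEFFICIENTS = {
    "icache_access": 2.0e-9,
    "dcache_access": 3.0e-9,
    "alu_op": 1.0e-9,
}


def estimate_energy(counts, coefficients=COEFFICIENTS):
    """Return sum(count_i * a_i) over every counter named in `coefficients`."""
    return sum(a * counts.get(name, 0) for name, a in coefficients.items())


# Made-up counter readings for one sampling interval.
sample = {"icache_access": 1_000_000, "dcache_access": 400_000, "alu_op": 2_000_000}
energy_joules = estimate_energy(sample)
print(f"estimated energy: {energy_joules:.6f} J")
```

Keeping the coefficients in a dictionary makes it easy to swap in fitted values later without touching the estimation function.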
• Useful but not enough:
  • The CPU consumes only a portion of total energy; power-aware strategies need to know the full picture
  • Fails when hardware changes its behavior: DVS, sleep states
• A different strategy is needed!
Proposed Energy Profiling Model
[Diagram: Construct Power Model; Determine Model Coefficients; Measure Energy Consumption in Large Intervals and Compare to HPM Model Estimates; Update Model Coefficients; Predict Energy Consumption using HPM Model; Power-Efficient Execution Plan. Phase labels: Offline Analysis; Continuous model improvement at run-time; Fine-Grain Energy Estimation]
Case Study: Intel XScale on Stargate
• 32-bit XScale, 400 MHz
• 64 MB RAM
• Runs Familiar Linux
• No display
• Wireless 802.11
• Compact Flash
XScale Major HPM Events
Inst/Data cache misses
Data dependency stalls
Inst/Data TLB misses
Branch mispredicted
Instruction executed
SCL
Constructing Model
• Are there any correlations between HPM values and full system power consumption?
  • Absolutely! But some challenges exist.
• Good correlation in memory/CPU subsystem
  • High IPC -> CPU intensive application
  • High cache misses/hits -> memory intensive application
• But I/O is the problem!
  • Some heuristics possible, e.g. low memory activity and low IPC -> possible I/O wait state
  • Better to use software counters embedded into drivers
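The heuristics in the bullets above could be coded roughly as follows. This is only a sketch: the function name, the labels, and every threshold are assumptions chosen for illustration, not values from the talk.

```python
def classify_interval(instructions, cycles, cache_misses,
                      ipc_high=0.8, ipc_low=0.2, miss_rate_high=0.02):
    """Label one sampling interval from raw HPM counts.

    All thresholds are illustrative assumptions.
    """
    ipc = instructions / cycles if cycles else 0.0
    miss_rate = cache_misses / instructions if instructions else 0.0
    if miss_rate >= miss_rate_high:
        return "memory-intensive"      # heavy cache miss activity
    if ipc >= ipc_high:
        return "cpu-intensive"         # high IPC
    if ipc <= ipc_low:
        return "possible-io-wait"      # low memory activity and low IPC
    return "mixed"


print(classify_interval(9_000_000, 10_000_000, 10_000))
print(classify_interval(3_000_000, 10_000_000, 150_000))
print(classify_interval(1_000_000, 10_000_000, 500))
```

As the slide notes, such a guess about I/O is weaker than counting I/O events directly in the drivers.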
Model Coefficients
E = X1 a1 + X2 a2 + X3 a3 + …
Xi: independent variables
ai: coefficients
• Estimate coefficients using least squares linear regression (LSQ)
  • Stable and simple
  • Linearity assumption
LSQ Model: Which variables?
Only major:
+ Efficient, clear
+ Easier to understand
- Less accurate
All related:
+ More accurate
- Run-time overhead
- Modeling difficulties due to variable dependencies
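A least squares fit of the coefficients can be illustrated in plain Python by solving the normal equations. The data below are synthetic and the "true" coefficients are invented so the fit can be checked; this is a sketch of the general technique, not the authors' implementation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(rows, energies):
    """Fit E = X1*a1 + X2*a2 + ... via the normal equations (X^T X) a = X^T E."""
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xte = [sum(r[i] * e for r, e in zip(rows, energies)) for i in range(n)]
    return solve_linear_system(xtx, xte)


# Synthetic intervals: (instructions, cache misses) in millions; made-up numbers.
rows = [(10.0, 0.1), (10.0, 0.5), (8.0, 0.9), (12.0, 0.2), (9.0, 0.7)]
true_a = (0.5, 4.0)  # hypothetical coefficients used only to generate the data
energies = [x1 * true_a[0] + x2 * true_a[1] for x1, x2 in rows]
fitted = least_squares(rows, energies)
print(fitted)
```

With noise-free data the fit recovers the generating coefficients up to rounding; with real measurements the residual reflects both noise and any violation of the linearity assumption.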
Parameter Selection & Dependencies
• Hard to include all variables: too many parameters clutter the model
• Parameter dependencies -> unstable parameter estimations
  • E.g. Volume = a0 + a1 * pounds + a2 * grams
• Work-around is non-trivial; HPM characteristics, e.g.:
  • TLB miss -> more CPU cycles & cache misses
  • Memory stall -> fewer instructions executed
Core Clock Cycles
Bytes Transmitted
Bytes Received
Packets Transmitted
Packets Received
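The pounds/grams example shows why dependent parameters make estimates unstable: when one variable is a fixed multiple of another, very different coefficient sets fit the data equally well. A small sketch, with invented sample weights and coefficient values:

```python
GRAMS_PER_POUND = 453.59237


def predict(a0, a1, a2, pounds):
    """Evaluate a0 + a1 * pounds + a2 * grams, with grams derived from pounds."""
    grams = pounds * GRAMS_PER_POUND
    return a0 + a1 * pounds + a2 * grams


weights_in_pounds = [1.0, 2.5, 4.0, 10.0]  # made-up sample points

# Two very different coefficient sets...
all_on_pounds = (0.0, 2.0, 0.0)
all_on_grams = (0.0, 0.0, 2.0 / GRAMS_PER_POUND)

# ...give the same predictions, so no data set can tell them apart.
pred_pounds = [predict(*all_on_pounds, p) for p in weights_in_pounds]
pred_grams = [predict(*all_on_grams, p) for p in weights_in_pounds]
max_gap = max(abs(u - v) for u, v in zip(pred_pounds, pred_grams))
print(max_gap)
```

HPM events are rarely perfectly dependent like this, but partial dependence (for example TLB misses that also raise cycle counts) has a similar, milder effect on the fitted coefficients.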
Run-Time Model Improvement
• Global coefficients
  • Compute using off-line model
• Continuously update coefficients
  • Improve using most recent data
  • Gradually phase out previous measurements
• Recursive least squares with exponential decay
  • Smaller decay factor -> more agile
[Diagram: Global Coefficients -> Measure Power -> Update with RLS]
Model parameters: decay factor, update period, measurement error
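Recursive least squares with an exponential decay (often called a forgetting factor) can be sketched as below. This is the textbook form of the update, written in plain Python; the class name, the initial variance, and the synthetic data are assumptions, and the talk does not say which variant the authors used.

```python
class RecursiveLeastSquares:
    """RLS estimator with an exponential decay (forgetting) factor."""

    def __init__(self, initial_coefficients, decay=0.95, initial_variance=1e6):
        n = len(initial_coefficients)
        self.theta = list(initial_coefficients)  # e.g. off-line global coefficients
        self.decay = decay                       # 0 < decay <= 1; smaller forgets faster
        self.p = [[initial_variance if i == j else 0.0 for j in range(n)]
                  for i in range(n)]

    def predict(self, x):
        return sum(t * xi for t, xi in zip(self.theta, x))

    def update(self, x, measured):
        """Fold one (counter vector, measured energy) pair into the model."""
        n = len(x)
        px = [sum(self.p[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = self.decay + sum(x[i] * px[i] for i in range(n))
        gain = [v / denom for v in px]
        error = measured - self.predict(x)
        self.theta = [t + g * error for t, g in zip(self.theta, gain)]
        # P is symmetric, so x^T P equals (P x)^T.
        self.p = [[(self.p[i][j] - gain[i] * px[j]) / self.decay
                   for j in range(n)] for i in range(n)]
        return error


rls = RecursiveLeastSquares([0.0, 0.0], decay=0.95)
true_a = (2.0, 0.5)  # hypothetical coefficients used only to generate data
for i in range(60):
    x = (float(i % 7 + 1), float((3 * i) % 5 + 1))
    rls.update(x, true_a[0] * x[0] + true_a[1] * x[1])
print(rls.theta)
```

A decay of 1.0 weights all past measurements equally; lowering it makes the model track changes faster at the cost of being more sensitive to noisy readings.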
Feedback Source: DS 2760
• Measures current flow in and out of battery
• Internally: a small A/D converter attached to a high-precision internal resistor
Pros/Cons:
+ Highly available, e.g. iPAQs, sensor network gateways, cell phones
- Not precise enough for monitoring task energy consumption: 0.25 mAh error in each reading
- Slow, one-wire serial interface
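To get a feel for the per-reading error, the charge figure can be converted to energy. The conversion 1 mAh = 3.6 C is exact; the 3.7 V battery voltage is an assumption (a typical nominal single-cell Li-ion value), not something stated on the slide.

```python
def mah_to_joules(milliamp_hours, volts):
    """Convert charge in mAh to energy in joules at a fixed voltage."""
    coulombs = milliamp_hours * 3.6  # 1 mAh = 3.6 C
    return coulombs * volts


ASSUMED_BATTERY_VOLTS = 3.7  # assumption: nominal single-cell Li-ion voltage
reading_error_joules = mah_to_joules(0.25, ASSUMED_BATTERY_VOLTS)
print(f"{reading_error_joules:.2f} J")
```

Under that voltage assumption, each reading can be off by a little over 3 J.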
Stargate and Our Evaluation Bench
[Figure: evaluation bench. Labels: PowerTool; VPerfmon; VPMon; SCL; High-precision Data Acquisition Device; Programmable Power Supply]
Methodology
• Collect energy consumption at fixed intervals
  • Every 10 million instructions (a so-called interval)
• Validate model accuracy on imprecise measurement data
  • Inject uniformly distributed random error
  • Evaluate various precision (error) levels: 1X to 8X
  • Predict energy consumption of each interval
  • Continuously improve model parameters every 10M * K intervals
• Use a large group of workloads
  • Computational benchmarks
  • Computational + communication oriented benchmarks
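Injecting uniformly distributed error into otherwise precise measurements might look like the sketch below. Two things are assumptions here: that the 1X bound equals the 0.25 mAh per-reading error quoted for the battery monitor, and that a precision level of nX divides that bound by n. The slides do not define the levels, so treat the scaling as a guess.

```python
import random

BASE_ERROR = 0.25  # assumption: 1X error bound, in the same unit as the readings


def inject_error(true_value, precision_level, rng):
    """Add uniform noise in [-bound, +bound], with bound = BASE_ERROR / precision_level."""
    bound = BASE_ERROR / precision_level
    return true_value + rng.uniform(-bound, bound)


rng = random.Random(42)  # fixed seed so runs are repeatable
true_readings = [10.0, 12.5, 9.75, 11.0]  # made-up precise measurements
noisy = {
    level: [inject_error(v, level, rng) for v in true_readings]
    for level in (1, 2, 4, 8)
}
for level, values in noisy.items():
    print(level, [round(v, 3) for v in values])
```

Passing the random generator in explicitly keeps the experiment reproducible and lets each precision level be rerun with the same seed.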
Static vs. Adaptive Models
[Figure: bar chart of Error % (y-axis 0% to 20%) per benchmark: bisort, em3d, gsmdecode, gsmencode, jpegdecode, jpegencode, life, mpeg2decode, mpeg2encode, pvkx, pvkxb, pvnx, treeadd, AVERAGE. Legend: 0.9, Static]
Average Error Rates
Error rates and interval sizes, Simple Model (columns: measurement precision)
Interval Size   1X      2X      4X      8X      Best
100             53.3%   29.9%   14.5%   16.9%   3.8%
200             68.8%   24.1%   12.9%   7.3%    2.7%
400             33.0%   22.0%   9.1%    7.7%    2.8%
Average Error Rates-Complex Model
Interval Size   8X (Simple)   8X (Complex)   Best (Simple)   Best (Complex)
100             16.9%         28.1%          3.8%            4.3%
200             7.3%          33.3%          2.7%            3.8%
400             7.7%          24.0%          2.8%            4.1%
Measurement imprecision reduces the quality of the complex model more than that of the simple one!
Related Work
• High-End CPU Power Models
  • Define CPU component access rate using HPM access heuristics
  • Power consumption of OS calls as a function of IPC
• Embedded CPU Power Models
  • Five HPM counters for XScale
  • Also evaluated memory model
• Memory models
  • UltraSparc memory subsystem
• All of the above are static models
• Power profiling setups
  • Powerscope
Summary & Conclusions
• Our goal: an accurate, efficient run-time power profiling system
  • Hardware counters are key
  • Define software counters for I/O
  • Smart battery monitors expose dynamics in power behavior
  • We propose a hybrid system that combines both
• Lessons learned
  • Dynamic models are much better than static ones in power modeling
  • Models should decay old measurements conservatively when measurement errors are present
  • Measurement errors in the presence of multicollinearity can be