GREEN COMPUTING Power Consumption Basics in ICT Products Maziar Goudarzi
Feb 15, 2016
Outline
• Metrics
• Energy consumption in ICT products
• Some common energy optimization techniques
Acknowledgements: Some slides/parts from http://www.ida.liu.se/~TDDD50/
Electrical Units
Power Metrics
Performance related energy metrics
• Energy-per-instruction (EPI)
  – Energy spent to execute an instruction
    • Used to compare micro-architectural traits
    • Sometimes used to model software energy consumption
  – Not all instructions consume the same energy
• Application energy consumption
  – Power vs. time
Comparing CPU energies
• Example: same program
  – AMD CPU, 2 GHz, 150 W, 10 s
  – Intel CPU, 2.5 GHz, 200 W, 8 s
  Which one is better?
• Another (perhaps better) example: same program
  – Atom processor, 1.5 GHz, 10 W, 20 s
  – Core i7 processor, 2 GHz, 55 W, 5 s
  Which one is better?
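One way to start answering the question is to compare total energy, E = P · t, using only the numbers given in the two examples. This is a plain worked calculation under the assumption that the quoted wattage is the average power over the whole run; it says nothing yet about how much the shorter runtime is worth.

```latex
\begin{aligned}
E_{\text{AMD}}     &= 150\,\text{W} \times 10\,\text{s} = 1500\,\text{J} \\
E_{\text{Intel}}   &= 200\,\text{W} \times 8\,\text{s}  = 1600\,\text{J} \\
E_{\text{Atom}}    &= 10\,\text{W}  \times 20\,\text{s} = 200\,\text{J} \\
E_{\text{Core i7}} &= 55\,\text{W}  \times 5\,\text{s}  = 275\,\text{J}
\end{aligned}
```

On energy alone, the slower processor wins in both pairs, which is why a metric that also accounts for delay is needed.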
Performance related energy metrics
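• Energy delay product (EDP)
  – Encourages low consumption and fast runtime
  – Energy or delay increase → EDP increases

$\text{Energy} = \text{Watts} \times \text{runtime}$
$\text{Delay} = \text{runtime}$
$\text{EDP} = \text{Energy} \times \text{Delay} = \text{Watts} \times \text{runtime}^2$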
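The EDP definition can be applied to the four processors from the "Comparing CPU energies" slide. The sketch below is illustrative only: the `Run` class and its field names are hypothetical, and the wattage is assumed to be the average power over the run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One execution of the same program on one processor."""
    name: str
    power_w: float    # average power in watts (assumed constant over the run)
    runtime_s: float  # runtime (delay) in seconds

    @property
    def energy_j(self):
        # Energy = Watts * runtime
        return self.power_w * self.runtime_s

    @property
    def edp(self):
        # EDP = Energy * Delay = Watts * runtime^2
        return self.energy_j * self.runtime_s


# Numbers taken from the two slide examples.
runs = [
    Run("AMD", 150, 10),
    Run("Intel", 200, 8),
    Run("Atom", 10, 20),
    Run("Core i7", 55, 5),
]


def best_by_edp(candidates):
    """Return the run with the lowest energy delay product."""
    return min(candidates, key=lambda r: r.edp)
```

With these numbers the EDP values are 15000, 12800, 4000 and 1375 J·s, so EDP prefers the Intel CPU in the first pair and the Core i7 in the second, the opposite of a comparison on energy alone.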
Outline
• Metrics
• Energy consumption in ICT products
• Some common energy optimization techniques
Power Consumption Fundamentals
• Most widely used technology today
  – CMOS (Complementary Metal Oxide Semiconductor) technology
  – Technology name
    • Minimum feature size: 65nm, 45nm, …
    • Latest technology?
Power Consumption Fundamentals
• Elements of power consumption
  – Dynamic power
    • Dissipated when charging/discharging capacitors
    • Inevitable!
  – Static power
    • Leakage
    • Total waste!
    • Was negligible until recently
    • Increased with technology scaling (<180nm)
    • 20 to 40% in today's processors
• AMD Opteron X2: 300mm wafer, 117 chips, 90nm technology
• Opteron X4: 45nm technology
CMOS Leakage
• Transistor is not a perfect digital switch!
  – Subthreshold leakage
  – Gate leakage → high-k dielectric
  – Junction leakage
Subthreshold Leakage
• Subthreshold leakage depends on
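A commonly used first-order textbook model makes the dependencies explicit. It is offered here as an assumption, not as the lecture's own formula: the prefactor I_0 (process and geometry dependent) and the subthreshold slope factor n belong to that model rather than to these slides.

```latex
I_{sub} = I_0 \, e^{(V_{GS} - V_T)/(n\,v_T)} \left(1 - e^{-V_{DS}/v_T}\right),
\qquad v_T = \frac{kT}{q}
```

Under this model, leakage grows exponentially as the threshold voltage V_T is lowered, increases with temperature T through the thermal voltage v_T, and falls as the drain-source voltage V_DS is reduced.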
Outline
• Metrics
• Energy consumption in ICT products
• Some common energy optimization techniques
  – Static power reduction
  – Dynamic power reduction
Leakage reduction techniques
• Subthreshold leakage depends on
• Architectural techniques to reduce leakage
  – Stacking effect and gated Vdd
  – Drowsy effect
  – Threshold voltage manipulation
Stacking effect and gated Vdd
• Connection of transistors in series, source to drain
  – Reduces the Vds of each transistor
• Popular stacking technique: gated Vdd
  – Sleep transistor gates the ground (disconnects the power supply)
Gated Vdd for SRAM
• Dynamically Resized Instruction Cache
• Cache decay
  – Disable individual lines
  – Managed with counters to estimate dead lines
  – Disabled lines lose their state
  – Expensive management
Stefanos Kaxiras, Zhigang Hu, Margaret Martonosi, Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power, ISCA, 2001.
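The counter-based idea behind cache decay can be sketched in a few lines. This is a toy direct-mapped model under stated assumptions, not the mechanism from the cited paper: the class name, the single global `tick`, and the `decay_interval` parameter are all hypothetical.

```python
class DecayCache:
    """Toy direct-mapped cache whose idle lines are gated off (state lost)."""

    def __init__(self, num_lines, decay_interval):
        self.decay_interval = decay_interval
        self.tags = [None] * num_lines  # None means the line is powered off
        self.idle = [0] * num_lines     # ticks since the last access

    def access(self, address):
        """Access an address; return True on a hit, False on a miss."""
        index = address % len(self.tags)
        tag = address // len(self.tags)
        hit = self.tags[index] == tag
        self.tags[index] = tag  # a miss refills (and re-powers) the line
        self.idle[index] = 0    # any access resets the decay counter
        return hit

    def tick(self):
        """Advance every decay counter; gate off lines presumed dead."""
        for i, tag in enumerate(self.tags):
            if tag is None:
                continue
            self.idle[i] += 1
            if self.idle[i] >= self.decay_interval:
                self.tags[i] = None  # gated Vdd: contents are lost

    def powered_lines(self):
        """Number of lines that are still powered (and leaking)."""
        return sum(tag is not None for tag in self.tags)
```

A line that is gated off too early turns a would-be hit into a miss, which is the cost side of the trade: a short `decay_interval` saves more leakage but induces more extra misses.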
Drowsy effect
• Voltage scaling of idle memory cells
  – Two levels of supply voltage (Vdd and VddLow)
  – Transistors leak much less than with full Vdd
  – No loss of memory state
• High-level policies for drowsy caches
  – No need for complex management mechanisms
• Reading delay (cell voltage scaled back to Vdd)
  – Worst case is a few cycles of delay
  – Examples
    • Simple: whole cache periodically put in drowsy mode
    • Petit et al.: Simple with heuristics, such as not setting the Most Recently Used (MRU) line to drowsy mode
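The two policies named above can be illustrated with a small model. This is a sketch under assumptions: the class, the `window` and `wake_penalty` parameters, and the single-MRU bookkeeping are hypothetical simplifications, not the published designs.

```python
class DrowsyCache:
    """Toy model: lines keep their state in drowsy mode but wake up slowly."""

    def __init__(self, num_lines, window, wake_penalty=1, keep_mru_awake=False):
        self.drowsy = [False] * num_lines
        self.window = window                  # cycles between drowsy sweeps
        self.wake_penalty = wake_penalty      # extra cycles to restore Vdd
        self.keep_mru_awake = keep_mru_awake  # MRU heuristic on/off
        self.mru = None                       # most recently used line
        self.cycle = 0

    def access(self, line):
        """Access a line; return the extra latency in cycles."""
        penalty = self.wake_penalty if self.drowsy[line] else 0
        self.drowsy[line] = False  # line is back at full Vdd, state intact
        self.mru = line
        return penalty

    def tick(self):
        """Advance one cycle; periodically put the cache in drowsy mode."""
        self.cycle += 1
        if self.cycle % self.window == 0:
            for i in range(len(self.drowsy)):
                spare = self.keep_mru_awake and i == self.mru
                self.drowsy[i] = not spare
```

With `keep_mru_awake=False` this is the simple policy; with `keep_mru_awake=True` the most recently used line is spared, so an immediate re-access to it pays no wake-up delay.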
Threshold voltage manipulation
• The lower the VT, the higher the leakage
  – Technology scaling requires
    • Reducing Vdd to reduce power consumption and temperature
    • Reducing VT to reduce delay
• Architectural level techniques
  – Combination of high-VT and low-VT devices
    • High-VT: low leakage, long latency
    • Low-VT: high leakage, short latency
  – Gated Vdd using a high-VT device
Variable Threshold CMOS
• Body biasing
• Body effect to change device Vth
• Standby leakage reduction with maximum reverse bias
• Triple well structure
http://mtlweb.mit.edu/researchgroups/icsystems/pubs/tutorials/jkao_2002_iccad_I.pdf
Outline
• Metrics
• Energy consumption in ICT products
• Some common energy optimization techniques
  – Static power reduction
  – Dynamic power reduction
Capacitance and switching activity
• Capacitance and switching factor are intertwined
  $P = C \cdot V^2 \cdot A \cdot f$
• Capacitance (C)
  – Fixed at design time
  – Dependent on
    • Number of transistors
    • Interconnections
• Switching activity or factor (A)
  – Fraction between 0 and 1
  – Fraction of the capacitance charged/discharged each CPU cycle
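The dynamic power equation is easy to evaluate directly. The helper below is a minimal sketch; the function name and the example values (1 nF of switched capacitance, 1.2 V, 2 GHz) are illustrative assumptions, not figures from the lecture.

```python
def dynamic_power(capacitance_f, voltage_v, activity, frequency_hz):
    """Dynamic power P = C * V^2 * A * f, in watts.

    capacitance_f: total switchable capacitance C in farads
    voltage_v:     supply voltage V in volts
    activity:      switching factor A, a fraction between 0 and 1
    frequency_hz:  clock frequency f in hertz
    """
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity factor must be between 0 and 1")
    return capacitance_f * voltage_v ** 2 * activity * frequency_hz


# Illustrative values only: halving the activity factor halves the power.
busy = dynamic_power(1e-9, 1.2, 0.5, 2e9)
quiet = dynamic_power(1e-9, 1.2, 0.25, 2e9)
```

Because C is fixed at design time, the activity factor A is the term that architectural techniques against excess switching can actually reduce at run time.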
Capacitance
• Description of capacitance (Burd and Brodersen)
  $C_L = C_W + C_{fixed}$
  – $C_W$: product of technology constant and device width
    • Optimized at circuit level
  – $C_{fixed}$: capacitance of the interconnections
    • Optimized at architectural level
    • Reduction of wire length
    • Effective placement and routing (locality)
    • Break up large memory banks into smaller chunks
Excess switching activity
• Avoidable charge/discharge activity
• Types
  – Idle-unit
  – Idle-width
  – Idle-capacity
  – Parallel-speculative
  – Cacheable
  – Speculative
Idle-unit switching activity
• Triggered by clock activity in unused units
Idle-width switching activity
• Processor structures wider than needed
  – Example
    • Units with support for 64-bit operands
    • Most common operations use 16-bit operands
• Solutions
  – Adapt the width of the machine to the operands
  – Pack multiple narrow-width operations
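Width adaptation starts from detecting how many bits an operation really needs. The sketch below illustrates that detection step in software; the function names, the signed two's-complement interpretation, and the candidate widths are assumptions for illustration, since real hardware would do this with zero/sign detection logic.

```python
def required_width(value, widths=(8, 16, 32, 64)):
    """Smallest supported width (in bits) that holds a signed integer."""
    for width in widths:
        low = -(1 << (width - 1))
        high = (1 << (width - 1)) - 1
        if low <= value <= high:
            return width
    raise OverflowError("value does not fit in the widest datapath")


def operation_width(a, b):
    """Width the datapath must enable for a two-operand operation."""
    return max(required_width(a), required_width(b))


def narrow_fraction(operand_pairs, limit=16):
    """Fraction of operations that could run on a `limit`-bit datapath."""
    narrow = sum(1 for a, b in operand_pairs if operation_width(a, b) <= limit)
    return narrow / len(operand_pairs)
```

The fraction returned by `narrow_fraction` is a rough indicator of how much idle-width switching a given operand stream would cause on a fixed 64-bit datapath.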
Width adaptation
Idle-capacity switching activity
• Over-provisioned processor resources
  – Resource partitioning or re-sizing
• Grounds
  – Wire delay increases as the technology feature size decreases
  – Long wires imply
    • Unaffordable delay
    • High capacitance and consumption
  – Buffered wires reduce circuit delay
Complexity-adaptive structures
• Complexity-adaptive structures (Albonesi)
  – Trade latency and consumption against capacity
  – Structures become faster as they become smaller
• Solution
  – Partitions with tri-state buffers
• When structures are reduced
  – Faster processing
  – Less energy consumed
  – Suitable for SRAM
Parallel speculative switching activity
• Parallel activity is spent for performance
  – Associative caches
    • All but one of the associative ways fail to produce a hit
    • All ways are accessed in parallel for speed
  – Solution: smart way-access approaches
Phased Cache
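A phased cache is usually described as probing all tags first and only then reading the single data way that hit, trading extra access latency for fewer data-array reads. The sketch below counts array reads under that description; the function names and the simple read-count model are illustrative assumptions.

```python
def parallel_access(ways):
    """Conventional set-associative lookup: every way is read at once."""
    return {"tag_reads": ways, "data_reads": ways}


def phased_access(ways, hit):
    """Phased lookup: all tags first, then only the matching data way."""
    return {"tag_reads": ways, "data_reads": 1 if hit else 0}


def data_reads_saved(ways, accesses, hits):
    """Data-array reads avoided by phased access over a run of accesses."""
    parallel_total = ways * accesses  # every way read on every access
    phased_total = hits               # one data read per hit, none per miss
    return parallel_total - phased_total
```

For a 4-way cache with 100 accesses and 90 hits, this model gives 400 data reads for parallel access against 90 for phased access.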
Sequential cache
Cache Way Memorization
Upon failure
Voltage-Frequency Scaling
• Basic dynamic power equation:
$P = C \cdot V^2 \cdot A \cdot f$
  – Reducing the voltage reduces power quadratically
    • Maximum frequency is limited by voltage
  – Potential cubic reduction in power dissipation
    • Considering f and V together
  – Performance decreases linearly
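The cubic claim follows directly from the power equation if frequency is assumed to scale linearly with voltage. In the worked derivation below, the scaling factor s and the example value s = 0.8 are assumptions for illustration, as is the CPU-bound runtime model t' = t / s.

```latex
\begin{aligned}
P' &= C \cdot (sV)^2 \cdot A \cdot (sf) = s^3 \, P \\
s = 0.8 &\;\Rightarrow\; P' = 0.512 \, P \\
t' &= t / s \quad \text{(performance decreases linearly)} \\
E' &= P' \cdot t' = s^2 \, E \;\Rightarrow\; E' = 0.64 \, E \text{ for } s = 0.8
\end{aligned}
```

So power drops with the cube of the scaling factor, while the energy for a fixed amount of work drops only with its square, because the work takes longer.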
Dynamic voltage/frequency scaling (DVFS)
• Dynamic adjustment of voltage/frequency
  – Tradeoff between power dissipation and performance
• DVFS decision level
  – Hardware level
    • Exploits different timings of hardware components
  – Program level
    • Program behavior drives the decision
    • E.g. scale down when the program knows that it has to wait
  – System level (OS)
    • Idleness of the system drives the decision
    • Voltage/frequency scaled to eliminate idle periods
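A system-level policy of the "scale to eliminate idle periods" kind can be sketched as picking the lowest frequency that still covers the observed load. This is a simplified illustration, not the algorithm of any real operating system governor; the function name, the `target` headroom parameter, and the frequency list are hypothetical.

```python
def pick_frequency(available_mhz, current_mhz, utilization, target=0.9):
    """Choose the lowest frequency that keeps utilization at or below target.

    available_mhz: frequencies the processor supports, in MHz
    current_mhz:   frequency at which `utilization` was measured
    utilization:   busy fraction (0..1) observed over the last interval
    target:        desired busy fraction after scaling (leaves headroom)
    """
    # Work done per second, expressed as the frequency needed to do it
    # while staying at the target utilization.
    needed_mhz = utilization * current_mhz / target
    for freq in sorted(available_mhz):
        if freq >= needed_mhz:
            return freq
    return max(available_mhz)  # saturated: run as fast as possible
```

For example, a processor measured at 25% utilization while running at 2400 MHz needs only about 667 MHz of capacity, so with steps of 800, 1600 and 2400 MHz the sketch selects 800 MHz.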
Dynamic voltage/frequency scaling (DVFS)
• Examples of commercial systems
  – Intel SpeedStep
  – AMD PowerNow! (for laptops)
    • Cool'n'Quiet (for desktops and servers)
• Decision taken at system level
• Changes made through a specific CPU register
Enhanced Intel® SpeedStep® Technology for the Intel® Pentium® M Processor (White Paper), http://download.intel.com/design/network/papers/30117401.pdf
Extra Exercise
• Apply DVFS to the processor of your personal computer and measure its power consumption under different applications.
• Report the processor's power consumption separately from the power consumed by the other components.
• What effect do you observe?
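One possible starting point for the measurement, assuming a Linux machine whose kernel exposes the Intel RAPL powercap interface, is to sample the package energy counter and convert the difference to average power. The sysfs path below is an assumption about the local system (the domain index may differ, and reading it may require root); the helper names are hypothetical. The frequency policy itself would be changed separately, for example through the cpufreq files under `/sys/devices/system/cpu/`.

```python
import time
from pathlib import Path

# Assumed location of the package-0 RAPL domain; adjust for your system.
RAPL_DOMAIN = Path("/sys/class/powercap/intel-rapl:0")


def read_microjoules(path):
    """Read an integer microjoule value from a powercap sysfs file."""
    return int(path.read_text().strip())


def average_power_w(start_uj, end_uj, seconds, max_range_uj):
    """Average power in watts between two energy counter readings."""
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        delta_uj += max_range_uj  # the counter wrapped around once
    return delta_uj / 1e6 / seconds


def measure_package_power(seconds=1.0):
    """Sample the package energy counter over an interval (needs RAPL)."""
    max_range = read_microjoules(RAPL_DOMAIN / "max_energy_range_uj")
    start = read_microjoules(RAPL_DOMAIN / "energy_uj")
    t0 = time.monotonic()
    time.sleep(seconds)
    end = read_microjoules(RAPL_DOMAIN / "energy_uj")
    elapsed = time.monotonic() - t0
    return average_power_w(start, end, elapsed, max_range)
```

Running `measure_package_power()` while a workload executes at each frequency setting gives processor-only figures, which keeps the CPU separate from the rest of the system as the exercise asks.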
Coming Next
• Power Aware Computing
• Higher-level power reduction techniques