Keeping Hot Chips Cool: Thermal Management for Green Computing
Yang Ge
Professor Qinru Qiu
Dec 19, 2015
Outline
• Background
– Need for green computing
– Adverse effects of high temperature
– Thermal management techniques
• Ongoing project
– Power and thermal management for the Single-chip Cloud Computer (SCC)
The need for green computing
• Computers account for 3% of US energy use
– Saving 1% of data center energy amounts to more than saving a power plant
• Each computer generates 1 ton of CO2 every year
– Equivalent to the CO2 emission of a car driving a round trip between New York and Los Angeles
Power and Cost for Cooling Systems
• The cooling system dissipates a large amount of energy
– Cooling fan power can reach 51% of the overall server power budget
• Cooling is expensive in large data centers
– The total cooling costs for large data centers can run into tens of millions of dollars
Figure: IBM P670 server power breakdown (Fans 51%, CPU 24%, Mem 20%, Other 6%)
Adverse effects of high temperature on VLSI chips
• Reduces system reliability and can cause permanent device failure
• Leakage power consumption doubles with every 9 °C increase
• Requires higher fan speed, which can shorten fan lifetime
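The doubling rule for leakage power can be turned into a quick estimate. The sketch below is only an illustration of that rule of thumb: the function name, the reference power, and the reference temperature are assumptions, not values from the slides.

```python
def leakage_power(p_ref_watts, temp_c, ref_temp_c, doubling_step_c=9.0):
    """Estimate leakage power at temp_c.

    Assumes leakage doubles for every doubling_step_c degrees above
    ref_temp_c (and halves for every step below it). p_ref_watts is the
    leakage measured at ref_temp_c; both are hypothetical inputs.
    """
    return p_ref_watts * 2.0 ** ((temp_c - ref_temp_c) / doubling_step_c)


# Hypothetical chip leaking 1 W at 60 °C, evaluated at a few temperatures.
for t in (60.0, 69.0, 78.0):
    print(t, leakage_power(1.0, t, 60.0))
```

Under this model, an 18 °C rise quadruples leakage, which is why a hotter chip also burns more power and heats up further.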
Thermal Management Techniques
Offline Techniques
Online Techniques
Temperature aware scheduling
Dynamic voltage frequency scaling
Temperature aware task migration
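As a rough illustration of how dynamic voltage frequency scaling can react to temperature, the sketch below steps a core's frequency down when it is hot and back up when it has cooled. It is a generic threshold policy with hysteresis, not the method used in this project; the frequency levels, thresholds, and function name are all hypothetical.

```python
# ASSUMPTION: illustrative frequency levels, slowest first.
FREQ_LEVELS_MHZ = [400, 533, 800]


def next_freq_index(current_index, temp_c, hot_c=80.0, cool_c=70.0):
    """Pick the next frequency level from the current temperature.

    Steps down one level at or above hot_c, steps up one level at or
    below cool_c, and holds the level in between (hysteresis band).
    Thresholds are hypothetical.
    """
    if temp_c >= hot_c and current_index > 0:
        return current_index - 1
    if temp_c <= cool_c and current_index < len(FREQ_LEVELS_MHZ) - 1:
        return current_index + 1
    return current_index


# Walk through a made-up temperature trace, starting at the top level.
level = len(FREQ_LEVELS_MHZ) - 1
for temp in (75.0, 82.0, 85.0, 78.0, 68.0):
    level = next_freq_index(level, temp)
    print(temp, FREQ_LEVELS_MHZ[level])
```

The gap between the two thresholds keeps the controller from oscillating between levels when the temperature hovers near a single cutoff.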
Overview of SCC Architecture
• 24 tiles arranged in a 6×4 array
• 2 CPUs on each tile
• A router associated with each tile
• 4 memory controllers connect to on-board memory
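The tile layout implies 48 cores in total. A minimal sketch of the core-to-tile mapping follows; it assumes 6 columns by 4 rows, row-major tile numbering, and consecutive core IDs sharing a tile. That numbering scheme and the function name are assumptions for illustration, not taken from the slides.

```python
TILE_COLS = 6        # ASSUMPTION: 6 columns by 4 rows
TILE_ROWS = 4
CORES_PER_TILE = 2

NUM_TILES = TILE_COLS * TILE_ROWS
NUM_CORES = NUM_TILES * CORES_PER_TILE


def core_to_tile(core_id):
    """Return the (x, y) tile coordinate that hosts core_id.

    ASSUMPTION: cores 2k and 2k+1 sit on tile k, and tiles are
    numbered row by row starting from (0, 0).
    """
    if not 0 <= core_id < NUM_CORES:
        raise ValueError("core_id out of range")
    tile = core_id // CORES_PER_TILE
    return (tile % TILE_COLS, tile // TILE_COLS)


print(NUM_TILES, NUM_CORES)
print(core_to_tile(0), core_to_tile(13), core_to_tile(47))
```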
Management Console PC (MCPC)
• SCC and MCPC communicate over a PCIe bus
• MCPC runs Ubuntu 10.04 x64 and software from Intel
• Loads a Linux image on each core
• Reads and modifies SCC registers
• Loads programs on the SCC cores
Power and Thermal Management
• 6 voltage domains
• 24 frequency domains, one for each tile
• 2 temperature sensors on each tile
• Voltage and frequency can be changed separately on each domain
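With 24 tiles and 6 voltage domains, each voltage domain covers 4 tiles. The sketch below shows one way a manager could group sensor readings by domain before choosing a voltage. It assumes each voltage domain is a 2×2 block of tiles in a 6-column by 4-row grid; that grouping, the function names, and the sample readings are assumptions for illustration.

```python
TILE_COLS, TILE_ROWS = 6, 4   # ASSUMPTION: 6 columns by 4 rows
SENSORS_PER_TILE = 2


def frequency_domain(x, y):
    """One frequency domain per tile, numbered row by row."""
    return y * TILE_COLS + x


def voltage_domain(x, y):
    """ASSUMPTION: each voltage domain is a 2x2 block of tiles."""
    return (y // 2) * (TILE_COLS // 2) + (x // 2)


def hottest_per_voltage_domain(readings):
    """Map each voltage domain to its highest sensor reading.

    readings maps a tile coordinate (x, y) to a tuple holding that
    tile's sensor values, in whatever unit the sensors report.
    """
    hottest = {}
    for (x, y), sensors in readings.items():
        domain = voltage_domain(x, y)
        hottest[domain] = max(hottest.get(domain, float("-inf")), max(sensors))
    return hottest


# Made-up readings for three tiles.
sample = {(0, 0): (50, 55), (1, 1): (60, 52), (5, 3): (40, 41)}
print(hottest_per_voltage_domain(sample))
```

Tracking the hottest sensor per voltage domain is a conservative choice: the domain's voltage affects every tile in it, so the warmest tile sets the limit.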