Disk Drive Roadmap from the Thermal Perspective
A Case for Dynamic Thermal Management
Sudhanva Gurumurthi, Anand Sivasubramaniam, Vivek Natarajan
Computer Systems Lab, Pennsylvania State University
Power Demands of Data Centers
“What matters most to the computer designers at Google is not speed but power – low-power – because data centers can consume as much electricity as a city”, Eric Schmidt, CEO, Google
• Data centers consume several megawatts of power
• Electricity bill
– $4 billion/year
– Disks account for 27% of computing-load costs
• Difficult to cool at high power-densities
Sources:
1. “Intel’s Huge Bet Turns Iffy”, New York Times article, September 29, 2002
2. “Power, Heat, and Sledgehammer”, Apr. 2002.
3. “Heat Density Trends in Data Processing, Computer Systems, and Telecommunications Equipment”, 2000.
Data Center Cooling Costs
• Data center of a large financial institution in New York City
– Power consumption ~ 4.8 MW
Source: “Energy Benchmarking and Case Study – NY Data Center No. 2”, Lawrence Berkeley National Lab, July 2003.
[Pie chart: power breakdown among Servers, Air-Conditioning, and Other; slice labels 51%, 42%, and 7%]
Temperature Affects Disk Drive Reliability
• Heat-Related Problems
– Thermal-tilt of disk stack and actuator arms
– Out-gassing of spindle/voice-coil motor lubricants
– Wear-out of bearings
• A hard disk operating 5 C above its normal temperature is 10-15% more likely to fail
Disk drive design is constrained by the thermal-envelope
Source: Hitachi GST Technology Overview Charts, http://www.hitachigst.com/hdd/technolo/overview/storagetechchart.html
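Thermal-Constrained Design
• $\text{Power} \approx (\#\,\text{Platters}) \times (\text{RPM})^{2.8} \times (\text{Diameter})^{4.6}$
• $\text{Data Rate} \approx (\text{Linear-Density}) \times (\text{RPM}) \times (\text{Diameter})$
• Target: 40% annual IDR (internal data rate) growth
[Diagram: trade-offs among Data-Rate, Capacity, and Temperature. Design options shown: Increase RPM, Shrink Platter, 1 platter. Side effects shown: Lower Capacity, Lower Data Rate.]
Can we stay on this roadmap?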
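The two scaling relations on this slide can be explored numerically. The following is a minimal sketch, assuming the relations are treated as pure proportionalities and that every quantity is expressed as a ratio to some baseline drive. The function names are hypothetical and do not come from the original work.

```python
def relative_power(platters_ratio, rpm_ratio, diameter_ratio):
    """Power relative to a baseline: platters * RPM^2.8 * diameter^4.6."""
    return platters_ratio * rpm_ratio ** 2.8 * diameter_ratio ** 4.6


def relative_data_rate(linear_density_ratio, rpm_ratio, diameter_ratio):
    """Data rate relative to a baseline: linear density * RPM * diameter."""
    return linear_density_ratio * rpm_ratio * diameter_ratio


def rpm_ratio_at_equal_power(diameter_ratio, platters_ratio=1.0):
    """RPM ratio that keeps relative power at 1.0 after a geometry change."""
    return (1.0 / (platters_ratio * diameter_ratio ** 4.6)) ** (1.0 / 2.8)


if __name__ == "__main__":
    shrink = 2.1 / 2.6  # shrinking a 2.6" platter to 2.1"
    rpm_up = rpm_ratio_at_equal_power(shrink)
    print("RPM ratio at equal power:", rpm_up)
    print("Data-rate ratio:", relative_data_rate(1.0, rpm_up, shrink))
```

Under these assumptions, a smaller platter leaves power headroom that can be spent on a higher RPM, and the resulting data rate scales with the product of the RPM and diameter ratios.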
Outline
• Introduction
• Modeling
• The Roadmap
• Dynamic Thermal Management
• Conclusions
Modeling
• Baseline input parameters
– Linear-Density (BPI)
– Track-Density (TPI)
• Characteristics Modeled
– Capacity
– Performance
– Temperature
Capacity Model
• $C_{max} = \eta \times n_{surf} \times \pi (r_o^2 - r_i^2)(\text{BPI} \times \text{TPI})$
• Stroke-Efficiency: η < 1
– Spare tracks, recalibration tracks etc.
– Assumed η = 2/3 [CMRR]
• User-accessible capacity needs to be derated due to:
– Zoned-Bit Recording (ZBR)
– Servo Overheads
– ECC Overheads
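The capacity formula above can be evaluated directly. The following is a minimal sketch, assuming radii in inches, BPI in bits per inch, and TPI in tracks per inch, so the result is in bits. It computes only the raw maximum and does not apply the ZBR, servo, or ECC derating. The function names and the sample inputs are hypothetical.

```python
import math


def max_capacity_bits(n_surfaces, r_outer_in, r_inner_in, bpi, tpi,
                      stroke_efficiency=2.0 / 3.0):
    """C_max = eta * n_surf * pi * (r_o^2 - r_i^2) * (BPI * TPI), in bits."""
    recording_area = math.pi * (r_outer_in ** 2 - r_inner_in ** 2)
    return stroke_efficiency * n_surfaces * recording_area * bpi * tpi


def bits_to_gigabytes(bits):
    """Convert bits to decimal gigabytes (10^9 bytes)."""
    return bits / 8.0 / 1e9


if __name__ == "__main__":
    # Illustrative inputs only; these are not parameters of any real drive.
    bits = max_capacity_bits(n_surfaces=2, r_outer_in=1.3, r_inner_in=0.65,
                             bpi=500_000, tpi=60_000)
    print("Raw capacity (GB):", bits_to_gigabytes(bits))
```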
Performance Models
• Parameters Modeled
– IDR
– Seek-time
• IDR
– IDR experienced by outermost zone
• Seek-time
– Uses linear-interpolation based on track-to-track, average, and full-stroke times [Worthington’95]
– Accurate for seeks longer than 10 cylinders
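The seek-time interpolation can be sketched as a piecewise-linear curve through three known points. The slides do not say where the average seek time is anchored, so this sketch assumes it falls at one third of the full stroke, a common approximation for uniformly random seeks. That anchor, the function name, and the sample timings are all assumptions.

```python
def seek_time_ms(distance, n_cylinders, t_track_ms, t_avg_ms, t_full_ms):
    """Piecewise-linear seek time for a seek of `distance` cylinders.

    Knots: (1, track-to-track), (full/3, average), (full, full-stroke).
    The position of the average-seek knot is an assumption.
    """
    if n_cylinders < 5:
        raise ValueError("need at least 5 cylinders for three distinct knots")
    if distance <= 0:
        return 0.0
    full_distance = n_cylinders - 1
    avg_distance = full_distance / 3.0
    if distance <= avg_distance:
        fraction = (distance - 1) / (avg_distance - 1)
        return t_track_ms + (t_avg_ms - t_track_ms) * fraction
    fraction = (distance - avg_distance) / (full_distance - avg_distance)
    return t_avg_ms + (t_full_ms - t_avg_ms) * fraction


if __name__ == "__main__":
    # Illustrative timings only.
    for d in (1, 100, 10_000, 30_000):
        print(d, seek_time_ms(d, 30_001, 0.5, 5.0, 10.0))
```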
Validation
• Compared modeled vs. actual capacity and IDR using 13 disks from 4 different manufacturers from 1999-2002
• Inputs: BPI, TPI, RPM, Platter-size, Number of platters
• Assumed all disks have 30 zones.
Performance Model Validation
Model                   Year  Actual Capacity (GB)  Model Capacity (GB)  Actual IDR (MB/s)  Model IDR (MB/s)
Quantum Atlas 10K       1999  18                    17.6                 39.3               46.5
Seagate Cheetah X15     2000  18                    20.1                 63.5               73.6
IBM Ultrastar 73LZX     2001  36                    34.7                 86.3               85.2
Fujitsu AL-7LE          2001  73                    67.6                 84.1               88.1
Seagate Cheetah 15K.3   2002  73                    74.8                 111.4              114.4
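To make the model-versus-actual comparison easier to read, the relative error can be computed for each drive. The following is a small sketch using only the values listed in the validation table; the helper name `percent_error` is hypothetical.

```python
def percent_error(actual, modeled):
    """Signed error of the modeled value relative to the actual value, in percent."""
    return 100.0 * (modeled - actual) / actual


# (drive, actual capacity GB, model capacity GB, actual IDR MB/s, model IDR MB/s)
ROWS = [
    ("Quantum Atlas 10K", 18, 17.6, 39.3, 46.5),
    ("Seagate Cheetah X15", 18, 20.1, 63.5, 73.6),
    ("IBM Ultrastar 73LZX", 36, 34.7, 86.3, 85.2),
    ("Fujitsu AL-7LE", 73, 67.6, 84.1, 88.1),
    ("Seagate Cheetah 15K.3", 73, 74.8, 111.4, 114.4),
]

if __name__ == "__main__":
    for name, cap, cap_model, idr, idr_model in ROWS:
        print(f"{name}: capacity {percent_error(cap, cap_model):+.1f}%, "
              f"IDR {percent_error(idr, idr_model):+.1f}%")
```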
Source: Hitachi GST Technology Overview Charts, http://www.hitachigst.com/hdd/technolo/overview/storagetechchart.html
Change in BPI and TPI Trends
• Slowdown in BPI
– Difficult to lower fly-height
– Requires higher recording media coercivity
– Smaller grain sizes suffer from superparamagnetic effects
• Slowdown in TPI
– Narrower tracks more susceptible to media noise
– Inter-track interference
– Increase in track edge-effects with narrower tracks
• Bit-Aspect Ratios (BPI/TPI) dropping
– Larger slowdown in BPI
• Long-term areal density growth expected to slow down to 40-50% per year
– 1 Tb/in² disk expected to be available in 2010 [DS2]
Capturing BPI and TPI Trends
• Studied published work on designing Terabit areal-density disks.
• Chose design with most conservative assumptions about BPI
• Scaled BPI and TPI CGRs to achieve 1 Tb/in² areal density in 2010
– BPI CGR = 14%
– TPI CGR = 28%
– Areal-density CGR = 46%
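The areal-density CGR follows from the other two because areal density is the product of BPI and TPI, so the growth factors multiply: 1.14 × 1.28 = 1.4592, or about 46% per year. A minimal sketch of that arithmetic, with hypothetical helper names:

```python
def combined_cgr(*cgrs):
    """Compound growth rate of a product of quantities with the given CGRs."""
    growth = 1.0
    for cgr in cgrs:
        growth *= 1.0 + cgr
    return growth - 1.0


def project(value, cgr, years):
    """Value after compounding at rate `cgr` for `years` years."""
    return value * (1.0 + cgr) ** years


if __name__ == "__main__":
    areal_cgr = combined_cgr(0.14, 0.28)  # BPI and TPI CGRs from the slide
    print(f"Areal-density CGR: {areal_cgr:.2%}")
```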
Thermal Model
• Extension of work by Eibeck et al. at the University of California
• Components Modeled:
– Internal air
– Spindle-assembly
– Arm-assembly
– Drive base and cover
• Drive completely enclosed
• External temperature maintained constant
Modeling the Heat-Transfer
• Newton’s Law of Cooling: $dQ/dt = hA\,\Delta T$
• Internal Air Heat = Heat convected by solid components + viscous dissipation - heat lost through drive cover
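The internal-air balance can be illustrated with a single explicit time step built from Newton's law of cooling. This is a heavily simplified sketch and not the multi-component model described in these slides: it assumes the solid components sit at one fixed temperature, lumps each heat path into a single hA product, and uses made-up parameter values. All names are hypothetical.

```python
def air_temperature_step(t_air, t_solid, t_ext, ha_solid, ha_cover,
                         p_viscous, heat_capacity, dt):
    """Advance the internal-air temperature by one explicit Euler step.

    Heat in:  convection from solids, ha_solid * (t_solid - t_air),
              plus viscous dissipation, p_viscous.
    Heat out: loss through the drive cover, ha_cover * (t_air - t_ext).
    Units: temperatures in C, ha_* in W/K, power in W, capacity in J/K, dt in s.
    """
    q_from_solids = ha_solid * (t_solid - t_air)
    q_through_cover = ha_cover * (t_air - t_ext)
    net_heat_rate = q_from_solids + p_viscous - q_through_cover
    return t_air + dt * net_heat_rate / heat_capacity


def simulate_air_temperature(steps, t_air=28.0, **params):
    """Repeat the step `steps` times and return the final air temperature."""
    for _ in range(steps):
        t_air = air_temperature_step(t_air, **params)
    return t_air


if __name__ == "__main__":
    # Illustrative parameters only; 28 C matches the external temperature
    # the slides assume for the single-platter configuration.
    params = dict(t_solid=50.0, t_ext=28.0, ha_solid=0.4, ha_cover=0.6,
                  p_viscous=2.0, heat_capacity=5.0, dt=0.1)
    print("Air temperature after 1000 s:", simulate_air_temperature(10_000, **params))
```

With the solids held fixed, the air settles where the three terms cancel, at (ha_solid·t_solid + p_viscous + ha_cover·t_ext) / (ha_solid + ha_cover).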
Drive Parameters
• Materials
– Proprietary data
– Assumed platters, arms, and spindle-hub composed of Aluminum
• Geometry
– Modeling and measurement
• Voice-coil motor (VCM) power
– Used published data from IBM [Sri-Jayantha’95]
• External air temperature
– Assumed 28 C for single-platter configuration
The Thermal-Envelope
[Figure: Temperature (C), axis from 28 to 48, versus Time (Mins.), from 1 to 50, with the Thermal Envelope marked]
Outline
• Introduction
• Modeling
• Formulating a Disk-Drive Roadmap
• The Roadmap
• Dynamic Thermal Management
• Conclusions
Drive RPM
[Figure: RPM, axis from 0 to 250,000, versus Year for 2.6", 2.1", and 1.6" platters. Annotations: BPI CGR = 30%, TPI CGR = 50%; BPI CGR = 14%, TPI CGR = 28%; Areal Density ≥ 1 Tb/in²]
Drive Temperature
[Figure: Temperature (C), axis ticks at 10, 100, and 1000, versus Year for 2.6", 2.1", and 1.6" platters, with the Thermal-Envelope marked]
Outline
• Introduction
• Modeling
• Formulating a Disk-Drive Roadmap
• The Roadmap
• Dynamic Thermal Management
• Conclusions
Dynamic Thermal Management (DTM)
• Goal: boost performance while still working within the thermal-envelope through dynamic activity-control
• How much do higher RPMs benefit application I/O performance?
Applications Studied
• Five commercial I/O traces
– Openmail (HP Labs)
– OLTP Application (UMass Repository)
– Web Search-Engine (UMass Repository)
– TPC-C (Penn State)
– TPC-H (IBM Research)
• Attempted to re-create, in DiskSim, the disk-system on which each trace was collected
30-60% Performance Boost for 10,000 RPM Increase
Search-Engine - Thermal Behavior
Thermal Envelope = 45.22 C
DTM Solution 1: Exploiting Thermal Slack
[Diagram: Temperature versus Time, with labels Thermal-Envelope, SPM+VCM On, VCM Off, RPM, and Thermal Slack]
Thermal Slack
[Figure: RPM, axis from 0 to 60,000, versus Platter-Diameter (2.6, 2.1, and 1.6 inches); series: Envelope-Design and VCM Off]
DTM Solution 2: Activity Throttling
• Thermal-design assuming an average-case operation
• Basic idea
– Disk services requests at its peak-performance configuration
– Throttle disk activities if thermal-envelope may be exceeded
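The basic idea can be sketched as a threshold decision with hysteresis. The slides do not describe the actual trigger policy, so everything here is an assumption: the guard band below the envelope, the resume margin, and the function name are hypothetical, chosen only to show throttling starting before the envelope is reached and ending once the drive has cooled.

```python
def should_throttle(temp_c, envelope_c, currently_throttled,
                    guard_c=0.5, hysteresis_c=1.0):
    """Decide whether disk activity should be throttled at this temperature.

    Throttling starts once the temperature comes within `guard_c` of the
    envelope, and continues until it drops a further `hysteresis_c` below
    that trigger point. Both margins are assumed values.
    """
    throttle_at = envelope_c - guard_c
    resume_at = throttle_at - hysteresis_c
    if temp_c >= throttle_at:
        return True
    if currently_throttled and temp_c > resume_at:
        return True
    return False


if __name__ == "__main__":
    envelope = 45.22  # thermal envelope quoted on the Search-Engine slide
    throttled = False
    for temp in (43.0, 44.8, 44.0, 43.5):
        throttled = should_throttle(temp, envelope, throttled)
        print(temp, "throttle" if throttled else "full speed")
```

The hysteresis keeps the drive from toggling between the two modes on every small temperature change.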
Approach 1: Seek Throttling
[Diagram: Temperature versus Time relative to the Thermal-Envelope, with VCM On and VCM Off phases]
Approach 2: (Seek+RPM) Throttling
[Diagram: Temperature versus Time relative to the Thermal-Envelope, with VCM On, VCM Off, and VCM Off + Low RPM phases]
Throttling-Ratio
[Figure, panel 1: Seek Throttling, 2.6", 24,534 RPM (2.6", 40% IDR growth to 2005). Throttling-Ratio, axis from 0 to 2, versus t_cool (0.5, 1, 2, 4, 6, 8 secs)]
[Figure, panel 2: (Seek+RPM) Throttling, 2.6", 37,001 RPM (2.6", 40% IDR growth to 2007). Throttling-Ratio, axis from 0 to 1.8, versus t_cool (0.5, 1, 2, 4, 6, 8 secs)]
• t_cool – Disk undergoing throttling
• t_heat – Disk operating at maximal performance configuration
• Throttling-Ratio = (t_heat/t_cool)
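To show how a throttling ratio could arise from a thermal model, the sketch below uses a first-order (single time constant) exponential model. This is an assumption made for illustration and not the model behind the plotted results: the drive cools from the envelope toward a throttled steady-state temperature for the cooling period, then heats toward a full-activity steady state until it reaches the envelope again. All names and sample values are hypothetical.

```python
import math


def cooled_temperature(envelope_c, cool_steady_c, t_cool_s, tau_s):
    """Temperature after throttling for t_cool_s seconds, starting at the envelope."""
    return cool_steady_c + (envelope_c - cool_steady_c) * math.exp(-t_cool_s / tau_s)


def heat_time(envelope_c, hot_steady_c, start_c, tau_s):
    """Seconds at full activity needed to climb from start_c back to the envelope."""
    return tau_s * math.log((hot_steady_c - start_c) / (hot_steady_c - envelope_c))


def throttling_ratio(envelope_c, hot_steady_c, cool_steady_c, t_cool_s, tau_s):
    """t_heat / t_cool under the first-order model (an assumed model)."""
    if not cool_steady_c < envelope_c < hot_steady_c:
        raise ValueError("envelope must lie between the two steady-state temperatures")
    low_c = cooled_temperature(envelope_c, cool_steady_c, t_cool_s, tau_s)
    return heat_time(envelope_c, hot_steady_c, low_c, tau_s) / t_cool_s


if __name__ == "__main__":
    # Illustrative temperatures and time constant only.
    for t_cool in (0.5, 1, 2, 4, 6, 8):
        print(t_cool, throttling_ratio(45.22, 47.0, 44.0, t_cool, 300.0))
```

A ratio above 1 means the disk spends more time at full performance than it spends throttled.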
Summary
• Need aggressive RPM increases to sustain IDR growth
– Scaling BPI and TPI more difficult
– Lower Signal-to-Noise ratios at higher densities increase ECC overheads
• IDR growth would be affected by heat dissipation
– 40% growth rate cannot be maintained beyond 2007 even for 1.6” platter-size
– Expected to slow down to 14%
• Possible to buy back performance with Dynamic Thermal Management (DTM).