Exploring the Potential of CMP Core Count Management on Data Center Energy Savings
Ozlem Bilgir*, Margaret Martonosi*, Qiang Wu†
*Princeton University †Facebook, Inc.
Mar 27, 2015
Data Center Inefficiencies
Low Utilization
• Example: 1-week CPU utilization for Facebook servers
• CPU utilization stays below 40% most of the time
High Idle Power
• Idle power can be as much as 60% of max power
• Waste of power
[Figure: CPU Utilization (%) of Facebook servers over one week]
Previous Work on Data Center Power Management
• Server Consolidation [Pinheiro et al. 2003, Chen et al. 2008, etc.]
• Dynamic Voltage Frequency Scaling (DVFS) [Chen et al. 2005, Bertini et al. 2010, etc.]
• Power-gating in CMPs [Leverich et al. 2009, Madan et al. 2010, Kumar et al. 2009]
• Server reboots are undesirable because of
o frequent code pushes, the importance of robustness, high boot latencies, etc.
• DVFS leverage is decreasing because
o operating voltages are decreasing and leakage power is increasingly prominent
• This work reduces core power with power-gating
Our Envisioned System
• Abstraction layer between latency and core count
[Figure: Front End Device; N1, N2, N3, N4; Server 1, Server 2, Server 3, Server 4]
• Assume 1 core in each server is ON
• Service times follow an exponential distribution
• ta is the arrival time of a request and td is its departure time
• Latency = td − ta = Queuing Time + Service Time
Set of Research Questions:
• How many cores?
• Which cores?
• How are the answers affected by boot time, bursts, etc.?
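The latency definition above (Latency = td − ta = Queuing Time + Service Time) can be illustrated with a minimal single-core queue simulation. This is only a sketch under stated assumptions: the function names, the FIFO discipline, the exponential inter-arrival times, and the definition of load rate as a fraction of one core's capacity are illustrative choices, not the authors' simulator.

```python
import random


def simulate_single_core(load_rate, base_service_time=0.1,
                         n_requests=10000, seed=0):
    """Simulate one FIFO core and return per-request latencies in seconds.

    Assumption: load_rate is the offered load as a fraction of the core's
    capacity (0 < load_rate < 1), so the mean inter-arrival time is
    base_service_time / load_rate.
    """
    rng = random.Random(seed)
    mean_interarrival = base_service_time / load_rate
    t_arrival = 0.0      # ta of the current request
    core_free_at = 0.0   # time at which the core finishes its current request
    latencies = []
    for _ in range(n_requests):
        t_arrival += rng.expovariate(1.0 / mean_interarrival)
        service = rng.expovariate(1.0 / base_service_time)
        start = max(t_arrival, core_free_at)   # wait if the core is busy
        t_departure = start + service          # td of the current request
        core_free_at = t_departure
        latencies.append(t_departure - t_arrival)  # queuing + service time
    return latencies


def percentile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using a simple rank rule."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    for load in (0.05, 0.40, 0.85):
        lat = simulate_single_core(load)
        print(f"load {load:.0%}: 75th percentile latency "
              f"{percentile(lat, 0.75) * 1000:.0f} msec")
```

Running the script prints the 75th percentile latency at three load rates, which shows how queuing time comes to dominate latency as a single ON core approaches saturation.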
Our Approach: Multi-Server CMP Core Count Management
• Consolidate load onto only a subset of cores in multi-server CMP systems
• Put the other cores in a low-power state
• Decide which cores to keep ON at a global level
• Constraint: the latency requirement is satisfied
Outline
• Motivation
• System and Research Overview
• Decision of ON Core Count
• Core Count Management Alternatives
• Performance and Power Models
• Methodology
• Results
• Conclusion
Decision of ON Core Count
• Look-up table based ON core count decision maker
• Gives the total number of ON cores needed for a given latency goal and an observed load rate
• The ON core count of the servers changes every period, T
• Created beforehand by observing the latency obtained at different load rates and ON core counts
75% Latency Goal    Core Count
                    1      2      3      ...    16
150msec             0%     4%     9%     ...    86%
200msec             1%     7%     13%    ...    97%
250msec             3%     9%     14%    ...    100%
10%
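A look-up of this kind can be sketched as follows. The sketch assumes that each table entry is the highest load rate at which the given ON core count still meets the 75% latency goal; that reading, the `LOOKUP` dictionary, and the `required_on_cores` helper are illustrative assumptions. The columns for 4 to 15 cores are elided in the table, so they are left out here rather than guessed.

```python
# Highest load rate (%) assumed sustainable for each ON core count,
# keyed by the 75% latency goal in msec. Only the columns shown in the
# table are included; the elided columns (4..15 cores) are not filled in.
LOOKUP = {
    150: {1: 0, 2: 4, 3: 9, 16: 86},
    200: {1: 1, 2: 7, 3: 13, 16: 97},
    250: {1: 3, 2: 9, 3: 14, 16: 100},
}


def required_on_cores(latency_goal_ms, load_rate_pct):
    """Return the smallest listed ON core count that covers the load.

    Returns None when no listed core count can sustain the load rate
    under the given latency goal.
    """
    row = LOOKUP[latency_goal_ms]
    for cores in sorted(row):
        if load_rate_pct <= row[cores]:
            return cores
    return None


if __name__ == "__main__":
    # With a 250 msec goal and a 10% observed load rate, 2 cores (9%)
    # are not enough under this reading, so 3 cores (14%) are selected.
    print(required_on_cores(250, 10))
```

Because the intermediate columns are missing, any load rate above the 3-core entry falls through to 16 cores in this sketch; a complete table would return a smaller count.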
Core Count Management Alternatives
• Round-Robin Scheme
• Same Server Scheme
• Chip Turn On/Off Scheme
- Resource contention effect is low
- Opportunity to dedicate empty servers to other applications
- Better performance under bursts
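One way to picture the three schemes is as different ways of spreading a required ON core count over the servers. The sketch below is an interpretation, not the authors' definition: it assumes Round-Robin spreads ON cores evenly across servers, Same Server fills one server before moving to the next, and Chip Turn On/Off switches whole chips, rounding the requirement up to full servers. All function names are hypothetical.

```python
def _check(on_cores, servers, cores_per_server):
    """Reject core counts that the system cannot provide."""
    if not 0 <= on_cores <= servers * cores_per_server:
        raise ValueError("on_cores is outside the available range")


def round_robin(on_cores, servers=4, cores_per_server=4):
    """Assumed Round-Robin: hand out ON cores one server at a time."""
    _check(on_cores, servers, cores_per_server)
    counts = [0] * servers
    for i in range(on_cores):
        counts[i % servers] += 1
    return counts


def same_server(on_cores, servers=4, cores_per_server=4):
    """Assumed Same Server: fill a server completely before the next one."""
    _check(on_cores, servers, cores_per_server)
    counts = []
    remaining = on_cores
    for _ in range(servers):
        take = min(cores_per_server, remaining)
        counts.append(take)
        remaining -= take
    return counts


def chip_turn_on_off(on_cores, servers=4, cores_per_server=4):
    """Assumed Chip Turn On/Off: only whole chips are switched ON or OFF."""
    _check(on_cores, servers, cores_per_server)
    chips_on = -(-on_cores // cores_per_server)  # ceiling division
    return [cores_per_server] * chips_on + [0] * (servers - chips_on)


if __name__ == "__main__":
    for scheme in (round_robin, same_server, chip_turn_on_off):
        print(scheme.__name__, scheme(5))
```

For 5 required cores on 4 servers of 4 cores each, the sketch yields per-server ON counts of [2, 1, 1, 1], [4, 1, 0, 0], and [4, 4, 0, 0] respectively.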
Power and Performance Models
Power
$P_{total} = P_{nonCores} + P_{cores}$
$P_{cores} = P_{busyCores} + P_{idleCores}$
Performance
• Per-core service time is affected by contention
• $C_{ratio}$ is the contention ratio
$Service\_time = Service\_time_{base} \times (1 + N_{busy} \times C_{ratio})$
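The two models can be written out as small functions. This is a sketch under assumptions: it reads the service time formula as base service time × (1 + N_busy × C_ratio), it treats idle core power as a fraction of busy core power, and it uses normalized power units. The function and parameter names are illustrative.

```python
def service_time(base_service_time, n_busy, c_ratio):
    """Per-core service time inflated by contention from busy cores.

    Assumed form: base_service_time * (1 + n_busy * c_ratio).
    """
    return base_service_time * (1 + n_busy * c_ratio)


def cores_power(n_busy, n_idle, p_busy_core=1.0, idle_fraction=0.4):
    """P_cores = P_busyCores + P_idleCores.

    Assumption: an idle core draws idle_fraction of a busy core's power.
    """
    p_busy_cores = n_busy * p_busy_core
    p_idle_cores = n_idle * p_busy_core * idle_fraction
    return p_busy_cores + p_idle_cores


def total_power(p_non_cores, n_busy, n_idle,
                p_busy_core=1.0, idle_fraction=0.4):
    """P_total = P_nonCores + P_cores."""
    return p_non_cores + cores_power(n_busy, n_idle,
                                     p_busy_core, idle_fraction)


if __name__ == "__main__":
    # 100 msec base service time, 4 busy cores, 15% contention ratio.
    print(service_time(100, 4, 0.15))
    # 2 busy and 2 idle cores plus 1.0 unit of non-core power.
    print(total_power(1.0, 2, 2))
```

With a contention ratio of 0 the service time equals the base service time regardless of how many cores are busy.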
Outline
• Motivation
• System and Research Overview
• Decision of ON Core Count
• Core Count Management Alternatives
• Performance and Power Models
• Methodology
o Workload
o Parameters
• Results
• Conclusion
Workload
• Real data from Facebook
• Stochastic data
o Exponential distribution with stable means at 5%, 40% and 85%
[Figure: workload over Time]
All Parameters
Parameter                          Value
Server Count                       4
Core Count per Server              4
Base Service Time                  100msec
Service Time Distribution          Exponential Distribution
Inter-arrival Time Distribution    Exponential Distribution
Control Period                     10min
Contention Ratio                   0%, 15%
Idle Core Power                    40%
75% Latency Goal                   250msec, 150msec
Results: Effect of Load Rate on Energy
• Energy savings are greater at low load rates
o For 5% load, 80% savings
o For Facebook load, 35% savings
• At high load rates, most cores are ON, hence the savings are smaller
• Same-Server (SS) and Round-Robin (RR) behave the same because Cratio = 0
Results: Effect of Load Rate on Latency
• All schemes satisfy the latency goal
• Obtained latencies are much better than the latency goal
o With fewer cores, the obtained latency would exceed the latency goal
Results: Effect of Contention on Energy
• Energy consumption is affected by contention more in SS
o Increase in Same-Server is 30%
o Increase in Round-Robin is 13%
Results: Effect of Latency Goal on Energy
• Tighter latency goal -> more cores ON
• At 150msec,
o SS energy consumption increases by 10%
o Chip Turn On/Off (CTO) energy consumption increases by 6%
Conclusion
• Using stochastic simulation and a real workload, we explored a range of core count management issues in CMPs
• Our CMP core count control mechanism connects high-level information (latency) to low-level power management (core count)
• 35% of core energy can be saved with the Facebook workload