Exploring the Potential of CMP Core Count Management on Data Center Energy Savings

Ozlem Bilgir*, Margaret Martonosi*, Qiang Wu†

*Princeton University †Facebook, Inc.

Data Center Inefficiencies

Low Utilization
• Example: 1-week CPU utilization for Facebook servers
• CPU utilizations stay below 40% most of the time

High Idle Power
• Idle power can be as much as 60% of max power
• Waste of power

[Figure: CPU Utilization (%) of Facebook servers over one week]

Previous Work on Data Center Power Management

Server Consolidation [Pinheiro et al. 2003, Chen et al. 2008, etc.]

Dynamic Voltage Frequency Scaling [Chen et al. 2005, Bertini et al. 2010, etc.]

Power-gating in CMPs [Leverich et al. 2009, Madan et al. 2010, Kumar et al. 2009]

Server reboots are not very desirable because of

o Frequent code pushes, importance of robustness, high boot latencies, etc.

DVFS leverage is decreasing because

o Operating voltages are decreasing, prominence of leakage power

Reducing core power with power-gating will be used in this work

Our Envisioned System

Abstraction layer between latency and core count

[Figure: a Front End Device distributes requests across Server 1 to Server 4, labeled N1 to N4. Example shown: assume 1 core in each server is ON. A request is marked ta on arrival and td on departure.]

• Service times vary with exponential distribution
• Latency = td – ta
• Latency = Queuing Time + Service Time

Set of Research Questions:
• How many cores?
• Which cores?
• How are answers affected by boot time, bursts, etc.?
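The latency definition above can be illustrated with a small queue simulation. This is a minimal sketch, not the simulator used in this work: it assumes a single first-come-first-served queue in front of all ON cores, and the function names (`simulate_latencies`, `percentile`) and all parameter values are hypothetical.

```python
import heapq
import random

def simulate_latencies(n_cores, arrival_rate, mean_service, n_requests, seed=0):
    """Return per-request latencies (td - ta) for one FIFO queue served by n_cores."""
    rng = random.Random(seed)
    # Each heap entry is the time at which one ON core next becomes free.
    free_at = [0.0] * n_cores
    heapq.heapify(free_at)
    ta = 0.0
    latencies = []
    for _ in range(n_requests):
        ta += rng.expovariate(arrival_rate)            # exponential inter-arrival time
        start = max(ta, heapq.heappop(free_at))        # wait for the earliest free core
        service = rng.expovariate(1.0 / mean_service)  # exponential service time
        td = start + service                           # departure time
        heapq.heappush(free_at, td)
        latencies.append(td - ta)                      # queuing time + service time
    return latencies

def percentile(values, q):
    """Simple percentile: the value at fraction q of the sorted list."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]

# Example: 4 ON cores, 20 requests/sec, 100 msec mean service time (assumed values).
lat = simulate_latencies(n_cores=4, arrival_rate=20.0, mean_service=0.1, n_requests=10000)
p75 = percentile(lat, 0.75)
```

Running the same load with fewer ON cores increases the queuing part of the latency, which is the trade-off behind the "How many cores?" question.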

Our Approach: Multi-Server CMP Core Count Management

• Consolidate load to only a subset of cores in multi-server CMP systems
• Put the other cores in low power state
• Decide which cores to keep ON at a global level
• Constraint: Latency requirement is satisfied

Outline

• Motivation
• System and Research Overview
• Decision of ON Core Count
• Core Count Management Alternatives
• Performance and Power Models
• Methodology
• Results
• Conclusion

Decision of ON Core Count

• Look-up table based ON core count decision maker
• Gives total necessary number of ON cores for a given latency goal and an observed load rate
• ON core count of servers changes at every period, T
• Created beforehand by observing obtained latency at different load rates and ON core counts

Look-up table (entries are load rates):

                     Core Count
75% Latency Goal     1      2      3      . . .    16
150msec              0%     4%     9%     …        86%
200msec              1%     7%     13%    …        97%
250msec              3%     9%     14%    …        100%

Example: observed load rate of 10%
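The look-up step can be sketched as follows. This is a hedged illustration that reads each table entry as the highest load rate a given ON core count can sustain under the latency goal, which is an interpretation of the slide. The names `LOOKUP` and `on_core_count` are hypothetical, and only the columns visible on the slide are included; the elided columns are left out rather than guessed.

```python
# Load rate (%) per ON core count, copied from the slide's table.
# Columns between 3 and 16 are elided on the slide, so they are omitted here too.
LOOKUP = {
    150: {1: 0, 2: 4, 3: 9, 16: 86},
    200: {1: 1, 2: 7, 3: 13, 16: 97},
    250: {1: 3, 2: 9, 3: 14, 16: 100},
}

def on_core_count(latency_goal_msec, load_rate_pct, table=LOOKUP):
    """Smallest listed ON core count whose table entry covers the observed load."""
    row = table[latency_goal_msec]
    for cores in sorted(row):
        if load_rate_pct <= row[cores]:
            return cores
    return None  # the load exceeds every listed entry for this latency goal

print(on_core_count(250, 10))  # prints 3
```

Because the middle columns are missing, any load between the 3-core and 16-core entries falls through to 16 in this sketch; a complete table would return a smaller count.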

Core Count Management Alternatives

• Round-Robin Scheme
  − Resource contention effect is low
• Same Server Scheme
  − Opportunity to dedicate empty servers to other applications
• Chip Turn On/Off Scheme
  − Better performance under bursts

[Figure: order in which cores are turned ON across the servers under each scheme]
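The three schemes differ in how a total ON core count is spread over the servers. The sketch below is one possible reading, not the authors' implementation: it assumes Round-Robin spreads ON cores evenly, Same Server fills one server before using the next, and Chip Turn On/Off switches whole chips (so the ON count is rounded up to a multiple of the chip size). The function name `allocate` and the scheme labels are hypothetical.

```python
def allocate(n_on, scheme, servers=4, cores_per_server=4):
    """Return the number of ON cores on each server for a total of n_on cores."""
    total = servers * cores_per_server
    if not 0 <= n_on <= total:
        raise ValueError("n_on must be between 0 and the total core count")
    if scheme == "round_robin":
        # Spread ON cores across the servers one at a time.
        base, extra = divmod(n_on, servers)
        return [base + (1 if i < extra else 0) for i in range(servers)]
    if scheme == "chip_turn_on_off":
        # Assumption: whole chips are switched, so round up to a chip multiple.
        n_on = -(-n_on // cores_per_server) * cores_per_server
    elif scheme != "same_server":
        raise ValueError("unknown scheme: " + scheme)
    # Fill one server completely before using the next one.
    counts = []
    for _ in range(servers):
        take = min(cores_per_server, n_on)
        counts.append(take)
        n_on -= take
    return counts

print(allocate(5, "round_robin"))       # prints [2, 1, 1, 1]
print(allocate(5, "same_server"))       # prints [4, 1, 0, 0]
print(allocate(5, "chip_turn_on_off"))  # prints [4, 4, 0, 0]
```

With 5 cores requested, Round-Robin keeps at most 2 busy cores per chip, Same Server leaves two servers empty, and Chip Turn On/Off ends up with 3 spare ON cores that can absorb a burst.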

Power and Performance Models

Power
• P_total = P_nonCores + P_cores
• P_cores = P_busyCores + P_idleCores

Performance
• Per-core service time is affected by contention
• Service_time = Service_time_base × (1 + N_busy × C_ratio)
• C_ratio is the contention ratio
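The two models can be written as small functions. This is a minimal sketch under stated assumptions: it reads the service-time relation as base service time × (1 + N_busy × C_ratio), which is one interpretation of the slide, and it expresses idle core power as a fraction of busy core power (the 0.4 default is taken from the 40% idle core power listed among the parameters). Function and argument names are hypothetical.

```python
def service_time(base_msec, n_busy, c_ratio):
    """Per-core service time inflated by contention from busy cores (assumed form)."""
    return base_msec * (1 + n_busy * c_ratio)

def total_power(p_non_cores, n_busy, n_idle, p_busy_core=1.0, idle_fraction=0.4):
    """P_total = P_nonCores + P_cores, with P_cores = P_busyCores + P_idleCores."""
    p_busy_cores = n_busy * p_busy_core
    # Assumption: an idle ON core draws idle_fraction of a busy core's power.
    p_idle_cores = n_idle * idle_fraction * p_busy_core
    p_cores = p_busy_cores + p_idle_cores
    return p_non_cores + p_cores

# With no contention (C_ratio = 0) the service time stays at its base value.
no_contention = service_time(100, n_busy=2, c_ratio=0.0)
with_contention = service_time(100, n_busy=2, c_ratio=0.15)
```

Power is in units of one busy core here; cores that are turned OFF are simply not counted, which assumes power-gated cores draw nothing.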

Outline

• Motivation
• System and Research Overview
• Decision of ON Core Count
• Core Count Management Alternatives
• Performance and Power Models
• Methodology
  o Workload
  o Parameters
• Results
• Conclusion

Workload

Real data from Facebook

Stochastic Data
• Exponential distribution with stable means at 5%, 40% and 85%

[Figure: workload traces over time]

All Parameters

Parameter                          Value
Server Count                       4
Core Count per Server              4
Base Service Time                  100msec
Service Time Distribution          Exponential Distribution
Inter-arrival Time Distribution    Exponential Distribution
Control Period                     10min
Contention Ratio                   0%, 15%
Idle Core Power                    40%
75% Latency Goal                   250msec, 150msec

Results: Effect of Load Rate on Energy

Energy savings are greater at low load rates
• For 5% load, 80% savings
• For Facebook load, 35% savings

At high load rates, most cores are ON, hence less savings

Same-Server (SS) and Round-Robin (RR) behave the same because Cratio = 0
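A back-of-the-envelope estimate shows why low load rates leave so much room for savings. This sketch is not the evaluation method used for these results: it assumes a steady load, 16 cores in total, idle ON cores drawing 40% of busy core power, OFF cores drawing nothing, and 2 ON cores at 5% load (the look-up table's 250msec row lists 9% under 2 cores). The function names are hypothetical.

```python
def core_power(n_on, busy_core_equivalents, idle_fraction=0.4):
    """Core power in units of one busy core; OFF cores are assumed to draw nothing."""
    idle = n_on - busy_core_equivalents
    return busy_core_equivalents + idle_fraction * idle

def core_energy_savings(load_rate, n_on, total_cores=16, idle_fraction=0.4):
    """Fraction of core power saved versus keeping all cores ON at the same load."""
    busy = load_rate * total_cores  # average number of busy cores
    baseline = core_power(total_cores, busy, idle_fraction)
    managed = core_power(n_on, busy, idle_fraction)
    return 1 - managed / baseline

savings = core_energy_savings(load_rate=0.05, n_on=2)
```

Under these assumptions the estimate comes out at roughly 0.81, in the same range as the 80% reported for the 5% load, although the agreement depends on the simplifications listed above.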

Results: Effect of Load Rate on Latency

All schemes satisfy the latency goal

Obtained latencies are much better than the latency goal
• With fewer cores, obtained latency would exceed the latency goal

Results: Effect of Contention on Energy

Energy consumption is affected by contention more in Same-Server (SS)

• Increase in Same-Server is 30%

• Increase in Round-Robin is 13%

Results: Effect of Latency Goal on Energy

Tighter latency goal -> More cores ON

At 150msec,
• Same-Server (SS) energy consumption increases by 10%
• Chip Turn On/Off (CTO) energy consumption increases by 6%

Conclusion

Using stochastic simulation and a real workload, we explored a range of core count management issues in CMPs

Our CMP core count control mechanism connects high-level information (latency) to low-level power management (core count)

35% of core energy can be saved with the Facebook workload
