Exploring the Potential of CMP Core Count Management on Data Center Energy Savings Ozlem Bilgir * Margaret Martonosi * Qiang Wu * Princeton University Facebook, Inc.


Mar 27, 2015


Brianna MacKay
Transcript
Page 1:

Exploring the Potential of CMP Core Count Management

on Data Center Energy Savings

Ozlem Bilgir*

Margaret Martonosi*

Qiang Wu†

*Princeton University †Facebook, Inc.

Page 2:

Data Center Inefficiencies

Low Utilization
• Example: 1-week CPU utilization for Facebook servers
• CPU utilization stays below 40% most of the time

High Idle Power
• Idle power can be as much as 60% of max power
• Waste of power

[Figure: 1-week CPU utilization plot; y-axis: CPU Utilization (%)]

Page 3:

Previous Work on Data Center Power Management

• Server consolidation [Pinheiro et al. 2003, Chen et al. 2008, etc.]
• Dynamic voltage and frequency scaling (DVFS) [Chen et al. 2005, Bertini et al. 2010, etc.]
• Power-gating in CMPs [Leverich et al. 2009, Madan et al. 2010, Kumar et al. 2009]

Server reboots are undesirable because of
o Frequent code pushes, importance of robustness, high boot latencies, etc.

DVFS leverage is decreasing because
o Operating voltages are decreasing and leakage power is becoming more prominent

This work reduces core power with power-gating
o Abstraction layer between latency and core count

Page 4:

Our Envisioned System

[Figure: Front End Device connected to Server 1, Server 2, Server 3, and Server 4 (labels N1, N2, N3, N4); time markers ta and td]

• Assume 1 core in each server is ON
• Service times vary with exponential distribution
• Latency = td - ta
• Latency = Queuing Time + Service Time

Set of Research Questions:
• How many cores?
• Which cores?
• How are answers affected by boot time, bursts, etc.?
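
The two latency relations on this slide can be illustrated with a small simulation. The sketch below is not the authors' simulator; it assumes a single ON core serving one FIFO queue, reads ta as a request's arrival time and td as its departure time, and draws both inter-arrival and service times from exponential distributions. The function name and parameters are hypothetical.

```python
import random

def simulate_single_core(arrival_rate, mean_service, n_requests, seed=0):
    """Simulate one ON core with a FIFO queue (illustrative assumption).

    Returns a list of (queuing_time, service_time, latency) tuples,
    where latency is measured as td - ta.
    """
    rng = random.Random(seed)
    t_arrive = 0.0       # ta of the current request
    core_free_at = 0.0   # time at which the core finishes its current request
    samples = []
    for _ in range(n_requests):
        t_arrive += rng.expovariate(arrival_rate)      # exponential inter-arrival
        service = rng.expovariate(1.0 / mean_service)  # exponential service time
        start = max(t_arrive, core_free_at)            # wait if the core is busy
        queuing = start - t_arrive
        t_depart = start + service                     # td
        core_free_at = t_depart
        samples.append((queuing, service, t_depart - t_arrive))
    return samples
```

Calling `simulate_single_core(arrival_rate=5.0, mean_service=0.1, n_requests=1000)` yields samples in which each latency equals queuing time plus service time, up to floating-point rounding.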

Page 5:

Our Approach: Multi-Server CMP Core Count Management

• Consolidate load to only a subset of cores in multi-server CMP systems
• Put the other cores in low power state
• Decide which cores to keep ON at a global level
• Constraint: Latency requirement is satisfied

Page 6:

Outline

• Motivation
• System and Research Overview
• Decision of ON Core Count
• Core Count Management Alternatives
• Performance and Power Models
• Methodology
• Results
• Conclusion

Page 7:

Decision of ON Core Count

• Look-up table based ON core count decision maker
• Gives total necessary number of ON cores for a given latency goal and an observed load rate
• ON core count of servers changes at every period, T
• Created beforehand by observing obtained latency at different load rates and ON core counts

                     Core Count
75% Latency Goal     1     2     3     . . .   16
150msec              0%    4%    9%    …       86%
200msec              1%    7%    13%   …       97%
250msec              3%    9%    14%   …       100%

10%
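
A look-up of this kind might be sketched as follows. This is an illustration under assumptions, not the authors' decision maker: it assumes each cell is the highest load rate at which that many ON cores still meet the latency goal, and it uses the 10% value on the slide as an example observed load rate. Only the columns visible on the slide are included; the elided columns are not filled in, so for loads above the 3-core entry the sketch falls back to 16 cores. `LOOKUP` and `on_core_count` are hypothetical names.

```python
# Latency goal (msec) -> {ON core count: load rate (%)}, copied from the slide.
# Columns between 3 and 16 are elided on the slide and deliberately omitted.
LOOKUP = {
    150: {1: 0, 2: 4, 3: 9, 16: 86},
    200: {1: 1, 2: 7, 3: 13, 16: 97},
    250: {1: 3, 2: 9, 3: 14, 16: 100},
}

def on_core_count(latency_goal_ms, load_pct, table=LOOKUP):
    """Return the smallest listed core count whose entry covers load_pct.

    Assumption: each entry is the highest load rate that the given number
    of ON cores can serve within the latency goal. Returns None when no
    listed core count is sufficient.
    """
    row = table[latency_goal_ms]
    for cores in sorted(row):
        if load_pct <= row[cores]:
            return cores
    return None
```

Under these assumptions, `on_core_count(200, 10)` returns 3, since 10% exceeds the 2-core entry (7%) but not the 3-core entry (13%).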

Page 8:

Core Count Management Alternatives

• Round-Robin Scheme
− Resource contention effect is low

• Same Server Scheme
− Opportunity to dedicate empty servers to other applications

• Chip Turn On/Off Scheme
− Better performance under bursts

[Figure: diagrams of the three schemes with numbered cores]
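
One way to picture the three alternatives is as different mappings from a total ON core count to a per-server ON core count. The sketch below is an interpretation, not the authors' implementation: it assumes Round-Robin spreads ON cores evenly across servers, Same Server fills one server before using the next, and Chip Turn On/Off switches whole chips, rounding the requirement up to a full chip. The function and scheme names are hypothetical.

```python
def allocate_on_cores(total_on, scheme, servers=4, cores_per_server=4):
    """Return the number of ON cores per server as a list (illustrative)."""
    capacity = servers * cores_per_server
    if not 0 <= total_on <= capacity:
        raise ValueError("total_on must be between 0 and the total core count")
    if scheme == "round_robin":
        # Assumed: spread cores evenly; the first `extra` servers get one more.
        base, extra = divmod(total_on, servers)
        return [base + (1 if i < extra else 0) for i in range(servers)]
    if scheme == "same_server":
        # Assumed: fill a server completely before turning on cores in the next.
        full, rest = divmod(total_on, cores_per_server)
        return [cores_per_server if i < full else (rest if i == full else 0)
                for i in range(servers)]
    if scheme == "chip_turn_on_off":
        # Assumed: whole chips only, so round up to a multiple of the chip size.
        chips = -(-total_on // cores_per_server)  # ceiling division
        return [cores_per_server if i < chips else 0 for i in range(servers)]
    raise ValueError("unknown scheme: " + scheme)
```

For a requirement of 5 ON cores, this sketch gives `[2, 1, 1, 1]` for Round-Robin, `[4, 1, 0, 0]` for Same Server, and `[4, 4, 0, 0]` for Chip Turn On/Off, which matches the intuition that the last scheme keeps spare capacity ON for bursts.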

Page 9:

Power and Performance Models

Power
Ptotal = PnonCores + Pcores
Pcores = PbusyCores + PidleCores

Performance
• Per-core service time is affected by contention
• Cratio is contention ratio

[Equation: Service_time in terms of Service_time_base, Nbusy, and Cratio]
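
The power decomposition can be written out as a short calculation. The sketch below assumes that an idle ON core draws a fixed fraction of a busy core's power and that power-gated cores draw nothing; neither assumption is stated on this slide, and the function name, parameters, and units are hypothetical.

```python
def total_power(n_busy, n_idle, p_busy_core, idle_fraction, p_non_cores):
    """Ptotal = PnonCores + Pcores, with Pcores = PbusyCores + PidleCores.

    Assumptions: idle ON cores draw idle_fraction * p_busy_core each, and
    OFF (power-gated) cores contribute zero. Units are arbitrary.
    """
    p_busy_cores = n_busy * p_busy_core
    p_idle_cores = n_idle * idle_fraction * p_busy_core
    p_cores = p_busy_cores + p_idle_cores
    return p_non_cores + p_cores
```

For example, with 2 busy cores, 2 idle ON cores, a busy-core power of 10 units, an idle fraction of 0.4, and 50 units of non-core power, the core power is 20 + 8 = 28 units and the total is 78 units.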

Page 10:

Outline

• Motivation
• System and Research Overview
• Decision of ON Core Count
• Core Count Management Alternatives
• Performance and Power Models
• Methodology
  o Workload
  o Parameters
• Results
• Conclusion

Page 11:

Workload

Real data from Facebook

Stochastic Data
• Exponential dist. with stable means at 5%, 40% and 85%

[Figure: workload plot; x-axis: Time]

Page 12:

All Parameters

Explanation                        Value
Server Count                       4
Core Count per Server              4
Base Service Time                  100msec
Service Time Distribution          Exponential Distribution
Inter-arrival Time Distribution    Exponential Distribution
Control Period                     10min
Contention Ratio                   0%, 15%
Idle Core Power                    40%
75% Latency Goal                   250msec, 150msec
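
To relate these parameters to a request stream, a load rate can be converted into an arrival rate. The sketch below assumes that load rate means the fraction of the full system's service capacity in use, with all 16 cores ON and each core completing one request per base service time on average. The slides do not define load rate this way explicitly, so treat the definition and the function name as assumptions.

```python
def arrival_rate_for_load(load_fraction, servers=4, cores_per_server=4,
                          base_service_s=0.1):
    """Requests per second that would produce the given load fraction.

    Assumption: full capacity is servers * cores_per_server cores, each
    serving 1 / base_service_s requests per second on average.
    """
    capacity_rps = servers * cores_per_server / base_service_s
    return load_fraction * capacity_rps
```

Under this definition, a 5% load on 4 servers with 4 cores each and a 100 msec base service time corresponds to about 8 requests per second, or a mean inter-arrival time of about 125 msec.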

Page 13:

Results: Effect of Load Rate on Energy

Energy savings are greater at low load rates
• For 5% load, 80% savings
• For Facebook load, 35% savings

At high load rates, most cores are ON, hence lower savings

SS (Same Server) and RR (Round-Robin) behave the same because Cratio = 0

Page 14:

Results: Effect of Load Rate on Latency

All schemes satisfy the latency goal

Obtained latencies are much better than the latency goal
• With fewer cores, obtained latency would exceed the latency goal

Page 15:

Results: Effect of Contention on Energy

Contention increases energy consumption more in SS (Same Server)
• Increase in Same-Server is 30%
• Increase in Round-Robin is 13%

Page 16:

Results: Effect of Latency Goal on Energy

Tighter latency goal -> More cores ON

At 150msec,
• SS energy consumption increases by 10%
• CTO (Chip Turn On/Off) energy consumption increases by 6%

Page 17:

Conclusion

Using stochastic simulation and a real workload, a range of core count management issues in CMPs were explored

Our CMP core count control mechanism connects high-level information (latency) to low-level power management (core count)

35% of core energy can be saved with the Facebook workload
