Xen Power Improvements Will Auld, Yang Z Zhang, Winston Wang Intel Corporation

Improving Xen idle power efficiency

Jun 01, 2015


Power management has become increasingly important in large-scale datacenters, where it addresses costs and limits in cooling and power delivery, and it is especially critical on mobile clients, where battery life is one of the key characteristics when choosing a platform. Good power management helps achieve high energy efficiency. Virtualization poses an additional challenge because it involves multiple software layers: VMM, OS, and applications. For example, a well-behaved OS software stack can still result in poor power consumption if the hypervisor beneath it is not power-friendly (unaligned timers, etc.).

In this session, we describe what we did to improve power efficiency in both server and client virtualization environments.

On the server side, we introduce additional optimization techniques (e.g., eliminating unnecessary activity and aligning periodic timers to create long idle periods) that bring package C6 residency to within 5% of native. On the client side, we share our client power optimizations (e.g., graphics, ATA, and wireless), which reduce XenClient idle power overhead to within 5%.
Transcript
Page 1: Improving Xen idle power efficiency

Xen Power Improvements

Will Auld, Yang Z Zhang, Winston Wang, Intel Corporation

Page 2: Improving Xen idle power efficiency

Legal Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.

Intel may make changes to specifications and product descriptions at any time, without notice.

All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2012 Intel Corporation.


Page 3: Improving Xen idle power efficiency

Agenda

• Background

• Power saving in client

• Power saving in server

• Summary


Page 4: Improving Xen idle power efficiency

Room to save POWER

• Ideal/standard: native OS power consumption

• Reality: hypervisor power consumption

• LARGE DELTA (~40% for client at start)

Page 5: Improving Xen idle power efficiency

Client architecture

[Figure: client Xen configuration. The Xen hypervisor runs on the hardware and hosts a Linux Dom0 VM, a Win7 DomU VM, and a further DomU VM.]

Page 6: Improving Xen idle power efficiency

Goal

• Native OS power efficiency

• Close the power gap with native Win7

[Figure: iterative cycle of code drop, identify gap, root cause, fix code]

Page 7: Improving Xen idle power efficiency

Current results

• ~40% idle power gap 2 years ago

• ~5% idle power gap now

• More?

• Increasingly harder to extract

[Figure: bar chart of the idle power gap at project start and project end; y-axis from 0% to 45%.]

Page 8: Improving Xen idle power efficiency

LCD brightness control

LCD display

− ~20% of idle power

− Broken brightness controls

Fix:

− Added emulation of the ACPI video extension

− Specifically, the brightness control methods _BCL, _BCM, and _BQC

− Added to the VM guest ACPI BIOS

− Control output is passed through to Dom0, which takes the platform action

− Make sure Dom0 LCD brightness control is really working

[Figure: Dom0 and Win7]
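As an illustration only, the relationship between the three emulated methods can be modeled in a few lines of Python. The class name, the `forward_to_dom0` callback, and the percent levels are assumptions made for this sketch, not the code that was added to the guest ACPI BIOS. A real _BCL package also carries two leading entries for the AC and battery default levels, which this model omits.

```python
from typing import Callable, List


class EmulatedBacklight:
    """Toy model of the three ACPI brightness methods exposed to a guest."""

    def __init__(self, levels: List[int], forward_to_dom0: Callable[[int], None]):
        if not levels:
            raise ValueError("at least one brightness level is required")
        self._levels = sorted(set(levels))
        self._forward = forward_to_dom0    # hypothetical hook into Dom0
        self._current = self._levels[-1]   # start at the highest level

    def bcl(self) -> List[int]:
        """_BCL: report the supported brightness levels (percent)."""
        return list(self._levels)

    def bcm(self, level: int) -> None:
        """_BCM: set the brightness; the platform action happens in Dom0."""
        if level not in self._levels:
            raise ValueError(f"unsupported brightness level: {level}")
        self._current = level
        self._forward(level)

    def bqc(self) -> int:
        """_BQC: return the current brightness level."""
        return self._current
```

The guest only ever sees the three methods; whatever `forward_to_dom0` does is what actually changes the panel, which is why the slide insists that Dom0 brightness control must work first.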

Page 9: Improving Xen idle power efficiency

Runtime IO power management

Dysfunctional IO power management

• ~15% of idle power

• First available in the 2.6.32 kernel, but not functioning correctly

Fix:

• Enable energy-saving states at run time, with devices auto-suspended when idle

• Gap dropped from ~25% to 6.8% after the fix

− Measured on an HP 8440p mobile platform based on a Nehalem processor
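The slides do not say how auto-suspend was switched on. One plausible route on a Linux Dom0 is the runtime PM sysfs knob, where writing `auto` to a device's `power/control` file lets the kernel suspend the device when it is idle. The sketch below assumes that interface; `devices_root` is a hypothetical parameter that would point at a directory such as /sys/bus/pci/devices, and writing there requires root.

```python
from pathlib import Path
from typing import List


def enable_runtime_pm(devices_root: Path) -> List[str]:
    """Write "auto" to every power/control file that is not already "auto".

    Returns the names of the device directories that were changed.
    """
    changed = []
    for control in sorted(devices_root.glob("*/power/control")):
        if control.read_text().strip() != "auto":
            control.write_text("auto\n")
            # control is <device>/power/control, so go up two levels.
            changed.append(control.parent.parent.name)
    return changed
```

Taking the root directory as a parameter keeps the function testable against a temporary directory tree instead of the live sysfs.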

Page 10: Improving Xen idle power efficiency

ATA_link power

ATA_link static power setting

− ~6% of idle power in max_performance

− But performance suffers with min_power

− Even worse: all SCSI hosts are active, with or without attached devices

Fix:

− Runtime update of the ATA_link power setting

− Toggle min_power / max_performance as needed

− Disable clocks on deviceless ports

[Figure: run-time switching between Max_Perf and Min_Power]
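A rough sketch of the run-time toggle, assuming the Linux `link_power_management_policy` file that each SCSI host exposes under /sys/class/scsi_host. The `policy_for` rule (switch on I/O activity alone) is a hypothetical stand-in for whatever "as needed" meant in the actual fix, and `scsi_host_root` is a parameter introduced here for illustration.

```python
from pathlib import Path
from typing import List

VALID_POLICIES = ("min_power", "medium_power", "max_performance")


def policy_for(io_active: bool) -> str:
    """Hypothetical rule: favor performance only while I/O is in flight."""
    return "max_performance" if io_active else "min_power"


def set_link_policy(scsi_host_root: Path, policy: str) -> List[str]:
    """Write the policy to every host's knob; return the hosts updated."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown link power policy: {policy}")
    updated = []
    pattern = "host*/link_power_management_policy"
    for knob in sorted(scsi_host_root.glob(pattern)):
        knob.write_text(policy + "\n")
        updated.append(knob.parent.name)
    return updated
```

On a real system this would be called as `set_link_policy(Path("/sys/class/scsi_host"), policy_for(io_active))` with root privileges.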

Page 11: Improving Xen idle power efficiency

Network power

Wired and Wi-Fi

− ~16% of idle power (650 mW)

− Many interrupts break deep C-states during idle

Fix:

− Enable Wi-Fi and E1000 power saving modes in Dom0

− Add a Win7 power management PV driver to pass control settings to Dom0

[Figure: Dom0 and Win7]

Page 12: Improving Xen idle power efficiency

GFX power management

iGFX power management inactive

− ~16% of idle power (650 mW)

− VT-d requires a device reset

− The reset clears all registers, including the power management registers enabled by the BIOS

− Disables RC6 (render standby), turbo, and GPMT (Graphics Power Modulation Technology)

Fix:

− Save/restore PM registers around the FLR (function-level reset)

[Figure: VT-d operation. The BIOS leaves PM on, the reset turns PM off, and save/restore keeps PM on across the reset.]
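The save/restore pattern can be sketched against a simulated register file. Everything here is illustrative: the register names are made up, a plain dict stands in for device registers, and `fake_flr` only mimics the effect of a reset clearing every register.

```python
from typing import Callable, Dict, Iterable


def fake_flr(regs: Dict[str, int]) -> None:
    """Stand-in for a function-level reset: every register returns to 0."""
    for name in regs:
        regs[name] = 0


def reset_preserving_pm(
    regs: Dict[str, int],
    pm_regs: Iterable[str],
    do_reset: Callable[[Dict[str, int]], None] = fake_flr,
) -> None:
    """Save the PM registers, reset the device, then write them back."""
    saved = {name: regs[name] for name in pm_regs}
    do_reset(regs)
    regs.update(saved)
```

For example, with `regs = {"rc6_enable": 1, "turbo_enable": 1, "scratch": 7}` and the first two names passed as `pm_regs`, the scratch register ends up cleared while the two PM settings keep the values the BIOS programmed.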

Page 13: Improving Xen idle power efficiency

Client summary

• Started with a ~40% gap

• Ended with ~5% gap

• Greatly improved and got close to the goal


Page 14: Improving Xen idle power efficiency

Server power savings -- increasing idle time

• Timer alignment

• Power aware scheduling

• Reducing periodic tasks


Page 15: Improving Xen idle power efficiency

Timer alignment

• Independent, frequent timer interrupts

• Frequent wake-ups

• Reduced idle time, greater power consumption

[Figure: timelines for Cpu0, Cpu1, and the socket. Timer interrupts arrive at different moments on each CPU, each CPU alternates between idle and busy, and the resultant socket C-state goes busy, idle, busy.]

Page 16: Improving Xen idle power efficiency

Timer alignment

• Proposal

− Configurable timer consolidation window, such as 50 ns

− Compute the timer interrupt moment

− Shift the timer handling moment to the next timer consolidation moment

• Benefit

− Fewer interrupts → longer idle time → power savings

• Challenges

− Guest scheduling impact: performance impact

− Cross-CPU timer synchronization

− IPI frequency and synchronization
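The proposal amounts to rounding each timer expiry up to the next boundary of the consolidation window. A minimal sketch, assuming integer nanosecond timestamps; the function name is invented here and is not a Xen interface.

```python
def align_expiry(expiry_ns: int, window_ns: int) -> int:
    """Round a timer expiry up to the next multiple of the window."""
    if window_ns <= 0:
        raise ValueError("window must be positive")
    # Ceiling division on integers, then scale back to nanoseconds.
    return -(-expiry_ns // window_ns) * window_ns
```

With a 50 ns window, expiries at 101, 120, 149, and 151 ns become 150, 150, 150, and 200 ns, so four separate wake-ups collapse into two. Each timer fires up to one window late, which is the guest scheduling impact listed under challenges.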

Page 17: Improving Xen idle power efficiency

Timer alignment

[Figure: the same Cpu0, Cpu1, and socket timelines after alignment. Cpu1's interrupt is moved to a new arrival time that matches Cpu0's, and the socket gains C-state time.]

• Shifting CPU1's interrupt to match CPU0's gives a nice gain in C-state

• Repeated over and over, this adds up
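To see why the shift helps, note that the socket can only idle while every CPU is idle at once. The sketch below computes that shared idle time from per-CPU busy intervals. The interval representation and the numbers in the note that follows are invented for illustration and are not measurements from the talk.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), arbitrary time units


def socket_idle_time(busy_per_cpu: List[List[Interval]], window: int) -> int:
    """Time within [0, window) during which every CPU is idle.

    Assumes all busy intervals lie inside [0, window).
    """
    busy = sorted(iv for cpu in busy_per_cpu for iv in cpu)
    idle, cursor = 0, 0
    for start, end in busy:
        if start > cursor:
            idle += start - cursor   # gap where no CPU is busy
        cursor = max(cursor, end)
    idle += max(0, window - cursor)  # tail after the last busy interval
    return idle
```

Over a window of 1000 units, Cpu0 busy during [0, 100) and Cpu1 busy during [500, 600) leave 800 units of shared idle time, split into two stretches. Moving Cpu1's work to [0, 100) as well raises that to 900 units in one unbroken stretch.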

Page 18: Improving Xen idle power efficiency

Power aware scheduling

• ACPI modes

− Performance: power-hungry mode

− Energy: power-saving mode

− Balanced

• Mapping to scheduling

− Performance: schedule vCPUs one per physical core before pairing

− Energy: schedule vCPUs one per logical core, so that more cores and more sockets can be powered down
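The two placement policies can be sketched as two orderings over the same CPU topology. This is a toy model under stated assumptions: the `(package, core, thread)` tuples, the function names, and the mode strings are made up for the example, and it ignores load, affinity, and everything else a real scheduler weighs.

```python
from typing import List, Tuple

Cpu = Tuple[int, int, int]  # (package, core, thread)


def topology(packages: int, cores: int, threads: int) -> List[Cpu]:
    """Enumerate logical CPUs for a uniform machine."""
    return [
        (p, c, t)
        for p in range(packages)
        for c in range(cores)
        for t in range(threads)
    ]


def place(n_vcpus: int, cpus: List[Cpu], mode: str) -> List[Cpu]:
    """Pick logical CPUs for n_vcpus runnable vCPUs."""
    if mode == "performance":
        # Thread 0 of every core first; sibling threads only after that.
        order = sorted(cpus, key=lambda cpu: (cpu[2], cpu[0], cpu[1]))
    elif mode == "energy":
        # Fill both threads of a core, then the next core in the package.
        order = sorted(cpus)
    else:
        raise ValueError(f"unknown mode: {mode}")
    if n_vcpus > len(order):
        raise ValueError("more vCPUs than logical CPUs")
    return order[:n_vcpus]
```

On a machine with two packages, two cores per package, and two threads per core, three vCPUs in performance mode land on three different cores, while in energy mode they occupy both threads of one core plus one thread of its neighbor, leaving the second package untouched.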

Page 19: Improving Xen idle power efficiency

Power saving scheduler

[Figure: the power aware scheduler placing the running tasks vcpu0, vcpu1, and vcpu2 on a topology of two packages (pkg 0, pkg 1), two cores per package (core 0, core 1), and two hyperthreads per core (cpu 0 to cpu 7). Legend: idle CPU in deep C-state; busy CPU not in deep C-state.]

Page 20: Improving Xen idle power efficiency

Reduce periodic activity

• Power-unfriendly RTC emulation

− The VMM updates the RTC clock twice per second

− Solution: update the RTC clock only on read

• Frequent wake-ups to check buffered I/O

− Wakes up multiple times a second (polling model)

− Solution (push model): use an event channel to notify changes in buffered I/O status
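The update-on-read idea for the RTC can be sketched as a clock that stores a base value and derives the current time only when asked. The class and parameter names are assumptions for this example; a real emulated RTC has alarm and periodic-interrupt features that a model this small leaves out.

```python
import time
from typing import Callable


class LazyRtc:
    """RTC model that does no periodic work; time is computed on read."""

    def __init__(
        self,
        epoch_seconds: int,
        monotonic: Callable[[], float] = time.monotonic,
    ):
        self._base = epoch_seconds
        self._clock = monotonic
        self._started = monotonic()

    def read(self) -> int:
        # Derive the value from elapsed host time instead of a ticking timer.
        return self._base + int(self._clock() - self._started)
```

If the guest never reads the clock, the host does nothing at all for it, which is exactly the wake-up the twice-per-second update was costing.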


If a clock ticks where no one can see it, does the time change?

No more polling
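The move from polling to a push model for buffered I/O can be illustrated with a blocking notification. In this sketch a `threading.Event` stands in for the Xen event channel; it is an analogy, not the Xen API, and the class and method names are invented.

```python
import threading
from collections import deque
from typing import Any, List, Optional


class BufferedIoChannel:
    """Consumer sleeps until the producer signals that data is pending."""

    def __init__(self) -> None:
        self._pending: deque = deque()
        self._event = threading.Event()

    def submit(self, request: Any) -> None:
        """Producer side: queue the request, then notify the consumer."""
        self._pending.append(request)
        self._event.set()

    def wait_and_drain(self, timeout: Optional[float] = None) -> List[Any]:
        """Consumer side: block until notified, then take everything queued."""
        if not self._event.wait(timeout):
            return []  # timed out with nothing signaled
        self._event.clear()
        drained = []
        while self._pending:
            drained.append(self._pending.popleft())
        return drained
```

With `timeout=None` the consumer never wakes on its own; it runs only when `submit` signals it, so an idle guest generates no periodic wake-ups on this path.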

Page 21: Improving Xen idle power efficiency

Server summary

• Significant areas of work

• Need to quantify the impacts


Page 22: Improving Xen idle power efficiency

Overall summary

• Every component counts – software and hardware

• Make sure the basics are working

• Still more to do


Page 23: Improving Xen idle power efficiency

Questions?
