
Adaptive Energy-efficient Resource Sharing for Multi-threaded Workloads in Virtualized Systems

Jan 25, 2016

Transcript
Page 1: Adaptive Energy-efficient Resource Sharing for Multi-threaded Workloads in Virtualized Systems

SAN FRANCISCO, CA, USA

Can Hankendi, Ayse K. Coskun

Boston University, Electrical and Computer Engineering Department

This project has been partially funded by:

Page 2: Energy Efficiency in Computing Clusters

Computing in Heterogeneous, Autonomous 'N' Goal-oriented Environments

• Energy-related costs are among the biggest contributors to the total cost of ownership.

• Consolidating multiple workloads on the same physical node improves energy efficiency.


(Source: International Data Corporation (IDC), 2009)

Page 3: Multi-threaded Applications in the Cloud

• HPC applications are expected to shift towards cloud resources.

• Resource allocation decisions significantly affect the energy efficiency of server nodes.

• Energy efficiency is a function of application characteristics.


Page 4: Outline

• Background

• Methodology

• Adaptive Resource Sharing

• Results

• Conclusions


Page 5: Background

Cluster-level VM Management

- Consolidation policies across server nodes

- VM migration techniques

[Srikantaiah, HotPower’08][Bonvin, CCGrid’11]

Node-level Management

Recent co-scheduling policies

- Co-scheduling contrasting workloads

- Balancing performance events across nodes: cache misses, IPC, bus accesses

[Dhiman, ISLPED’09][Bhadauria, ICS’10]

- Co-scheduling based on thread communication

- Identifying best thread mixes to co-schedule

[Frachtenberg, TPDS’05][McGregor, IPDPS’05]


Page 6: Virtualized System Setup

• 12-core AMD Magny Cours server

- 2x 6-core dies attached side by side in the same package

- Private L1 and L2 caches for each core

- 6 MB shared L3 cache for each 6-core die

• Virtualized through VMware vSphere 5 ESXi hypervisor

- 2 virtual machines (VMs) with Ubuntu Server guest OS

Page 7: Methodology: Measurement Setup

• System-level power measurements at a 1 s sampling interval

• Performance counter collection through vmkperf at a 1 s sampling interval

- Counters: CPU cycles, retired instructions, L3-cache misses

• VM-level CPU and memory utilization data collection through esxtop at a 2 s sampling interval

[Figure: measurement setup. Labels: System-level power measurement, Logger, esxtop, vmkperf]
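Because the power and counter data arrive every second while the esxtop data arrives every two seconds, the streams have to be brought onto a common time grid before they can be compared. The slides do not say how this was done; the sketch below assumes simple averaging of consecutive 1 s samples. The function name `downsample_mean` and all sample values are hypothetical.

```python
def downsample_mean(samples, factor):
    """Average consecutive groups of `factor` samples.

    A trailing partial group is dropped so every output value
    covers the same amount of time.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]


# Made-up data: power in watts (one sample per second) and
# VM CPU utilization from esxtop (one sample per two seconds).
power_1s = [200.0, 210.0, 220.0, 230.0, 240.0]
util_2s = [0.50, 0.75]

power_2s = downsample_mean(power_1s, 2)   # bring power onto the 2 s grid
aligned = list(zip(power_2s, util_2s))    # (power, utilization) pairs per 2 s window
```

Averaging keeps the energy content of each 2 s window intact, which is why it is used here instead of dropping every second sample.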

Page 8: Parallel Workloads

• PARSEC 2.1 benchmark suite [Bienia et al., 2008]

Benchmark       Application          IPC      Memory Acc.
blackscholes    Financial Analysis   Low      Low
bodytrack       Computer Vision      High     Medium
canneal         VLSI Design          Low      High
dedup           Enterprise Storage   Medium   Low
ferret          Similarity Search    Medium   Low
freqmine        Data Mining          High     Low
swaptions       Financial Analysis   High     Low
streamcluster   Data Mining          Low      High
vips            Media Processing     High     Low
x264            Media Processing     Medium   Medium

Page 9: Tracking Parallel Phases

• consolmgmt

- Consolidation management interface

- Synchronizes ROI (region-of-interest) of multiple workloads

[Figure: ROI synchronization of Benchmark A and Benchmark B. Labels: consolmgmt, parsecmgmt, hooks.c, roi-Trigger(), sleep(), start-Logging(), end-Logging(), Input (Serial), Output (Serial)]
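The idea of synchronizing the ROIs of several workloads can be illustrated with a small sketch. This is not the consolmgmt implementation, which the slides do not show; it only assumes that each benchmark has a serial input phase of a different length and that no benchmark should enter its parallel ROI before the others are ready. The names `run_benchmark` and `run_synchronized`, the benchmark names, and the durations are hypothetical.

```python
import threading
import time


def run_benchmark(name, input_seconds, barrier, events, lock):
    """Simulate one benchmark: a serial input phase, then a synchronized ROI."""
    time.sleep(input_seconds)          # serial input phase (length differs per benchmark)
    with lock:
        events.append(("input_done", name))
    barrier.wait()                     # block until every benchmark is ready for its ROI
    with lock:
        events.append(("roi_start", name))


def run_synchronized(inputs):
    """Run all benchmarks concurrently and return the recorded events in order."""
    barrier = threading.Barrier(len(inputs))
    events, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=run_benchmark, args=(name, secs, barrier, events, lock))
        for name, secs in inputs.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


events = run_synchronized({"benchmark_a": 0.05, "benchmark_b": 0.15})
```

Because each thread records `input_done` before reaching the barrier, every `roi_start` event in the log comes after all input phases have finished, which is the point at which logging could safely begin.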

Page 10: Performance Impact of Consolidation

• Consolidating multiple workloads can degrade performance due to resource contention.

• Virtualization provides performance isolation by managing memory and NUMA node affinities.

• Under a native OS (without virtualization), performance variation is 2.5x higher.

Average throughput of Streamcluster when co-scheduled with another PARSEC benchmark

Page 11: Outline

• Background

• Methodology

• Adaptive Resource Sharing

• Results

• Conclusions


Page 12: Impact of Application Selection

• Previous co-scheduling policies focus on application selection to improve energy efficiency.

• Application selection is based on balancing memory operations and CPU usage.


Page 13: Predicting Power Efficiency

• To improve the energy efficiency, we need to allocate more CPU resources to power-efficient workloads.

• IPC*CPU Utilization metric shows strong correlation with power efficiency.

[Figure: plot with axis label "IPC*CPU Utilization"]
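The IPC*CPU Utilization metric can be computed directly from the counters listed in the measurement setup. The sketch below is illustrative only: it assumes utilization is expressed as a fraction between 0 and 1 and that power efficiency is measured as throughput-per-watt, and every sample value is made up. The function names `ipc_times_utilization` and `pearson` are hypothetical.

```python
import math


def ipc_times_utilization(instructions, cycles, cpu_utilization):
    """IPC * CPU utilization; utilization is a fraction in [0, 1]."""
    return (instructions / cycles) * cpu_utilization


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up samples: (retired instructions, CPU cycles, utilization, throughput-per-watt)
samples = [
    (2.0e9, 4.0e9, 0.50, 1.0),
    (3.0e9, 4.0e9, 0.80, 2.5),
    (4.0e9, 4.0e9, 0.90, 3.6),
    (5.0e9, 4.0e9, 0.95, 4.8),
]
metric = [ipc_times_utilization(i, c, u) for i, c, u, _ in samples]
efficiency = [e for *_, e in samples]
r = pearson(metric, efficiency)   # close to 1 for this synthetic data
```

A coefficient near 1 on real measurements would support using the metric as a runtime proxy for power efficiency, since it needs only counters and utilization, not a power meter.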

Page 14: Application Classification

• The IPC*CPU Utilization metric is used to classify applications according to their power efficiency levels.

• We use a density-based clustering algorithm (DBSCAN) to determine application groups based on their power efficiency classes.
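As a rough illustration of how DBSCAN can group applications by a single scalar metric, the sketch below implements the standard algorithm for one-dimensional values. It is a generic textbook version, not the authors' code; the function name `dbscan_1d`, the `eps` and `min_pts` settings, and the metric values are all hypothetical.

```python
def dbscan_1d(values, eps, min_pts):
    """Minimal DBSCAN for scalar values; returns one label per value (-1 = noise)."""
    n = len(values)
    labels = [None] * n                      # None = not yet visited

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:             # not a core point; may become a border point later
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:              # previously noise: becomes a border point
                labels[j] = cluster
            if labels[j] is not None:        # already assigned: nothing more to do
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Hypothetical IPC*CPU Utilization values for seven applications.
values = [0.20, 0.25, 0.22, 0.90, 0.95, 1.00, 2.50]
labels = dbscan_1d(values, eps=0.1, min_pts=2)
```

With these settings the first three values form one group, the next three form a second group, and the isolated value 2.50 is marked as noise. One appeal of DBSCAN for this task is that the number of power efficiency classes does not have to be fixed in advance.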

Page 15: Application Classification

[Figure: VM Configuration for Case 1 and Case 2. Labels: Benchmarks, VM0, VM1, ESXi]

Page 16: Reconfiguring Resource Allocations

• CPU hot-plugging: adding/removing vCPUs during runtime

- Cons: Removing vCPUs is not supported in some OSes

• Resource allocation adjustment: allocating/limiting CPU resources for VMs

- Pros: Fine granularity (resource allocation unit is MHz)

• Both techniques have low overhead, less than 1%.

[Figure: Resource Configuration Comparison]

Page 17: Reconfiguration Runtime Behavior

• Resource allocation limits can be dynamically adjusted according to application classes.

• CPU allocation limits can be effectively reconfigured within a second.
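One way to picture class-based limit adjustment is a proportional split of the node's CPU capacity, expressed in MHz, between the VMs. The slides do not state the actual policy, so the sketch below is only an assumption: the class names, the weights, the 1900 MHz per-core frequency, and the function `cpu_limits_mhz` are all hypothetical, and applying the limits to the hypervisor is left out.

```python
def cpu_limits_mhz(total_mhz, class_weights, vm_classes):
    """Split total CPU capacity (MHz) across VMs in proportion to class weights."""
    total_weight = sum(class_weights[c] for c in vm_classes.values())
    return {
        vm: total_mhz * class_weights[c] // total_weight   # integer MHz, rounded down
        for vm, c in vm_classes.items()
    }


# Hypothetical numbers: 12 cores at 1900 MHz each and two efficiency classes,
# with the more power-efficient class receiving twice the share.
TOTAL_MHZ = 12 * 1900
CLASS_WEIGHTS = {"high_efficiency": 2, "low_efficiency": 1}

limits = cpu_limits_mhz(
    TOTAL_MHZ,
    CLASS_WEIGHTS,
    {"VM0": "high_efficiency", "VM1": "low_efficiency"},
)
```

Rounding down guarantees that the limits never add up to more than the node's capacity, and recomputing the split whenever a VM's class changes matches the dynamic adjustment described above.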


Page 18: Results

• The proposed approach improves throughput-per-watt by up to 25%, and by 9% on average.
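For readers unfamiliar with the metric, the sketch below shows one way throughput-per-watt and a relative improvement could be computed. The slides do not define throughput, so treating it as completed work units per second is an assumption, and all numbers are made up and unrelated to the reported results. The function names are hypothetical.

```python
def throughput_per_watt(work_units, runtime_s, avg_power_w):
    """Work completed per second, divided by average power in watts."""
    return (work_units / runtime_s) / avg_power_w


def relative_improvement_pct(baseline, proposed):
    """Improvement of `proposed` over `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Made-up measurements for the same workload under two configurations.
baseline = throughput_per_watt(work_units=1000, runtime_s=100.0, avg_power_w=200.0)
proposed = throughput_per_watt(work_units=1000, runtime_s=90.0, avg_power_w=200.0)
gain_pct = relative_improvement_pct(baseline, proposed)
```

Since work divided by (time times power) is work per unit of energy, a higher throughput-per-watt means more work done per joule.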


Page 19: Results

• We generate 50 workload sets, each consisting of 10 randomly selected PARSEC applications.

Set 1: 4x blackscholes, 2x vips, 1x bodytrack, 1x freqmine, 1x streamcluster, 1x swaptions

Set 2: 3x canneal, 3x ferret, 2x bodytrack, 1x dedup, 1x vips
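Workload sets of this shape can be produced by sampling the benchmark list with replacement, which is consistent with the example sets containing the same benchmark several times. The sketch below assumes uniform sampling and a fixed seed for repeatability; neither detail is stated in the slides, and `generate_workload_sets` is a hypothetical name.

```python
import random
from collections import Counter

# The ten PARSEC benchmarks listed in the Parallel Workloads table.
PARSEC_BENCHMARKS = [
    "blackscholes", "bodytrack", "canneal", "dedup", "ferret",
    "freqmine", "swaptions", "streamcluster", "vips", "x264",
]


def generate_workload_sets(num_sets=50, set_size=10, seed=0):
    """Draw `num_sets` workload sets, sampling benchmarks with replacement."""
    rng = random.Random(seed)   # local generator keeps runs repeatable
    return [
        Counter(rng.choices(PARSEC_BENCHMARKS, k=set_size))
        for _ in range(num_sets)
    ]


workload_sets = generate_workload_sets()
```

Each set is a `Counter` mapping a benchmark name to how many instances appear, which mirrors the "4x blackscholes, 2x vips" notation above.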

Page 20: Results

• We generate 50 workload sets, each consisting of 10 randomly selected PARSEC applications.

• The proposed resource sharing technique improves throughput-per-watt by 12% on average compared to co-scheduling techniques based on application selection.

Page 21: Conclusions & Future Work

• Consolidation is a powerful technique for improving energy efficiency in data centers.

• Energy efficiency of parallel workloads varies significantly depending on application characteristics.

• Adaptive VM configuration for parallel workloads improves the energy efficiency by 12% on average over existing co-scheduling algorithms.

• Future research directions include:

- Investigating the effect of memory allocation decisions on energy efficiency;

- Utilizing application-level instrumentation to explore power/energy optimization opportunities;

- Expanding the application space.