Page 1: Performance Profiling of Virtual Machines

Performance Profiling of Virtual Machines

Jiaqing Du+, Nipun Sehrawat*, Willy Zwaenepoel+

+EPFL, Switzerland

*University of Illinois at Urbana-Champaign

Page 2: Performance Profiling of Virtual Machines


Performance Profiling

• Use CPU performance counters
• Monitor software runtime behavior
• Incur very low overhead
• Used extensively: OProfile, VTune, …

%CYCLE    Function            Module
98.5529   vmx_vcpu_run        kvm-intel.ko
0.2226    (no symbols)        libc.so
0.1034    hpet_cpuhp_notify   vmlinux
0.1034    native_patch        vmlinux

Jiaqing Du, VEE, March 9, 2011

Page 3: Performance Profiling of Virtual Machines

Terminology

[Figure: three software stacks, each on top of a CPU with a PMU. (1) Native profiling: the profiler runs in the OS. (2) Guest-wide profiling: the profiler runs in the guest, on top of the VMM. (3) System-wide profiling: a profiler runs in the VMM and in the guest.]

Page 4: Performance Profiling of Virtual Machines

Profiling with Virtual Machines

                        Para-virtualization   Hardware assistance   Binary translation
Guest-wide profiling    ?                     ?                     ?
System-wide profiling   XenOprof              ?                     ?

Profilers do not work well with virtual machines.

Page 5: Performance Profiling of Virtual Machines

Contributions

                        Para-virtualization   Hardware assistance   Binary translation
Guest-wide profiling    ?                     ?                     ?
System-wide profiling   XenOprof              ?                     ?

(1) Give solutions

(2) Implement prototypes

Page 6: Performance Profiling of Virtual Machines

Outline

• Native profiling
• Guest-wide profiling
• System-wide profiling
• Evaluation

Page 7: Performance Profiling of Virtual Machines

Native Profiling

• Performance monitoring unit (PMU)
  – consists of a set of event counters
  – generates an interrupt when a counter overflows
• PMU-based profiler

[Figure: a PMU-based profiler spanning kernel and user space on top of a CPU with a PMU. Its steps are Control, Configure, Collect, and Interpret. Each collected sample records the previous PC value and the process identifier.]
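
The overflow-driven sampling described on this slide can be sketched as a small software simulation. This is only an illustration under stated assumptions, not the profiler's actual code: `struct sim_pmu`, `sim_pmu_configure`, `sim_pmu_tick`, and the sample layout are hypothetical names, and a real profiler programs hardware counters and receives an interrupt instead of calling a function. The counter is armed to overflow after `period` events; the overflow handler records the PC and the process identifier, then re-arms the counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SAMPLES 64

/* One profiling sample: what the overflow handler records. */
struct sample {
    uint64_t pc;   /* PC value of the interrupted code */
    int pid;       /* process identifier */
};

/* Hypothetical software model of a single PMU event counter. */
struct sim_pmu {
    uint64_t period;                  /* events between two overflows */
    uint64_t remaining;               /* events left until the next overflow */
    struct sample buf[MAX_SAMPLES];   /* collected samples */
    size_t nsamples;
};

/* Configure step: arm the counter to overflow every `period` events. */
void sim_pmu_configure(struct sim_pmu *pmu, uint64_t period)
{
    assert(period > 0);
    pmu->period = period;
    pmu->remaining = period;
    pmu->nsamples = 0;
}

/* Collect step: stand-in for the overflow interrupt handler. */
static void on_overflow(struct sim_pmu *pmu, uint64_t pc, int pid)
{
    if (pmu->nsamples < MAX_SAMPLES) {
        pmu->buf[pmu->nsamples].pc = pc;
        pmu->buf[pmu->nsamples].pid = pid;
        pmu->nsamples++;
    }
}

/* Count one monitored event that occurred at `pc` in process `pid`. */
void sim_pmu_tick(struct sim_pmu *pmu, uint64_t pc, int pid)
{
    if (--pmu->remaining == 0) {
        on_overflow(pmu, pc, pid);
        pmu->remaining = pmu->period;   /* re-arm the counter */
    }
}
```

With a period of 3, every third call to `sim_pmu_tick` produces one sample, so the sample buffer is a statistical picture of where events occur. The Interpret step, which maps each recorded PC to a function and module, is left out of this sketch.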

Page 8: Performance Profiling of Virtual Machines

Guest-wide Profiling

• Profiler runs in the guest and only profiles the guest

[Figure: the profiler (Control, Configure, Collect, Interpret) runs inside the guest, on top of the VMM and a CPU with a PMU.]

Challenge: synchronous interrupt delivery to the guest

Injected interrupts should be handled right after the guest resumes execution.

Page 9: Performance Profiling of Virtual Machines

System-wide Profiling (1/3)

• Reveals the runtime behavior of both the VMM and the guest(s)

[Figure: the profiler (Control, Configure, Collect, Interpret) runs in the VMM, below Guest1 and Guest2 and on top of a CPU with a PMU.]

Challenge: interpret samples belonging to the guest

The VMM does not know the internals of a guest.

Page 10: Performance Profiling of Virtual Machines

System-wide Profiling (2/3)

• Interpret guest samples: full delegation

[Figure: full delegation. Both the VMM profiler and the guest profiler are shown with the steps Control, Configure, Collect, and Interpret, on top of a CPU with a PMU.]

Page 11: Performance Profiling of Virtual Machines

System-wide Profiling (3/3)

• Interpret guest samples: interpretation delegation

[Figure: interpretation delegation. The VMM profiler and the guest profiler (Control, Configure, Collect, Interpret) are connected through a shared buffer, on top of a CPU with a PMU.]
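
A shared buffer between the VMM and the guest could look like the single-producer, single-consumer ring buffer sketched below. The slide only shows that a shared buffer exists; this sketch assumes, as the name "interpretation delegation" suggests, that the VMM collects the guest's samples and the guest interprets them. `struct shared_buf`, `vmm_push_sample`, and `guest_pop_sample` are hypothetical names, and the memory barriers and notification a real VMM/guest channel would need are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SLOTS 8   /* number of slots; must be a power of two */

struct guest_sample {
    uint64_t pc;      /* guest PC captured by the VMM-side collector */
};

/* Hypothetical ring buffer placed in memory shared by VMM and guest. */
struct shared_buf {
    uint32_t head;    /* next slot to write; updated only by the VMM */
    uint32_t tail;    /* next slot to read; updated only by the guest */
    struct guest_sample slots[BUF_SLOTS];
};

void shared_buf_init(struct shared_buf *b)
{
    assert(b != NULL);
    b->head = 0;
    b->tail = 0;
}

/* VMM side: store a collected sample; returns false if the buffer is full. */
bool vmm_push_sample(struct shared_buf *b, uint64_t pc)
{
    if (b->head - b->tail == BUF_SLOTS)
        return false;                       /* full: the sample is dropped */
    b->slots[b->head % BUF_SLOTS].pc = pc;
    b->head++;
    return true;
}

/* Guest side: fetch the oldest sample for interpretation, if any. */
bool guest_pop_sample(struct shared_buf *b, struct guest_sample *out)
{
    if (b->head == b->tail)
        return false;                       /* empty */
    *out = b->slots[b->tail % BUF_SLOTS];
    b->tail++;
    return true;
}
```

Keeping `head` and `tail` each writable by only one side means neither side needs a lock in this simplified model; dropping samples when the buffer is full is one possible policy, chosen here only to keep the sketch short.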

Page 12: Performance Profiling of Virtual Machines

PMU Multiplexing

• When to save & restore performance counters?
• CPU switch
  – only in-guest execution is accounted to the guest
• Domain switch
  – in-VMM execution is also accounted to the guest

[Figure: two execution timelines made of guest1, guest2, and VMM intervals (including I/O handled for guest1 and guest2), marking which intervals are accounted to guest 1 and to guest 2 under CPU switch and under domain switch.]
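
The two accounting policies can be sketched as a rule applied to a timeline of execution intervals. This is an illustrative model only: `struct interval`, `account`, and the two enums are hypothetical names, and the assumption that every VMM interval can be tagged with the guest it works for is a simplification of what a real VMM knows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mode { IN_GUEST, IN_VMM };
enum policy { CPU_SWITCH, DOMAIN_SWITCH };

/* One interval of the execution timeline. */
struct interval {
    enum mode mode;    /* who is executing */
    int domain;        /* the running guest, or the guest the VMM works for */
    uint64_t events;   /* monitored events counted during the interval */
};

/* Total events accounted to `guest` under policy `p`. */
uint64_t account(const struct interval *tl, size_t n, int guest, enum policy p)
{
    assert(tl != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (tl[i].domain != guest)
            continue;                 /* belongs to another guest */
        if (tl[i].mode == IN_VMM && p == CPU_SWITCH)
            continue;                 /* CPU switch: in-guest execution only */
        total += tl[i].events;        /* domain switch also counts in-VMM work */
    }
    return total;
}
```

For a timeline in which guest 1 runs for 100 events and the VMM then spends 40 events on I/O for guest 1, `account` attributes 100 events to guest 1 under `CPU_SWITCH` and 140 under `DOMAIN_SWITCH`.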

Page 13: Performance Profiling of Virtual Machines

Implementation

                        Para-virtualization   Hardware assistance (KVM)   Binary translation (QEMU)
Guest-wide profiling    ?                     √                           ?
System-wide profiling   XenOprof              √                           √

Page 14: Performance Profiling of Virtual Machines


Evaluation question #1

How much does profiling slow down programs?


Page 15: Performance Profiling of Virtual Machines

Profiling Overhead

• Measure execution time
  – a computation-intensive program
  – with and without profiling
  – about 400 counter overflows per second

Profiling environment   Increased execution time
Native Linux            0.04% ± 0.004%
KVM guest-wide          0.39% ± 0.045%
KVM system-wide         0.44% ± 0.043%
QEMU system-wide        0.94% ± 0.044%
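
The "increased execution time" column is presumably the relative slowdown of the profiled run over the unprofiled run. A minimal sketch of that arithmetic follows; the function name `overhead_percent` is hypothetical, and the slide does not say how the ± values were obtained, so they are not modeled here.

```c
#include <assert.h>

/* Relative increase in execution time, in percent.
 * t_base:     execution time without profiling
 * t_profiled: execution time with profiling (same unit as t_base) */
double overhead_percent(double t_base, double t_profiled)
{
    assert(t_base > 0.0);
    return (t_profiled - t_base) / t_base * 100.0;
}
```

For example, a run that takes 200.0 seconds without profiling and 201.0 seconds with profiling has an overhead of about 0.5%.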

Page 16: Performance Profiling of Virtual Machines


Evaluation question #2

Are profiling results accurate?


Page 17: Performance Profiling of Virtual Machines

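Profiling Accuracy (1/4)

• A computation-intensive benchmark
• compute_{a|b}() does floating point arithmetic
• Monitor CPU cycles

/* compute_a() and compute_b() are defined elsewhere in the benchmark. */
void compute_a(void);
void compute_b(void);

int main(int argc, char *argv[])
{
    /* Alternate between the two routines forever. */
    while (1) {
        compute_a();
        compute_b();
    }
}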

Page 18: Performance Profiling of Virtual Machines

Profiling Accuracy (2/4)

• Comparison with native profiling

[Figure: bar chart of cycle % (y-axis, 0 to 90) per routine name (x-axis: compute_a, compute_b) for Native, KVM guest-wide, KVM system-wide, and QEMU system-wide profiling.]

Page 19: Performance Profiling of Virtual Machines

Profiling Accuracy (3/4)

• A memory-intensive benchmark
• Randomly access a fixed-size region of memory
• Monitor last level cache misses

#include <stddef.h>   /* NULL */

/* NUM_PAD and randomly_connected_items are defined elsewhere in the benchmark. */
struct item {
    struct item *next;
    long pad[NUM_PAD];
};

void chase_pointer()
{
    struct item *p = NULL;

    /* Follow the chain of randomly linked items until it ends. */
    p = &randomly_connected_items;
    while (p != NULL)
        p = p->next;
}
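
The slide does not show how the items are connected. One way to build such a chain, sketched here under stated assumptions, is to shuffle the item indices and link the items in the shuffled order. `build_random_chain` and the value of `NUM_PAD` are hypothetical, and the benchmark's actual setup code may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_PAD 7   /* assumed value; gives 64-byte items on typical LP64 platforms */

struct item {
    struct item *next;
    long pad[NUM_PAD];
};

/* Link the n items of `items` into one NULL-terminated chain that visits
 * every item exactly once in pseudo-random order. Returns the first item
 * of the chain, or NULL if the temporary index array cannot be allocated. */
struct item *build_random_chain(struct item *items, size_t n, unsigned seed)
{
    assert(items != NULL && n > 0);

    size_t *order = malloc(n * sizeof *order);
    if (order == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        order[i] = i;

    /* Fisher-Yates shuffle of the visiting order. */
    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }

    /* Link the items in shuffled order and terminate the chain. */
    for (size_t i = 0; i + 1 < n; i++)
        items[order[i]].next = &items[order[i + 1]];
    items[order[n - 1]].next = NULL;

    struct item *head = &items[order[0]];
    free(order);
    return head;
}
```

Choosing `n` sets the working set size (`n * sizeof(struct item)` bytes), and the random order is meant to defeat hardware prefetching so that each step of the pointer chase is likely to touch a different cache line.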

Page 20: Performance Profiling of Virtual Machines

Profiling Accuracy (4/4)

• Comparison with native profiling

[Figure: cache misses per memory access (y-axis, 0 to 1.6) versus working set size in KB (x-axis, 256 to 3072 in steps of 256) for Native, KVM guest-wide, KVM system-wide, and QEMU system-wide profiling.]

Page 21: Performance Profiling of Virtual Machines

Evaluation question #3

What is the difference between CPU switch and domain switch?

Page 22: Performance Profiling of Virtual Machines

Recap

• CPU switch
• Domain switch

[Figure: two execution timelines made of guest1, guest2, and VMM intervals (including I/O handled for guest1 and guest2), marking which intervals are accounted to guest 1 and to guest 2 under CPU switch and under domain switch.]

Page 23: Performance Profiling of Virtual Machines

Profiling Packet Receive (1/2)

• Experiment
  – push packets to a Linux guest in KVM
  – run OProfile in the guest
  – monitor instruction retirements

[Figure: experiment setup with two stacks: Linux on hardware with a NIC, and a Linux guest with a virtual NIC running on KVM on hardware with a NIC.]

Page 24: Performance Profiling of Virtual Machines

Profiling Packet Receive (2/2)

CPU switch:

INSTR   Function
167     csum_partial
106     csum_partial_copy_generic
74      copy_to_user
47      ipt_do_table
38      tcp_v4_rcv
…       …

Domain switch:

INSTR   Function                    Group
2261    cp_interrupt                I/O related
1336    cp_rx_poll                  I/O related
1034    cp_start_xmit               I/O related
421     native_apic_mem_write       I/O related
374     native_apic_mem_read        I/O related
191     csum_partial                Packet processing
105     csum_partial_copy_generic   Packet processing
94      copy_to_user                Packet processing
79      ipt_do_table                Packet processing
51      tcp_v4_rcv                  Packet processing

Domain switch gives more insight into I/O operations.

Page 25: Performance Profiling of Virtual Machines

Related Work

• XenOprof
  – first profiler targeting virtual machines
  – system-wide profiling for Xen
• Linux perf
  – a profiling infrastructure for Linux
  – limited support for profiling KVM Linux guests
• VMware vmkperf
  – only reads and writes CPU performance counters

Page 26: Performance Profiling of Virtual Machines

Conclusions

                        Para-virtualization   Hardware assistance   Binary translation
Guest-wide profiling    √                     √                     √
System-wide profiling   XenOprof              √                     √