© 2009 VMware Inc. All rights reserved vSphere Performance Best Practices Rob Moran Premier Services Engineer – VMware Global Support Services – Cork, Ireland

Dec 16, 2015


Dylan Tucker

© 2009 VMware Inc. All rights reserved

vSphere Performance Best Practices

Rob Moran

Premier Services Engineer – VMware Global Support Services – Cork, Ireland


Global Support Services and Customer Advocacy

Support offices: Bangalore, India; Tokyo, Japan; Cork, Ireland; Burlington, Canada; Palo Alto, CA; Broomfield, CO

• 6 Support Centers

• 1000+ Support Engineers

• Global coverage: 24x7, 365 days/year

• Local language support: Spanish, Portuguese, French, German, Japanese, Chinese

• Follow-the-sun support for Severity 1 issues

• Support relationships with 100% of the Fortune 100; 99% of the Fortune 500


Customer Support Day Events

Coming to a location near you: sharing of VMware best practices!

• Support Days are a collaboration between VMware Support, Sales and customers – you learn directly from the experts

• Topics are driven by customer input, and typically include:

• Best practices

• Tips/tricks

• Top issues

• Product roadmaps/demos

• Certification offerings

http://www.vmware.com/go/supportdays


Overview

What a performance problem sounds like:

• “My VM is running slow and I don’t know what to do!”

• “I tried adding more memory and CPUs but the problem got worse!”

• “My VM is slow on one host but fast on another!”

What to look for? Where to start?

We will explore some of the most common performance-related issues that our support centers receive cases for


A word about performance…

Troubleshooting methodology must define:

• How to find root cause

• How to fix the problem

Must answer these questions:

1. How do we know when we are done?

2. Where do we start looking for problems?

3. How do we know what to look for to identify a problem?

4. How do we find the root-cause of a problem we have identified?

5. What do we change to fix the root-cause?

6. Where do we look next if no problem is found?


Agenda

Benchmarking & Tools

Best Practices and Troubleshooting

The 4 “food groups”

• Memory

• CPU

• Storage

• Network


BENCHMARKING & TOOLS


Benchmarking

Consistent and reproducible results

Important to have base level of acceptable performance

• Expectation vs. Acceptable

Determine baseline of performance prior to deployment

• Benchmark on a physical system if applicable

Avoid subjective metrics, stay quantitative

• “The system seems slower”

• “This worked better last year”
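One way to stay quantitative is to record repeated measurements before deployment and compare later runs against them. The following is a minimal sketch; the function name and the sample response times are hypothetical and only illustrate the idea.

```python
from statistics import mean, stdev

def compare_to_baseline(baseline_runs, current_runs):
    """Summarise how far the current mean has moved from the baseline mean."""
    base = mean(baseline_runs)
    cur = mean(current_runs)
    return {
        "baseline_mean": base,
        "current_mean": cur,
        "baseline_stdev": stdev(baseline_runs),  # run-to-run noise of the baseline
        "percent_change": (cur - base) / base * 100.0,
    }

# Hypothetical response times in milliseconds from repeated benchmark runs.
baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
current = [120.0, 118.0, 122.0, 121.0, 119.0]
result = compare_to_baseline(baseline, current)
print(f"{result['percent_change']:.1f}% change vs. baseline")
```

A statement such as "mean response time rose by a measured percentage" can be acted on, whereas "the system seems slower" cannot.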


Benchmarking

Benchmarking should be done at the application layer

• Use application-specific benchmarking tools and load generators

• Check with the application vendor

Isolate variables, benchmark optimum situation before introducing load

Understand dependencies

• Human interaction

• Other “food groups”

• Compare apples-to-apples


Tools – vCenter Operations

Aggregates thousands of metrics into Workload, Capacity, Health scores

Self-learns “normal” conditions using patented analytics

Smart alerts of impending performance and capacity degradation

Identifies potential performance problems before they start


Tools – esxtop

Valuable tool built in to vSphere hosts

View or capture real-time data

• View or playback data later

• Import data into 3rd-party tools

vSphere Client performance graphs get their data from the kernel and VSI

• Presentation/unit may be different (e.g. %RDY)
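Captured esxtop data can be post-processed with a short script. The sketch below assumes the capture was saved as CSV with perfmon-style column names that end in the counter name (for example "% Ready"). The sample header and values are illustrative, not taken from a real host.

```python
import csv
import io

def counter_series(csv_text, counter="% Ready"):
    """Return {column name: [float samples]} for columns ending in `counter`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    series = {}
    for row in reader:
        for name, value in row.items():
            if name.endswith(counter):
                series.setdefault(name, []).append(float(value))
    return series

# Illustrative sample only; a real capture has many more columns and rows.
sample = (
    '"Time","\\\\host1\\Group Cpu(1:vm-a)\\% Ready","\\\\host1\\Group Cpu(1:vm-a)\\% Used"\n'
    '"10:00:00","2.5","40.0"\n'
    '"10:00:05","7.5","55.0"\n'
)

for column, samples in counter_series(sample).items():
    print(column, max(samples))
```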


MEMORY


Memory – Overhead

A VM’s RAM is not necessarily machine RAM

• vRAM + overhead = maximum machine RAM

Source: vSphere 5.1 Resource Management Guide

• Note: These are estimated values


Memory – Transparent Page Sharing


Memory – Host Memory Management

Occurs when memory is under contention

Ballooning

Compression

Swapping


Memory – Ballooning


Memory – Compression


Memory – Swapping



Memory – VM Resource Allocation


Memory – Resource Pool Allocation


Memory – Ballooning vs. Swapping

Ballooning is better than swapping

Guest can surrender unused/free pages

Guest chooses what to swap, can avoid swapping “hot” pages


Memory – Rightsizing

Generally it is better to OVER-commit than UNDER-commit

If the running VMs are consuming too much host/pool memory…

• Some VMs may not get physical memory

• Ballooning or host swapping

• Higher disk IO

• All VMs slow down


Memory – Rightsizing

If a VM has too little vRAM…

• Applications suffer from lack of RAM

• The guest OS swaps

• Increased disk traffic, thrashing

• SAN slow down as a result of increased disk traffic

If a VM has too much vRAM…

• Higher overhead memory

• Possible decreased failover capacity

• Longer vMotion time

• Larger VSWP file

• Wasted resources


Memory – Troubleshooting

Wrong resource allocation

• May not notice a limit, e.g. VM or template with a limit gets cloned

Custom share values

Ballooning or swapping at the host level

• Ballooning is a warning sign, not a problem

• Swapping is a performance issue if seen over an extended period

Swapping/paging at the guest level

• Under-provisioned guest memory

Missing balloon driver (Tools)


Memory – Best Practices

Avoid high active host memory over-commitment

• No host swapping occurs when total memory demand is less than the physical memory (Assuming no limits)

Right-size guest memory

• Avoid guest OS swapping

Ensure there is enough vRAM to cover demand peaks

Use a fully automated DRS cluster

• Use Resource Pools with High/Normal/Low shares

• Avoid using custom shares
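The rule that host swapping is avoided while total memory demand stays below physical memory can be checked with simple arithmetic. This is a sketch under the same assumption as the slide (no limits configured); the function names and the sample figures are hypothetical.

```python
def memory_headroom_mb(physical_mb, vm_demands_mb):
    """Physical memory minus the summed demand of all VMs on the host."""
    return physical_mb - sum(vm_demands_mb)

def host_swapping_possible(physical_mb, vm_demands_mb):
    """True when total demand is not less than physical memory (limits ignored)."""
    return memory_headroom_mb(physical_mb, vm_demands_mb) <= 0

# Hypothetical host with 64 GB of RAM and four VMs with the given active demand.
demands = [16384, 8192, 8192, 4096]
print(memory_headroom_mb(65536, demands), host_swapping_possible(65536, demands))
```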


CPU


CPU – Overview

Raw processing power of a given host or VM

• Hosts provide CPU resources

• VMs and Resource Pools consume CPU resources

CPU cores/threads need to be shared between VMs

Fair scheduling vCPU time

• Hardware interrupts for a VM

• Parallel processing for SMP VMs

• I/O


CPU – esxtop

Interpret the esxtop columns correctly

%RDY – The percentage of time a VM is ready to run but no physical processor is available to run it, which may result in decreased performance

%USED – Physical CPU usage

%SYS – Percentage of time in the VMkernel

%RUN – Percentage of total scheduled time to run

%WAIT – Percentage of time in blocked or busy wait states

%IDLE – Percentage of time spent idle; %WAIT minus %IDLE can be used to estimate I/O wait time
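The I/O wait estimate is a plain subtraction of the two counters. A minimal sketch, with a hypothetical function name and sample values:

```python
def estimated_io_wait_pct(wait_pct, idle_pct):
    """Estimate I/O wait as %WAIT minus %IDLE, clamped at zero."""
    return max(0.0, wait_pct - idle_pct)

# Hypothetical esxtop readings for one VM.
print(estimated_io_wait_pct(wait_pct=85.0, idle_pct=80.0))
```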


CPU – Performance Overhead & Utilization

Different workloads have different overhead costs (%SYS) even for the same utilization (%USED)

CPU virtualization adds varying amounts of system overhead

• Direct execution vs. privileged execution

• Paravirtual adapters vs. emulated adapters

• Virtual hardware (Interrupts!)

• Network and storage I/O


CPU – vSMP

Relaxed Co-Scheduling: vCPUs can run out-of-sync

Idle vCPUs incur a scheduling penalty

• Configure only as many vCPUs as needed

• Unused vCPUs impose unnecessary scheduling constraints

Use Uniprocessor VMs for single-threaded applications


CPU – Scheduling

Overcommitting physical CPUs

VMkernel CPU Scheduler



CPU – Ready Time

The percentage of time that a vCPU is ready to execute, but waiting for physical CPU time

Does not necessarily indicate a problem

• Indicates possible CPU contention or limits
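The vSphere Client reports ready time as a summation in milliseconds per sample interval, while esxtop shows a percentage. The sketch below converts between the two, assuming the 20-second interval of real-time charts. The function name is hypothetical. For a multi-vCPU VM the summed value may need dividing by the vCPU count to get a per-vCPU figure.

```python
def ready_percent(ready_ms, interval_s=20):
    """Convert a ready-time summation (ms) into a percentage of the interval."""
    return ready_ms / (interval_s * 1000.0) * 100.0

# Hypothetical sample: 1000 ms of ready time in one 20-second real-time interval.
print(ready_percent(1000))
```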


CPU – NUMA nodes

Non-Uniform Memory Access system architecture

Each node consists of CPU cores and memory

A CPU core in one NUMA node can access memory in another node, but at a small performance cost

NUMA node 1 NUMA node 2


CPU – Troubleshooting

vCPU-to-pCPU over-allocation

• HyperThreading does not double CPU capacity!

Limits or too many reservations

• Can create artificial limits

Expecting the same consolidation ratios with different workloads

• Virtualizing “easy” systems first, then expanding to heavier systems

• Compare Apples to Apples

• Frequency, turbo, cache sizes, cache sharing, core count, instruction set…
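To gauge over-allocation, the vCPU count can be compared with physical cores rather than logical (HyperThreading) processors. This is a hedged sketch; the function name and the inventory are made up, and no particular ratio is implied to be safe.

```python
def vcpu_to_core_ratio(vcpus_per_vm, physical_cores):
    """Total configured vCPUs divided by physical cores (not logical threads)."""
    return sum(vcpus_per_vm) / physical_cores

# Hypothetical host: 8 physical cores (16 logical with HyperThreading), four VMs.
print(vcpu_to_core_ratio([2, 2, 4, 8], physical_cores=8))
```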


CPU – Best Practices

Right-size vSMP VMs

Keep heavy-hitters separated

• Fully automated DRS should do this for you

• Use anti-affinity rules if necessary

Use a fully automated DRS cluster

• Test that vMotion works

• Use Resource Pools with High/Normal/Low shares

• Avoid using custom shares


STORAGE


Storage – esxtop Counters

Different esxtop storage views

• Adapter (d)

• VM (v)

• Disk Device (u)

Key Fields:

• DAVG + KAVG = GAVG

• QUED/USD – Command Queue Depth

• CMDS/s – Commands Per Second

• MBREAD/s – Megabytes read per second

• MBWRTN/s – Megabytes written per second


Storage – Troubleshooting with esxtop

High DAVG: issue beyond the adapter

• Bad or overloaded zoning, over-utilized storage processors, too few platters in the RAID set, etc.

High KAVG: issue in the kernel storage stack

• Driver issue

• Full queue

Aborts: GAVG exceeding 5000 ms

• Command will be repeated, storage delay for the VM
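The triage above can be sketched as a small classifier over the latency counters. Only the 5000 ms abort threshold comes from the slide. The default "high" thresholds for DAVG and KAVG are placeholder assumptions to be replaced with values suited to the environment, and the function name is hypothetical.

```python
ABORT_THRESHOLD_MS = 5000.0  # from the slide: aborts when GAVG exceeds 5000 ms

def diagnose_latency(davg_ms, kavg_ms, davg_high_ms=25.0, kavg_high_ms=2.0):
    """Return (GAVG, findings); the *_high_ms defaults are assumed placeholders."""
    gavg_ms = davg_ms + kavg_ms  # DAVG + KAVG = GAVG
    findings = []
    if davg_ms > davg_high_ms:
        findings.append("high DAVG: look beyond the adapter (fabric, array)")
    if kavg_ms > kavg_high_ms:
        findings.append("high KAVG: look at the kernel storage stack (driver, queue)")
    if gavg_ms > ABORT_THRESHOLD_MS:
        findings.append("GAVG above abort threshold: commands are aborted and repeated")
    return gavg_ms, findings

# Hypothetical readings in milliseconds.
print(diagnose_latency(davg_ms=30.0, kavg_ms=0.5))
```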


Storage – Benchmarking with iometer


Storage – Storage I/O Control

Allows the use of Shares per VMDK

Throttling occurs when datastore reaches latency threshold

• Higher share VMDKs perform IO first

vCenter monitors latency across all hosts

• Not effective if datastore shared with other vCenters
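The effect of shares under contention can be pictured as proportional division of a scarce resource. The sketch below is a simplified model and not Storage I/O Control's actual algorithm. The function name, the VMDK names, and the idea of dividing a fixed number of queue slots are assumptions for illustration.

```python
def divide_by_shares(shares_by_vmdk, total_slots):
    """Split `total_slots` among VMDKs in proportion to their share values."""
    total_shares = sum(shares_by_vmdk.values())
    return {
        name: total_slots * shares / total_shares
        for name, shares in shares_by_vmdk.items()
    }

# Hypothetical VMDKs: one with twice the shares of the other, 30 slots to divide.
print(divide_by_shares({"db.vmdk": 2000, "os.vmdk": 1000}, total_slots=30))
```

In this model the higher-share VMDK receives twice the capacity, but only while the datastore is past its latency threshold. Without contention no throttling applies.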


Storage – Storage DRS

Datastore clusters

• Maintenance mode

• Anti-affinity rules

vCenter monitors for latency and disk space

• Migrate VMDKs for better performance or utilization

Not effective with automated tiering SANs

• Check HCL to confirm these features are compatible


Storage – Troubleshooting

Snapshots

Excessive traffic down one HBA / Switch / SP can cause latency

• Consider using Round Robin in conjunction with ALUA

• Always be paranoid when it comes to monitoring storage I/O

Consider your I/O patterns

• Peak time for storage IO?

• Virus scans, database maintenance, user logins

Always consult with array vendor

• They know the best practices for their array!


Storage – Best Practices

Use different tiers of storage for different VM workloads

• Slower storage for OS VMDKs

• Faster storage for databases or other high-IO applications

Use the Paravirtual SCSI adapter

• Reduced overhead, higher throughput

Use path balancing where possible, either through 3rd-party plugins or through Round Robin with ALUA, if supported

Use Storage DRS with SIOC

• Balance for both free space and latency

• Simplified datastore management


NETWORK


Network – Load Balancing

Load balancing defines which uplink is used

• Route based on Port ID

• Route based on IP hash

• Route based on MAC hash

• Route based on NIC load (Load Based Teaming)

High-bandwidth VMs may end up on the same physical NIC

Traffic will stay on elected uplink until an event occurs

• NIC link state change, adding/removing NIC from a team, beacon probe timeout…
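A toy model helps show why static policies can place several busy VMs on one uplink. The functions below are simplified illustrations, not the VMkernel's real selection code: they assume an uplink is chosen by taking a value modulo the number of uplinks. All names and inputs are hypothetical.

```python
import ipaddress

def uplink_by_port_id(port_id, uplink_count):
    """Toy model: pick an uplink from the virtual port ID."""
    return port_id % uplink_count

def uplink_by_ip_hash(src_ip, dst_ip, uplink_count):
    """Toy model: pick an uplink from the XOR of source and destination IP."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % uplink_count

# Two hypothetical VMs on ports 4 and 6 land on the same uplink of a two-NIC team.
print(uplink_by_port_id(4, 2), uplink_by_port_id(6, 2))
print(uplink_by_ip_hash("10.0.0.1", "10.0.0.2", 2))
```

Because the choice depends only on static inputs, it does not change as load changes. Only the load-based policy takes NIC utilization into account.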


Network – Troubleshooting

Check counters for NICs and VMs

• Network load imbalance

• 10 Gbps NICs can incur a significant CPU load when running at 100%

Ensure hardware supports TSO

• Use latest drivers and firmware for your NIC on the host

For multi-tier VM applications, use DRS affinity rules to keep VMs on same host

• Same vSwitch / VLAN, rules out physical network

If using Jumbo Frames, ensure it is enabled end-to-end


Network – Best Practices

Use the vmxnet3 virtual adapter

• Less CPU overhead

• 10 Gbps connection to vSwitch

Use the latest driver/firmware for the NICs on the host

Use network shares

• Requires Virtual Distributed Switch 4.1

Isolate vMotion and iSCSI traffic from regular VM traffic

• Separate vSwitches with dedicated NIC(s)

• Most applicable with Gigabit NICs


How to measure the network?

scp from/to an ESXi host is not a valid check!

scp involves the underlying storage on the source and destination VM/host

CPU can affect the test because scp encrypts and decrypts the network flow

Copying to an ESXi host can give a false result because the management interface has very limited resources


How to check network performance?

VM – VM on the same ESXi host: this excludes physical network problems

VM – VM on different ESXi hosts: this involves the physical NICs and switch as well

Physical – VM: this also tests physical devices, but lets us focus on one VM

Physical – Physical: this gives a baseline number for what to expect

Use iperf, jperf, or netperf: free tools for network testing


Iperf

Windows and Linux versions

Does not use storage

Supports different test options (UDP/TCP)

Automatically calculates bandwidth
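The principle behind such a tool can be sketched in a few lines: send in-memory data over a TCP socket and divide the bits sent by the elapsed time, so that storage never enters the measurement. This is not iperf itself, only a minimal loopback illustration. The function names are hypothetical, and a real test would run the sender and receiver on separate machines.

```python
import socket
import threading
import time

def _sink(server_sock):
    """Accept one connection and discard everything received."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(65536):
            pass

def measure_tcp_throughput_mbps(total_bytes=8 * 1024 * 1024, host="127.0.0.1"):
    """Send in-memory data over TCP and return the rate in megabits per second."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    receiver = threading.Thread(target=_sink, args=(server,))
    receiver.start()

    payload = b"\0" * 65536  # generated in memory, no disk involved
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            client.sendall(payload)
            sent += len(payload)
    receiver.join()  # wait until the receiver has drained the stream
    elapsed = time.perf_counter() - start
    server.close()
    return sent * 8 / elapsed / 1e6

if __name__ == "__main__":
    print(f"{measure_tcp_throughput_mbps():.0f} Mbit/s over loopback")
```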


In conclusion…


Key Takeaways – Performance Best Practices

Understand your environment

• Hardware, storage, networking

• VMs & applications

Advanced configuration values do not need to be tweaked or modified

• In almost all situations

Use fully automated DRS

Use Paravirtual hardware


Important Links
