Improving Disk Latency and Throughput with VMware Presented by Raxco Software, Inc. March 11, 2011.

Jan 12, 2016

Garey Baldwin
Transcript
Improving Disk Latency and Throughput with VMware

Presented by Raxco Software, Inc.

March 11, 2011

Today’s Agenda

• Provide technical information on how NTFS impacts VMware I/O performance
• Examine ESX I/O test results
• Economic impact of Windows guests
• Solutions

Virtualization Benefits

• Server consolidation
• Less physical space for data centers
• Lower energy costs
• Easier management
• Eco-friendly alternative

Identifying and Correcting Problems

• Latency is your best indicator of a performance problem
– Device latency is vSphere’s report of the physical storage response time
– Kernel latency is vSphere’s report of ESX’s ability to manage IO
• Experts disagree on specifics, but most agree that:
– Device latency in excess of 15ms is worth inspection
– Device latency in excess of 30ms is likely a problem
– Kernel latency in excess of 2ms means ESX queues are overflowing
• High device latency can result in ESX queuing
– So, correct slow hardware first!
– Then, consider reducing VMDKs on a VMFS volume
– Only then consider changing queue depths
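
The rule-of-thumb thresholds above can be written as a small check. This is an illustrative sketch only: the function name and the returned labels are invented for this example, and the inputs are assumed to be latency values in milliseconds taken from vSphere.

```python
def assess_latency(device_ms: float, kernel_ms: float) -> list[str]:
    """Apply the slide's rule-of-thumb thresholds (all values in ms)."""
    findings = []
    if device_ms > 30:
        findings.append("device latency likely a problem")
    elif device_ms > 15:
        findings.append("device latency worth inspection")
    if kernel_ms > 2:
        findings.append("ESX queues overflowing")
    return findings


# Hypothetical readings: 35 ms device latency, 3 ms kernel latency
print(assess_latency(35, 3))
```

Because high device latency can itself cause queuing, a result that flags both conditions points at the storage hardware first, in line with the order of fixes on the slide.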

© Copyright 2010 EMC Corporation. All rights reserved.

Storage Contention Solution: Storage IO Control

• SIOC calculates data store latency to identify storage contention
– Latency is a normalized average across virtual machines
– IO size and IOPS included
• SIOC enforces fairness when data store latency crosses threshold
– Default of 30ms
– Fairness enforced by limiting VMs’ access to queue slots
• Net effect: trade throughput for latency

© Copyright 2010 EMC Corporation. All rights reserved.

With Storage IO Control: actual disk resources utilized by each VM are in the correct ratio, even across ESX hosts
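
The idea of enforcing fairness by limiting queue slots can be pictured with a toy model. The sketch below is not VMware's algorithm: the function name, the per-VM share values, and the proportional split are all assumptions made for illustration. Only the 30ms default threshold comes from the slide.

```python
def allocate_queue_slots(shares, total_slots, datastore_latency_ms, threshold_ms=30.0):
    """Toy model: split queue slots in proportion to shares once latency crosses the threshold."""
    if datastore_latency_ms <= threshold_ms:
        # No contention detected: every VM may use the whole queue.
        return {vm: total_slots for vm in shares}
    total_shares = sum(shares.values())
    # Contention: each VM gets a share-proportional slice, never less than one slot.
    return {vm: max(1, total_slots * s // total_shares) for vm, s in shares.items()}


# Hypothetical VMs with a 1:2 share ratio on a 30-slot queue at 45 ms latency
print(allocate_queue_slots({"vm_a": 1000, "vm_b": 2000}, 30, 45.0))
```

Capping the slots each VM can occupy is what trades throughput for latency: fewer requests are in flight, so each one waits less.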

NTFS I/O Storms

NTFS Behavior

• NTFS fragments files and free space
• Increases logical I/O to storage controller
• More logical I/O = More physical I/O
• Multiple instances of Windows on host can lead to I/O contention

What is Fragmentation?

Logical v Physical

• Logical Level
– NTFS needs disk and cluster size, enumerates LCNs
– Creates $MFT and $Bitmap metadata
– $Bitmap is how NTFS “sees” the disk
– Has no idea about physical/virtual disk types

Anatomy of an MFT Record

(vcn, lcn, run length): (8a85, 9189a, 7)
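
A run like the one shown maps a stretch of a file's virtual clusters (VCNs) onto logical clusters (LCNs) on the volume. The sketch below shows that lookup under two assumptions: the values on the slide are hexadecimal, and a file's runs can be held as a simple list. The `Run` type and `vcn_to_lcn` helper are hypothetical names, not part of any NTFS API.

```python
from typing import NamedTuple, Optional


class Run(NamedTuple):
    vcn: int     # first virtual cluster number covered by this run
    lcn: int     # logical cluster number where the run starts on the volume
    length: int  # number of contiguous clusters in the run


def vcn_to_lcn(runs: list[Run], vcn: int) -> Optional[int]:
    """Return the LCN that holds the given VCN, or None if no run covers it."""
    for run in runs:
        if run.vcn <= vcn < run.vcn + run.length:
            return run.lcn + (vcn - run.vcn)
    return None


# The run from the slide, read as hexadecimal (an assumption)
slide_run = Run(vcn=0x8A85, lcn=0x9189A, length=7)
print(hex(vcn_to_lcn([slide_run], 0x8A88)))
```

A file stored in one piece needs a single run; every additional fragment adds another run to the record.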

File Allocation

• Create $MFT record (one or more)
• $Bitmap accessed to locate free space
• $MFT record is updated with content

Create → Bitmap Access → MFT Update
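
To see how fragmented free space leads to fragmented files, the bitmap lookup can be modeled with a list of booleans, one per cluster. This is a deliberately simplified first-fit sketch and not NTFS's actual allocation strategy; `free_runs` and `allocate` are hypothetical helper names.

```python
def free_runs(bitmap):
    """Return (start_lcn, length) for each run of free clusters (False means free)."""
    runs, start = [], None
    for lcn, in_use in enumerate(bitmap):
        if not in_use and start is None:
            start = lcn
        elif in_use and start is not None:
            runs.append((start, lcn - start))
            start = None
    if start is not None:
        runs.append((start, len(bitmap) - start))
    return runs


def allocate(bitmap, clusters_needed):
    """First-fit allocation: fill free runs in order and mark them used in the bitmap."""
    runs = free_runs(bitmap)
    if sum(length for _, length in runs) < clusters_needed:
        raise ValueError("not enough free clusters")
    extents = []
    for start, length in runs:
        if clusters_needed == 0:
            break
        take = min(length, clusters_needed)
        for lcn in range(start, start + take):
            bitmap[lcn] = True
        extents.append((start, take))
        clusters_needed -= take
    return extents


# Free space split into two holes: a 4-cluster file ends up in two fragments
volume = [True, False, False, True, False, False, False, True]
print(allocate(volume, 4))
```

The extents returned here are what the updated $MFT record would have to describe: one run per hole that was used.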

File Access

• Load portion of MFT with correct record via directory
• Locate file in the MFT
• Pass starting LCNs and run lengths to disk controller
• Number of logical fragments influences number of physical seeks

Load → Locate File → # LCNs → # Physical Seeks
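
The link between fragments and requests can be made concrete by counting how many separate pieces a file's extent list describes. This is a sketch with a hypothetical helper name; extents are assumed to be (starting LCN, run length) pairs in file order.

```python
def count_fragments(extents):
    """Count extents that do not start where the previous extent ended."""
    fragments = 0
    prev_end = None
    for lcn, length in extents:
        if prev_end is None or lcn != prev_end:
            fragments += 1
        prev_end = lcn + length
    return fragments


# Hypothetical file: the first two extents are adjacent, the third lies elsewhere
print(count_fragments([(100, 8), (108, 4), (500, 2)]))
```

Each counted fragment needs its own starting LCN and run length passed down to the controller, which is why a heavily fragmented file generates more logical I/O than a contiguous one.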

Logical v. Physical

• Physical Level
– Disk controller maps LCNs to PCNs
– Writes data to disk

Wasted Seeks

Partition State     I/O Requests Sent to File System   Resulting Disk Accesses/Seeks   Net Wasted Seeks (SYSmark)   Percent Net Wasted Seeks (SYSmark)
Fragmented          1,320,686                          2,090,649                       769,963                      58.30%
After PerfectDisk   1,434,454                          1,616,847                       182,393                      12.72%
After Built-In      1,411,613                          1,931,395                       519,782                      36.82%
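
The two derived columns follow directly from the first two: wasted seeks are the disk accesses beyond the number of I/O requests, expressed as a percentage of the requests. The short sketch below recomputes them from the table's figures; the function and variable names are made up for this example.

```python
def wasted_seeks(io_requests: int, disk_seeks: int) -> tuple[int, float]:
    """Return (net wasted seeks, wasted seeks as a percentage of I/O requests)."""
    wasted = disk_seeks - io_requests
    return wasted, round(100 * wasted / io_requests, 2)


results = {
    "Fragmented": wasted_seeks(1_320_686, 2_090_649),
    "After PerfectDisk": wasted_seeks(1_434_454, 1_616_847),
    "After Built-In": wasted_seeks(1_411_613, 1_931_395),
}

for state, (wasted, percent) in results.items():
    print(f"{state}: {wasted:,} wasted seeks ({percent:.2f}%)")
```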

How This Affects A Virtual Environment

• P2V Conversion
• Extra Hypervisor Overhead
• Disk Latency Degradation
• Overall Performance
• System Throughput
• Wasted Space
• Costly

P2V Conversion

Physical Drive: 24GB
No Optimization: 24GB
Optimization: 22GB (2GB Smaller)

ESX Cluster Testing

• Identical disks - 40% free space
• Optimized one set, the other “as is”
• Installed MS Office and MS SQL
• Captured metrics with VMware’s vscsiStats utility

Total I/O Count

                 Fragmented   PerfectDisk   % Improvement
Total IO Count   37191        29238         21.3
Read IO Count    3066         2799          8.7
Write IO Count   34125        26439         22.5
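
The improvement column is the relative drop in I/O count between the fragmented and optimized disks. A minimal sketch of that calculation, using the counts from the table (the helper and dictionary names are hypothetical):

```python
def percent_reduction(before: int, after: int) -> float:
    """Relative drop from `before` to `after`, as a percentage of `before`."""
    return 100 * (before - after) / before


io_counts = {
    "Total IO Count": (37191, 29238),
    "Read IO Count": (3066, 2799),
    "Write IO Count": (34125, 26439),
}

for name, (fragmented, optimized) in io_counts.items():
    print(f"{name}: {percent_reduction(fragmented, optimized):.1f}% fewer I/Os")
```

Most of the saving comes from writes, which make up the bulk of the I/O during the software installs used in the test.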

                  30ms    50ms   100ms   >100ms   Total
Fragmented I/O    12749   9877   8700    9116     40,442
PerfectDisk I/O   6707    4923   4081    5053     20,764

49% Reduction in Latency!
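
The 49% figure corresponds to the drop in the number of I/Os that landed in these high-latency buckets. The sketch below re-adds the bucket counts and computes that drop; the dictionary names are invented for this example.

```python
# I/O counts per latency bucket, taken from the table above
fragmented = {"30ms": 12749, "50ms": 9877, "100ms": 8700, ">100ms": 9116}
perfectdisk = {"30ms": 6707, "50ms": 4923, "100ms": 4081, ">100ms": 5053}

total_fragmented = sum(fragmented.values())
total_perfectdisk = sum(perfectdisk.values())

# Relative drop in the number of I/Os recorded in these buckets
reduction = 100 * (total_fragmented - total_perfectdisk) / total_fragmented
print(f"{total_fragmented} -> {total_perfectdisk}: {reduction:.1f}% fewer high-latency I/Os")
```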

Disk Latency

                         Fragmented Disk   PerfectDisk Disk
Total IO Equal to 524K   2512              848
Total IO > 524K          247               2959
Read IO Equal to 524K    33                7
Read IO > 524K           125               65
Write IO Equal to 524K   2480              841
Write IO > 524K          122               2894

12X More Large I/O
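
The "12X" claim is the ratio of I/Os in the largest size bucket on the optimized disk to those on the fragmented disk. A quick check using the table's counts (variable names are hypothetical):

```python
# I/Os larger than 524K, from the "Total IO > 524K" row
large_io_fragmented = 247
large_io_perfectdisk = 2959

ratio = large_io_perfectdisk / large_io_fragmented
print(f"{ratio:.1f}x more large I/Os on the optimized disk")

# The total is the sum of the read and write rows for the same bucket
reads_writes_fragmented = 125 + 122
reads_writes_perfectdisk = 65 + 2894
```

Larger I/Os move the same data in fewer requests, which is consistent with the lower total I/O count measured on the optimized disk.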

12 times more of the largest IO

Large I/O

Improved Sequential I/O

                     Fragmented   PerfectDisk   Improvement
Percent Sequential   17%          27%           58%
Total IO             127703       90526         25%
Sequential IO        22126        24340         33%

Improved Sequential I/O

Installation Time Comparison

                    Fragmented   PerfectDisk   % Improvement
MS Office Install   20 min       15 min        25
MS SQL Install      76 min       51 min        33

The Cost of Fragmentation

EXAMPLE:

• 20 files x 6 seconds = 2 minutes
• 300 users x 2 min = 10 hours/day
• 10 hrs x $25/hr = $250/day
• Annual cost = $62,500
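
The example can be reproduced step by step. The slide does not say how many days make up a year; the $62,500 total implies 250 working days, so that value is treated as an assumption in the sketch below. All names are chosen for this example.

```python
FILES_PER_USER_PER_DAY = 20
DELAY_SECONDS_PER_FILE = 6
USERS = 300
HOURLY_RATE_USD = 25
WORKING_DAYS_PER_YEAR = 250  # assumption: implied by the slide's annual total

minutes_per_user = FILES_PER_USER_PER_DAY * DELAY_SECONDS_PER_FILE / 60
hours_per_day = USERS * minutes_per_user / 60
daily_cost = hours_per_day * HOURLY_RATE_USD
annual_cost = daily_cost * WORKING_DAYS_PER_YEAR

print(f"{minutes_per_user:.0f} min/user, {hours_per_day:.0f} hours/day, "
      f"${daily_cost:,.0f}/day, ${annual_cost:,.0f}/year")
```

Changing any of the constants shows how sensitive the annual figure is to the per-file delay and the head count.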

Virtual Guest Fragmentation

• Windows guests have all the same NTFS behavior
• Fragmentation produces more IOPS
• Fragmentation reduces ESX throughput
• Fragmentation increases ESX disk latency
• Fragmentation creates resource contention between host & guests

Solutions

• Expensive
– More disks and faster disks
– Upgrade Fibre Channel
– Troubleshooting
• Inexpensive
– Optimize the Windows guest systems

PerfectDisk 12 vSphere

• Virtualization Awareness/host & client (NEW)
• OptiWrite Fragmentation Avoidance (NEW)
• “Zero-fill” free space (NEW)

PerfectDisk 12 vSphere

• “Short stroking” for thin provisioned disks (NEW)
• Schedule guest compaction (NEW)
• Snapshot & Linked Clone recognition (NEW)

PerfectDisk Benefits on ESX

• Saves $$$ in productivity and admin
• Reduces resource contention for VMs
• Reduces total IO workload
• Improves throughput
• Reduces disk latency
• Delivers optimal performance

Contact Raxco

• Free Evaluation Software
• Excellent Support to Get You Started
• White Papers
• Great ROI
• www.raxco.com
• Toll Free: 1.800.546.9728
