Midwest Regional VMUG
Next Generation Best Practices for Storage and VMware
Scott Lowe, VCDX #39
vSpecialist, EMC Corporation
Author, Mastering VMware vSphere 4
http://blog.scottlowe.org
http://twitter.com/scott_lowe
Copyright 2010 EMC Corporation. All rights reserved.
The “Great” Protocol Debate
• Every protocol can be highly available, and generally every protocol can meet a broad performance band
• Each protocol has different configuration considerations
• Each protocol has a VMware “super-power”, and also a “kryptonite”
• In vSphere, there is core feature equality across protocols
Q: What storage protocol(s) support your virtualization environment?
[Bar chart – percent of respondents: DAS (internal storage to the server) 23, iSCSI 68, FC 93, FCoE 11, NFS 51]
Source: Virtualgeek 2010 survey – August 2010, 125 respondents
What’s “out of the box” in vSphere 4.1?

[root@esxi ~]# vmware -v
VMware ESX 4.1.0 build-260247

[root@esxi ~]# esxcli nmp satp list
Name                 Default PSP       Description
VMW_SATP_SYMM        VMW_PSP_FIXED     Placeholder (plugin not loaded)
VMW_SATP_SVC         VMW_PSP_FIXED     Placeholder (plugin not loaded)
VMW_SATP_MSA         VMW_PSP_MRU       Placeholder (plugin not loaded)
VMW_SATP_LSI         VMW_PSP_MRU       Placeholder (plugin not loaded)
VMW_SATP_INV         VMW_PSP_FIXED     Placeholder (plugin not loaded)
VMW_SATP_EVA         VMW_PSP_FIXED     Placeholder (plugin not loaded)
VMW_SATP_EQL         VMW_PSP_FIXED     Placeholder (plugin not loaded)
VMW_SATP_DEFAULT_AP  VMW_PSP_MRU       Placeholder (plugin not loaded)
VMW_SATP_ALUA_CX     VMW_PSP_FIXED_AP  Placeholder (plugin not loaded)
VMW_SATP_CX          VMW_PSP_MRU       Supports EMC CX that do not use the ALUA protocol
VMW_SATP_ALUA        VMW_PSP_RR        Supports non-specific arrays that use the ALUA protocol
VMW_SATP_DEFAULT_AA  VMW_PSP_FIXED     Supports non-specific active/active arrays
VMW_SATP_LOCAL       VMW_PSP_FIXED     Supports direct attached devices
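The listing above only shows the default SATP-to-PSP pairings. If a different policy is needed, the default for an SATP or the policy for a single device can be changed from the console. A minimal sketch, assuming vSphere 4.x esxcli syntax and a placeholder device ID – verify the options against your build and your array vendor's guidance first:

# List devices with the SATP that claimed them and the PSP in use
[root@esxi ~]# esxcli nmp device list
# Change the default PSP for an SATP (here: Round Robin for ALUA CLARiiON claims)
[root@esxi ~]# esxcli nmp satp setdefaultpsp --satp VMW_SATP_ALUA_CX --psp VMW_PSP_RR
# Or change the policy for a single device (the naa ID is a placeholder)
[root@esxi ~]# esxcli nmp device setpolicy --device naa.6006016012345678 --psp VMW_PSP_RR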
MPIO Exceptions – Windows Clusters
Among a long list of “not supported” things:
• NO clustering on NFS datastores
• NO clustering on iSCSI or FCoE (unless using PP/VE)
• NO round-robin with native multipathing (unless using PP/VE) – path policy check below
• NO mixed environments, such as configurations where one cluster node is running a different version of ESX/ESXi than another cluster node
• NO use of MSCS in conjunction with VMware Fault Tolerance
• NO migration with vMotion of clustered virtual machines
• NO N-Port ID Virtualization (NPIV)
• You must use hardware version 7 with ESX/ESXi 4.1
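Because Round Robin is not supported with native multipathing for Windows clusters, it is worth confirming the path policy on the RDM LUNs backing the cluster nodes. A quick check (vSphere 4.x esxcli syntax; the device ID is a placeholder):

# Confirm the cluster RDM LUN is not using Round Robin
[root@esxi ~]# esxcli nmp device list --device naa.6006016012345678
# If it is, set it back to Fixed (or MRU on active/passive arrays)
[root@esxi ~]# esxcli nmp device setpolicy --device naa.6006016012345678 --psp VMW_PSP_FIXED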
General NFS Best Practices – Traditional Ethernet Switches
• Mostly seen with older 1GbE switching platforms
• Each switch operates independently
• More complex network design
• Depends on routing; requires two (or more) IP subnets for datastore traffic (sketch below)
• Multiple Ethernet options based on EtherChannel capabilities and preferences
• Some links may be passive standby links
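To make the two-subnet design concrete, a rough sketch using the classic ESX service-console commands; IP addresses, port group names and export paths are all hypothetical, and the port groups are assumed to already exist:

# One VMkernel port per datastore subnet
[root@esxi ~]# esxcfg-vmknic -a -i 10.10.1.11 -n 255.255.255.0 NFS-SubnetA
[root@esxi ~]# esxcfg-vmknic -a -i 10.10.2.11 -n 255.255.255.0 NFS-SubnetB
# Mount one datastore per subnet so traffic is spread across both paths
[root@esxi ~]# esxcfg-nas -a -o 10.10.1.50 -s /vol/nfs_ds01 nfs_ds01
[root@esxi ~]# esxcfg-nas -a -o 10.10.2.50 -s /vol/nfs_ds02 nfs_ds02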
General NFS Best Practices – Multi-Switch Link Aggregation
• Allows two physical switches to operate as a single logical fabric
• Much simpler network design
• Single IP subnet
• Provides multiple active connections to each storage controller
• Easily scales to more connections by adding NICs and aliases (sketch below)
• Storage controller connection load balancing is automatically managed by the EtherChannel IP load-balancing policy
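A sketch of the host side of that design, with hypothetical vSwitch and NIC names. The “Route based on IP hash” teaming policy itself is set in the vSphere Client and must be matched by a static EtherChannel / multi-chassis LAG on the switch pair:

# Add extra uplinks to the NFS vSwitch to scale bandwidth
[root@esxi ~]# esxcfg-vswitch -L vmnic2 vSwitch1
[root@esxi ~]# esxcfg-vswitch -L vmnic3 vSwitch1
# Single VMkernel port on the one NFS subnet
[root@esxi ~]# esxcfg-vmknic -a -i 10.10.1.11 -n 255.255.255.0 NFS
# Verify the uplinks and port groups
[root@esxi ~]# esxcfg-vswitch -l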
“KISS on Layout”
• Use VMFS and NFS together – no reason not to
• Strongly consider 10GbE, particularly for new deployments
• Avoid RDMs, use “Pools” (VMFS or NFS)
• Make the datastores big
  – VMFS – make them ~1.9TB in size (2TB – 512 bytes is the max for a single extent; 64TB max for a spanned VMFS filesystem) – vmkfstools sketch below
  – NFS – make them what you want (16TB is the max)
• With vSphere 4.0 and later, you can have many VMs per VMFS datastore – and VAAI makes this a non-issue
• On the array, default to Storage Pools, not traditional RAID Groups / Hypers
• Default to single-extent VMFS datastores
• Default to Thin Provisioning models at the array level, optionally at the VMware level
  – Make sure you enable vCenter managed datastore alerts
  – Make sure you enable Unisphere/SMC thin provisioning alerts and auto-expansion
• Use “broad” data services – e.g. FAST, FAST Cache (things that are “set and forget”)
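As a concrete example of the “make them big, single extent” guidance, a minimal sketch of creating a VMFS-3 datastore on a ~2TB LUN (the device ID is a placeholder); the 8MB block size is what allows VMDKs up to the 2TB – 512 byte maximum:

# Create a single-extent VMFS-3 datastore with an 8MB block size
[root@esxi ~]# vmkfstools -C vmfs3 -b 8m -S Datastore01 /vmfs/devices/disks/naa.6006016012345678:1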
“Use SIOC if you can”
• This is a huge vSphere 4.1 feature
• “If you can” equals:
  – vSphere 4.1, Enterprise Plus
  – VMFS (NFS targeted for future vSphere releases – not purely a qual)
• Enable it (not on by default), even if you don’t use shares – it will ensure no VM swamps the others
• Bonus: you get guest-level latency alerting!
• Default threshold is 30ms (latency check sketch below)
  – Leave it at 30ms for 10K/15K drives, increase to 50ms for 7.2K, decrease to 10ms for SSD
  – Fully supported with array auto-tiering – leave it at 30ms for FAST pools
• Hard IO limits are handy for View use cases
• Some good recommended reading:
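One way to sanity-check the congestion threshold against observed latency, using standard esxtop counters rather than anything specific to this deck:

[root@esxi ~]# esxtop
# press 'u' for the disk-device view and watch GAVG/cmd per device;
# sustained values above the SIOC threshold (30ms by default) indicate a saturated datastore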
• How do I know?
  – Managed Datastore Reports in vCenter 4.x
  – Array tools – e.g. Unisphere (vCenter Integration) Report
• What do I do?
  – Migrate the VM to a datastore backed by virtually provisioned storage. For a VMFS datastore, ESX thin provisioning/compress/dedupe can also be utilized (sketch below)
  – For a VM on NFS, Data Deduplication can be used via the plug-in to compress the VM when some performance impact is acceptable
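At the VMware level, one way to move a VM onto thin-provisioned storage (besides a Storage vMotion with a thin-format target) is a thin clone of its disk. A minimal sketch with placeholder paths, run with the VM powered off and the VM repointed at the new VMDK afterwards:

# Clone the virtual disk in thin format
[root@esxi ~]# vmkfstools -i /vmfs/volumes/ds01/vm1/vm1.vmdk -d thin /vmfs/volumes/ds02/vm1/vm1.vmdk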
5 Exceptions to the Rules
1. Create “planned datastore designs” (rather than big pools corrected after the fact) for larger IO use cases (View, SAP, Oracle, Exchange)
   – Use the VMware + array vendor reference architectures
   – Generally the cases where > 32 HBA queue depth, and consider > 1 vSCSI adapters
   – Over time, SIOC may prove to be a good approach
   – Some relatively rare cases where large spanned VMFS datastores make sense
2. When NOT to use “datastore pools”, but pRDMs (narrow use cases! – RDM sketch below)
   – MSCS/WSFC
   – Oracle – pRDMs and NFS can do rapid V-to-P with array snapshots
3. When NOT to use NMP Round Robin
   – Arrays that are not active/active AND use ALUA using only SCSI-2
4. When NOT to use array thin-provisioned devices
   – Datastores with an extremely high amount of small-block random IO
   – In FLARE 30, always use storage pools; LUN migrate to thick devices if needed
5. When NOT to use the vCenter plug-ins? Trick question – always “yes”
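For exception #2, a physical compatibility (pass-through) RDM pointer can also be created from the console if needed; normally this is done through the vSphere Client when adding a disk to the VM. The device ID and paths below are placeholders:

# Create a physical compatibility RDM mapping file for the LUN
[root@esxi ~]# vmkfstools -z /vmfs/devices/disks/naa.6006016012345678 /vmfs/volumes/ds01/vm1/vm1-rdm.vmdk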
5 Amazing Things We’re Working On…
1. Storage Policy
   – How should storage inform vSphere of capabilities and state (and vice versa)?
   – SIOC and auto-tiering complement each other today – how can we integrate them?
   – How can we embed VM-level encryption?
2. “Bolt-on” vs. “Built for Purpose” using virtual appliance constructs
   – EMC has 3 shipping virtual storage appliances (Atmos/VE, Avamar/VE, Networker/VE)
   – Every EMC array is really a cluster of commodity servers with disks
   – What more could we do to make “bolt-on value” easier this way? “Follow the breadcrumb trail”: