Module 3 - vStorage
Cormac Hogan
Product Support Engineering
Rev P
Last updated 23rd March 2009
VMware Confidential
VI4 - Mod 3 - Slide 2
Agenda
Module 0 - Product Overview
Module 1 - VI Installation-Upgrade
Module 2 - vCenter
Module 3 - vStorage
Module 4 - Networking
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 - Paravirtualized SCSI Driver
Lesson 11 - Service Console Storage
Introduction
Before we begin, I want to bring to your attention some new device naming conventions in ESX 4.
Although the vmhbaN:C:T:L:P naming convention is still visible, it is now known as the run-time name and is no longer guaranteed to be persistent through reboots.
ESX 4 now uses unique LUN identifiers, typically the NAA (Network Addressing Authority) id. This is true for the CLI as well as the GUI, and is also the naming convention used during the install.
The IQN (iSCSI Qualified Name) is still used for iSCSI targets.
The WWN (World Wide Name) is still used for Fibre Channel targets.
For those devices which do not have a unique id, you will observe an MPX reference (which basically stands for VMware Multipath X device).
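As a rough illustration of these conventions, here is a small sketch that classifies a device string by its prefix. This is not a VMware API; the function name and return strings are invented for this example.

```python
def classify_device_name(name):
    """Return which ESX 4 naming convention a device string appears to use."""
    if name.startswith("naa."):
        return "NAA id (persistent unique id from the array)"
    if name.startswith("mpx."):
        return "MPX id (no unique id available; not guaranteed persistent)"
    if name.startswith("iqn."):
        return "iSCSI Qualified Name (iSCSI target)"
    if name.startswith("vmhba"):
        return "run-time name (vmhbaN:C:T:L; may change across reboots)"
    return "unknown"
```

For example, `classify_device_name("vmhba2:C0:T1:L4")` reports a run-time name, while `classify_device_name("naa.600601601d311f00e93e751b93b4dd11")` reports a persistent NAA id.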
Pluggable Storage Architecture
PSA, the Pluggable Storage Architecture, is a collection of VMkernel APIs that allow third party hardware vendors to insert code directly into the ESX storage I/O path.
This allows 3rd party software developers to design their own load balancing techniques and failover mechanisms for particular storage array types.
This also means that 3rd party vendors can now add support for new arrays into ESX without having to provide internal information or intellectual property about the array to VMware.
By default, VMware provides a generic Multipathing Plugin (MPP) called the NMP (Native Multipathing Plugin).
The PSA coordinates the operation of the NMP and any additional third-party MPPs.
Pluggable Storage Architecture (ctd)
[Diagram: side-by-side view of the VMkernel storage I/O stacks in ESX 3 and ESX 4. In ESX 3, the SCSI mid-layer contains the multipathing, LUN discovery, path masking, and path policy code. In ESX 4, that code is replaced by the PSA, into which the NMP and third-party MPPs plug.]
PSA Tasks
Loads and unloads multipathing plugins (MPPs).
Handles physical path discovery and removal (via scanning).
Routes I/O requests for a specific logical device to an appropriate MPP.
Handles I/O queuing to the physical storage HBAs & to the logical devices.
Implements logical device bandwidth sharing between Virtual Machines.
Provides logical device and physical path I/O statistics.
Native Multipathing Plugin – NMP
[Diagram: the NMP as a Multipathing Plugin inside the PSA framework, with its SATP and PSP sub-plugins.]
NMP is VMware’s Native Multipathing plugin in ESX 4.0.
NMP supports all storage arrays listed on the VMware storage Hardware Compatibility List (HCL).
NMP manages sub-plugins for handling multipathing and load balancing.
The PSA sits in the SCSI mid-layer of the VMkernel I/O stack.
MPP Tasks
The PSA discovers the available storage paths and, based on a set of predefined rules, determines which MPP should be given ownership of each path.
The MPP then associates a set of physical paths with a specific logical device.
The specific details of handling path failover for a given storage array are delegated to a sub-plugin called a Storage Array Type Plugin (SATP).
SATP is associated with paths.
The specific details for determining which physical path is used to issue an I/O request (load balancing) to a storage device are handled by a sub-plugin called Path Selection Plugin (PSP).
PSP is associated with logical devices.
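The association rules above can be pictured with a small data-model sketch: an SATP hangs off each physical path, while a PSP hangs off the logical device. The class names are illustrative, not VMware code.

```python
class Path:
    """A physical path; failover handling (SATP) is per path/array type."""
    def __init__(self, runtime_name, satp):
        self.runtime_name = runtime_name
        self.satp = satp
        self.state = "active"

class LogicalDevice:
    """A logical device; path selection (PSP) is per device."""
    def __init__(self, uid, psp):
        self.uid = uid
        self.psp = psp
        self.paths = []

    def add_path(self, path):
        self.paths.append(path)

# A CX device with two paths, both claimed for the same array type:
dev = LogicalDevice("naa.600601601d311f001ee294d9e7e2dd11", psp="VMW_PSP_MRU")
dev.add_path(Path("vmhba2:C0:T0:L1", satp="VMW_SATP_CX"))
dev.add_path(Path("vmhba3:C0:T1:L1", satp="VMW_SATP_CX"))
```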
NMP Specific Tasks
Manage physical path claiming and unclaiming.
Register and de-register logical devices.
Associate physical paths with logical devices.
Process I/O requests to logical devices:
Select an optimal physical path for the request (load balance)
Perform actions necessary to handle failures and request retries.
Support management tasks such as abort or reset of logical devices.
NMP
[Diagram: ESX 3.5 LegacyMP, a jumble of array-specific code and path policies inside the SCSI mid-layer, compared with the ESX 4.0 Native Multipathing Plug-in (NMP), which plugs into the PSA and hosts SATP sub-plugins (e.g. FASTT, CX, EVA) and PSP sub-plugins (Fixed, Round Robin).]
Storage Array Type Plugin - SATP
A Storage Array Type Plugin (SATP) handles path failover operations.
VMware provides a default SATP for each supported array as well as a generic SATP (an active/active version and an active/passive version) for non-specified storage arrays.
If you want to take advantage of certain storage specific characteristics of your array, you can install a 3rd party SATP provided by the vendor of the storage array, or by a software company specializing in optimizing the use of your storage array.
Each SATP implements the support for a specific type of storage array, e.g. VMW_SATP_SVC for IBM SVC.
SATP (ctd)
The primary functions of an SATP are:
Implements the switching of physical paths to the array when a path has failed.
Determines when a hardware component of a physical path has failed.
Monitors the hardware state of the physical paths to the storage array.
There are many storage array type plug-ins. To see the complete list, you can use the following commands:
# esxcli nmp satp list
# esxcli nmp satp listrules
# esxcli nmp satp listrules -s <specific SATP>
SATP (ctd)
List the defined Storage Array Type Plugins (SATPs) for the VMware Native Multipath Plugin (NMP):
# esxcli nmp satp list
Name Default PSP Description
VMW_SATP_ALUA_CX VMW_PSP_FIXED Supports EMC CX that use the ALUA protocol
VMW_SATP_SVC VMW_PSP_FIXED Supports IBM SVC
VMW_SATP_MSA VMW_PSP_MRU Supports HP MSA
VMW_SATP_EQL VMW_PSP_FIXED Supports EqualLogic arrays
VMW_SATP_INV VMW_PSP_FIXED Supports EMC Invista
VMW_SATP_SYMM VMW_PSP_FIXED Supports EMC Symmetrix
VMW_SATP_LSI VMW_PSP_MRU Supports LSI and other arrays compatible
with the SIS 6.10 in non-AVT mode
VMW_SATP_EVA VMW_PSP_FIXED Supports HP EVA
VMW_SATP_DEFAULT_AP VMW_PSP_MRU Supports non-specific active/passive arrays
VMW_SATP_CX VMW_PSP_MRU Supports EMC CX that do not use the ALUA
protocol
VMW_SATP_ALUA VMW_PSP_MRU Supports non-specific arrays that use the
ALUA protocol
VMW_SATP_DEFAULT_AA VMW_PSP_FIXED Supports non-specific active/active arrays
VMW_SATP_LOCAL VMW_PSP_FIXED Supports direct attached devices
SATP (ctd)
To filter the rules to a specific SATP:
# esxcli nmp satp listrules -s VMW_SATP_EVA
Name Vendor Model Driver Options Claim Options Description
VMW_SATP_EVA HSV101 tpgs_off active/active EVA 3000 GL
VMW_SATP_EVA HSV111 tpgs_off active/active EVA 5000 GL
VMW_SATP_EVA HSV200 tpgs_off active/active EVA 4000/6000 XL
VMW_SATP_EVA HSV210 tpgs_off active/active EVA 8000/8100 XL
This shows us all the models of controller/array in the EVA series from HP which are associated with the SATP_EVA Storage Array Type Plug-in.
Path Selection Plugin (PSP)
If you want to take advantage of more complex I/O load balancing algorithms, you could install a 3rd party Path Selection Plugin (PSP).
A PSP handles load balancing operations and is responsible for choosing a physical path to issue an I/O request to a logical device.
VMware provides three PSPs: Fixed, MRU, and Round Robin.
# esxcli nmp psp list
Name Description
VMW_PSP_MRU Most Recently Used Path Selection
VMW_PSP_RR Round Robin Path Selection
VMW_PSP_FIXED Fixed Path Selection
NMP Supported PSPs
Most Recently Used (MRU) — Selects the first working path discovered at system boot time. If this path becomes unavailable, the ESX host switches to an alternative path and continues to use the new path while it is available.
Fixed — Uses the designated preferred path, if it has been configured. Otherwise, it uses the first working path discovered at system boot time. If the ESX host cannot use the preferred path, it selects a random alternative available path. The ESX host automatically reverts back to the preferred path as soon as the path becomes available.
Round Robin (RR) – Uses an automatic path selection rotating through all available paths and enabling load balancing across the paths.
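The three selection behaviours can be sketched in a few lines of Python. This is an illustration of the logic above, not the VMkernel implementation; path states are simplified to "on"/"off".

```python
def select_mru(paths, current):
    """MRU: keep using the current path while it works; otherwise switch."""
    if paths.get(current) == "on":
        return current
    return next((p for p, s in paths.items() if s == "on"), None)

def select_fixed(paths, preferred):
    """Fixed: always revert to the preferred path when it is available."""
    if paths.get(preferred) == "on":
        return preferred
    return next((p for p, s in paths.items() if s == "on"), None)

def select_rr(paths, last_index):
    """Round Robin: rotate through all available paths."""
    avail = [p for p, s in paths.items() if s == "on"]
    return avail[(last_index + 1) % len(avail)] if avail else None
```

Note the Fixed behaviour matches the description above: it reverts to the preferred path as soon as that path is usable again, whereas MRU stays on whatever path it most recently used.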
NMP I/O Flow
When a Virtual Machine issues an I/O request to a logical device managed by the NMP, the following steps take place:
The NMP calls the PSP assigned to this logical device.
The PSP selects an appropriate physical path on which to send the I/O, load balancing across paths if necessary.
If the I/O operation is successful, the NMP reports its completion.
If the I/O operation reports an error, the NMP calls an appropriate SATP.
The SATP interprets the error codes and, when appropriate, activates inactive paths and fails over to the new active path.
The PSP is then called to select a new active path from the available paths to send the I/O.
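Those steps can be sketched as a short control loop. The function names are assumptions for illustration; the real NMP entry points differ.

```python
def issue_io(device, psp_select, satp_failover, send):
    """One NMP-style I/O attempt with a single failover retry."""
    path = psp_select(device)          # steps 1-2: PSP chooses a path
    if send(path):                     # step 3: I/O completes successfully
        return "completed on " + path
    satp_failover(device, path)        # steps 4-5: SATP interprets the error
                                       # and activates an inactive path
    path = psp_select(device)          # step 6: PSP reselects and retries
    if send(path):
        return "completed on " + path
    return "failed"
```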
NMP I/O Flow (ctd)
[Diagram: I/O from the VM passes through the emulation and filesystem switch layers into the PSA, where the NMP's PSP and SATP direct it through the device drivers to HBA 1 and HBA 2, whose paths may be Active, Standby, or Dead.]
ESX 4.0 Failover Logs
By default, logging is minimal in ESX 4.0 RC.
The following test disables the 2 active paths to a LUN.
# esxcfg-mpath -s off -d naa.600601601d311f00e93e751b93b4dd11 --path vmhba3:C0:T1:L4
# esxcfg-mpath -s off -d naa.600601601d311f00e93e751b93b4dd11 --path vmhba2:C0:T1:L4
The message in the logs to indicate that a ‘failover’ has occurred is:
Nov 19 17:25:17 cs-tse-h97 vmkernel: 2:05:02:23.559 cpu5:4111)NMP: nmp_HasMoreWorkingPaths: STANDBY path(s) only to device "naa.600601601d311f00e93e751b93b4dd11".
Enabling Additional Logging on ESX 4.0
For additional SCSI Log Messages, set:
Scsi.LogCmdErrors = "1"
Scsi.LogMPCmdErrors = "1"
These can be found in the Advanced Settings.
ESX 4.0 Failover Logs With Additional Logging
14:13:47:59.878 cpu7:4109)NMP: nmp_HasMoreWorkingPaths: STANDBY path(s) only to device "naa.60060160432017005c97aea1b32fdc11".
14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_PspSelectPathForIO: Plugin VMW_PSP_MRU selectPath() returned path "vmhba0:C0:T1:L1" for device "naa.60060160432017005c97aea1b32fdc11" which is in state standby instead of ON. Status is Bad parameter
14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_SelectPathAndIssueCommand: PSP select path "vmhba0:C0:T1:L1" in a bad state on device "naa.60060160432017005c97aea1b32fdc11".
14:13:47:59.887 cpu7:4374)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410004237c00) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba0:C0:T1:L1" H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.60060160432017005c97aea1b32fdc11": awaiting fast path state update for failover with I/O blocked...
14:13:47:59.887 cpu7:4374)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "naa.60060160432017005c97aea1b32fdc11" is blocked. Not starting I/O from device.
14:13:48:00.069 cpu7:4109)NMP: nmp_DeviceUpdatePathStates: Activated path "vmhba0:C0:T1:L1" for NMP device "naa.60060160432017005c97aea1b32fdc11".
14:13:48:00.888 cpu1:4206)WARNING: NMP: nmp_DeviceAttemptFailover: Retry world failover device "naa.60060160432017005c97aea1b32fdc11" - issuing command 0x410004237c00
14:13:48:00.888 cpu2:4373)WARNING: NMP: nmp_CompleteRetryForPath: Retry command 0x2a (0x410004237c00) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba0:C0:T1:L1" H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0.
14:13:48:00.888 cpu2:4373)WARNING: NMP: nmp_CompleteRetryForPath: Retry world restored device "naa.60060160432017005c97aea1b32fdc11" - no more commands to retry
14:13:48:00.888 cpu2:4373)ScsiDeviceIO: 746: Command 0x2a to device "naa.60060160432017005c97aea1b32fdc11" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0.
ESX 4.0 Failover Logs – FC cable unplugged
14:13:32:16.716 cpu3:4099)<6>qla2xxx 003:00.1: LOOP DOWN detected mbx1=2h mbx2=5h mbx3=0h.
14:13:32:24.425 cpu6:4195)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410004286980) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba1:C0:T0:L1" H:0x5 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.
14:13:32:24.425 cpu6:4195)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe: NMP device "naa.60060160432017005c97aea1b32fdc11" state in doubt; requesting fast path state update...
14:13:32:24.425 cpu6:4195)ScsiDeviceIO: 746: Command 0x2a to device "naa.60060160432017005c97aea1b32fdc11" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.
14:13:32:26.718 cpu4:4198)<3> rport-4:0-0: blocked FC remote port time out: saving binding
14:13:32:26.718 cpu4:4198)<3> rport-4:0-1: blocked FC remote port time out: saving binding
14:13:32:26.718 cpu5:4101)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x410004286980) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba1:C0:T0:L1" H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
14:13:32:26.718 cpu5:4101)WARNING: NMP: nmp_DeviceRetryCommand: Device "naa.60060160432017005c97aea1b32fdc11": awaiting fast path state update for failover with I/O blocked...
14:13:32:26.718 cpu5:4101)NMP: nmp_CompleteCommandForPath: Command 0x2a (0x4100042423c0) to NMP device "naa.60060160432017005c97aea1b32fdc11" failed on physical path "vmhba1:C0:T0:L1" H:0x1 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
14:13:32:26.718 cpu3:4281)WARNING: VMW_SATP_CX: satp_cx_otherSPIsHung: Path "vmhba1:C0:T1:L1" MODE SENSE PEER SP command failed 0/1 0x0 0x0 0x0.
14:13:32:26.719 cpu1:4206)WARNING: NMP: nmp_DeviceAttemptFailover: Retry world failover device "naa.60060160432017005c97aea1b32fdc11" - issuing command 0x410004286980
14:13:32:26.752 cpu2:4237)NMP: nmp_CompleteRetryForPath: Retry world recovered device "naa.60060160432017005c97aea1b32fdc11"
VMkernel Modules
# vmkload_mod -l | grep satp
vmw_satp_local 0x418017811000 0x1000 0x417fd8676270 0x1000 10 Yes
vmw_satp_default_aa 0x418017812000 0x1000 0x417fd8680e80 0x1000 11 Yes
vmw_satp_alua 0x41801783c000 0x4000 0x417fd8684460 0x1000 17 Yes
vmw_satp_cx 0x418017840000 0x6000 0x417fd868d9b0 0x1000 18 Yes
vmw_satp_default_ap 0x418017846000 0x2000 0x417fd868e9c0 0x1000 19 Yes
vmw_satp_eva 0x418017848000 0x2000 0x417fd868f9d0 0x1000 20 Yes
vmw_satp_lsi 0x41801784a000 0x4000 0x417fd86909e0 0x1000 21 Yes
vmw_satp_symm 0x41801784e000 0x1000 0x417fd86919f0 0x1000 22 Yes
vmw_satp_inv 0x41801784f000 0x3000 0x417fd8692a00 0x1000 23 Yes
vmw_satp_eql 0x418017852000 0x1000 0x417fd8693a10 0x1000 24 Yes
vmw_satp_msa 0x418017853000 0x1000 0x417fd8694a20 0x1000 25 Yes
vmw_satp_svc 0x418017854000 0x1000 0x417fd8695a30 0x1000 26 Yes
vmw_satp_alua_cx 0x418017855000 0x3000 0x417fd8696a40 0x1000 27 Yes
# vmkload_mod -l | grep psp
vmw_psp_fixed 0x418017813000 0x2000 0x417fd8681e90 0x1000 12 Yes
vmw_psp_rr 0x418017858000 0x3000 0x417fd8697a80 0x1000 28 Yes
vmw_psp_mru 0x41801785b000 0x2000 0x417fd8698aa0 0x1000 29 Yes
There is no equivalent to the vmkload_mod command in the VI CLI 4.0. To list this information on ESXi, use the vicfg-module -l (list) RCLI command.
PSA and NMP Terminology & Concepts
An MPP “claims” a physical path and “manages” or “exports” a logical device.
Only the MPP can associate a physical path with a logical device.
Which MPP claims the path is decided by a set of PSA rules.
All rules for the plugins and sub-plugins are stored in the /etc/vmware/esx.conf file on the ESX/ESXi server.
If the MPP is the NMP from VMware, then:
NMP “associates” an SATP with a path from a given type of array.
NMP “associates” a PSP with a logical device.
NMP specifies a default PSP for every logical device based on the SATP associated with the physical paths for that device.
NMP allows the default PSP for a device to be overridden.
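That default-plus-override behaviour amounts to a simple lookup. The table entries below are taken from the `esxcli nmp satp list` output earlier in this module; the function name is an assumption for illustration.

```python
# Default PSP per SATP, as reported by `esxcli nmp satp list`.
DEFAULT_PSP = {
    "VMW_SATP_CX": "VMW_PSP_MRU",
    "VMW_SATP_SVC": "VMW_PSP_FIXED",
    "VMW_SATP_EVA": "VMW_PSP_FIXED",
    "VMW_SATP_LOCAL": "VMW_PSP_FIXED",
}

def psp_for_device(satp, override=None):
    """Return the PSP for a device: the override if set, else the SATP default."""
    return override if override is not None else DEFAULT_PSP[satp]
```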
Viewing Plugin Information
The following command lists all multipathing modules loaded on the system. At a minimum, it returns the default VMware Native Multipathing (NMP) plugin and the MASK_PATH plugin. Third-party MPPs are also listed if installed:
# esxcfg-mpath -G
MASK_PATH
NMP
For ESXi, the following VI CLI 4.0 command can be used:
# vicfg-mpath -G --server <IP> --username <X> --password <Y>
MASK_PATH
NMP
LUN path masking is done via the MASK_PATH Plug-in.
Viewing Plugin Information (ctd)
Rules appear in the order that they are evaluated [0 – 65535]
Rules are stored in the /etc/vmware/esx.conf file. To list them, run the following command:
# esxcli corestorage claimrule list
Rule Class Type Plugin Matches
0 runtime transport NMP transport=usb
1 runtime transport NMP transport=sata
2 runtime transport NMP transport=ide
3 runtime transport NMP transport=block
4 runtime transport NMP transport=unknown
101 runtime vendor MASK_PATH vendor=DELL model=Universal Xport
101 file vendor MASK_PATH vendor=DELL model=Universal Xport
65535 runtime vendor NMP vendor=* model=*
Dell requested that these array pseudo devices be hidden by default.
Any USB storage will be claimed by the NMP plugin.
Any storage not claimed by a previous rule will be claimed by NMP.
The class column tells us if the rules are in the esx.conf (file) or if they are in the VMkernel (runtime).
Viewing Plugin Information (ctd)
Storage paths are defined based on the following parameters:
Vendor/model strings
Transport type, such as SATA, IDE, Fibre Channel, and so on
Location of a specific adapter, target, or LUN
Device driver, for example, Mega-RAID
The NMP claims all paths connected to storage devices that use the USB, SATA, IDE, and Block SCSI transports.
The MASK_PATH module claims all paths connected to Universal Xport by Dell.
The MASK_PATH module is used to mask paths from your host.
The last rule of vendor=* model=* is a catch-all for any arrays that do not match any of the previous rules.
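The first-match evaluation described above can be sketched as a toy rule engine, using rules from the listing. This is illustrative only, not the PSA implementation.

```python
# (rule number, rule class, match criteria, claiming plugin)
RULES = [
    (0, "transport", {"transport": "usb"}, "NMP"),
    (1, "transport", {"transport": "sata"}, "NMP"),
    (101, "vendor", {"vendor": "DELL", "model": "Universal Xport"}, "MASK_PATH"),
    (65535, "vendor", {"vendor": "*", "model": "*"}, "NMP"),
]

def claim(path_attrs):
    """Walk the rules in order 0-65535; the first matching rule's plugin claims the path."""
    for _rule, _cls, match, plugin in sorted(RULES):
        if all(v == "*" or path_attrs.get(k) == v for k, v in match.items()):
            return plugin
    return None
```

Because rule 65535 matches any vendor and model, every path that survives the earlier rules ends up claimed by NMP, just as the listing shows.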
Viewing Device Information
The command esxcli nmp device list lists all devices managed by the NMP plug-in and the configuration of each device, e.g.:
# esxcli nmp device list
naa.600601601d311f001ee294d9e7e2dd11
Device Display Name: DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11)
Storage Array Type: VMW_SATP_CX
Storage Array Type Device Config: {navireg ipfilter}
Path Selection Policy: VMW_PSP_MRU
Path Selection Policy Device Config: Current Path=vmhba33:C0:T0:L1
Working Paths: vmhba33:C0:T0:L1
mpx.vmhba1:C0:T0:L0
Device Display Name: Local VMware Disk (mpx.vmhba1:C0:T0:L0)
Storage Array Type: VMW_SATP_LOCAL
Storage Array Type Device Config:
Path Selection Policy: VMW_PSP_FIXED
Path Selection Policy Device Config: {preferred=vmhba1:C0:T0:L0;current=vmhba1:C0:T0:L0}
Working Paths: vmhba1:C0:T0:L0
Note the specific configuration ({navireg ipfilter}) for EMC CLARiiON & Invista products.
mpx is used as an identifier for devices that do not have their own unique ids.
naa indicates a Network Addressing Authority (NAA) identifier, which is guaranteed to be unique.
Viewing Device Information (ctd)
Get current path information for a specified storage device managed by the NMP.
# esxcli nmp device list -d naa.600601604320170080d407794f10dd11
naa.600601604320170080d407794f10dd11
Device Display Name: DGC Fibre Channel Disk (naa.600601604320170080d407794f10dd11)
Storage Array Type: VMW_SATP_CX
Storage Array Type Device Config: {navireg ipfilter}
Path Selection Policy: VMW_PSP_MRU
Path Selection Policy Device Config: Current Path=vmhba2:C0:T0:L0
Working Paths: vmhba2:C0:T0:L0
Viewing Device Information (ctd)
Lists all paths available for a specified storage device on ESX:
# esxcfg-mpath -b -d naa.600601601d311f001ee294d9e7e2dd11
naa.600601601d311f001ee294d9e7e2dd11 : DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11)
vmhba33:C0:T0:L1 LUN:1 state:active iscsi Adapter: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b Target: IQN=iqn.1992-04.com.emc:cx.ck200083700716.b0 Alias= Session=00023d000001 PortalTag=1
vmhba33:C0:T1:L1 LUN:1 state:standby iscsi Adapter: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b Target: IQN=iqn.1992-04.com.emc:cx.ck200083700716.a0 Alias= Session=00023d000001 PortalTag=2
ESXi has an equivalent vicfg-mpath command.
Viewing Device Information (ctd)
# esxcfg-mpath -l -d naa.6006016043201700d67a179ab32fdc11
iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b-00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.a0,t,2-naa.600601601d311f001ee294d9e7e2dd11
Runtime Name: vmhba33:C0:T1:L1
Device: naa.600601601d311f001ee294d9e7e2dd11
Device Display Name: DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11)
Adapter: vmhba33 Channel: 0 Target: 1 LUN: 1
Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b
Target Identifier: 00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.a0,t,2
Plugin: NMP
State: standby
Transport: iscsi
Adapter Transport Details: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b
Target Transport Details: IQN=iqn.1992-04.com.emc:cx.ck200083700716.a0 Alias= Session=00023d000001 PortalTag=2
iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b-00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.b0,t,1-naa.600601601d311f001ee294d9e7e2dd11
Runtime Name: vmhba33:C0:T0:L1
Device: naa.600601601d311f001ee294d9e7e2dd11
Device Display Name: DGC iSCSI Disk (naa.600601601d311f001ee294d9e7e2dd11)
Adapter: vmhba33 Channel: 0 Target: 0 LUN: 1
Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b
Target Identifier: 00023d000001,iqn.1992-04.com.emc:cx.ck200083700716.b0,t,1
Plugin: NMP
State: active
Transport: iscsi
Adapter Transport Details: iqn.1998-01.com.vmware:cs-tse-h33-34f33b4b
Target Transport Details: IQN=iqn.1992-04.com.emc:cx.ck200083700716.b0 Alias= Session=00023d000001 PortalTag=1
Storage array (target) iSCSI Qualified Names (IQNs)
Viewing Device Information (ctd)
Any of the following commands will display the active path:
esxcli nmp path list -d <naa.id>
esxcli nmp device list -d <naa.id>
esxcli nmp psp getconfig -d <naa.id>
This information can also be found in the multipathing information of the storage section in the vSphere client.
Third-Party Multipathing Plug-ins (MPPs)
You can install third-party multipathing plug-ins (MPPs) when you need to change the specific load balancing and failover characteristics of ESX/ESXi.
A third-party MPP replaces the behaviour of the NMP and takes complete control of path failover and load balancing operations for the storage devices it claims.
Third-Party SATP & PSP
Third-party SATP
Generally developed by third-party hardware manufacturers who have ‘expert’ knowledge of the behaviour of their storage devices.
Accommodates specific characteristics of storage arrays and facilitates support for new arrays.
Third-party PSP
Generally developed by third-party software companies.
More complex I/O load balancing algorithms.
NMP coordination
Third-party SATPs and PSPs are coordinated by the NMP, and can be simultaneously used with the VMware SATPs and PSPs.
Troubleshooting the PSA
Scenario 1
Customer has an issue with a third-party PSP or SATP
The recommendation might be to unplug the third-party PSP or SATP and plug in VMware's array-specific plug-in, or the generic A/A or A/P plug-in.
For SATPs, this will only work if ESX ships an SATP that supports the given array, which may not always be the case.
Scenario 2
Customer has a problem with a third-party MPP, for example, EMC PowerPath.
The recommendation might be to unplug the third-party product and use the VMware NMP (Native Multipathing Plug-in) to verify whether the third-party MPP is the cause of the problem.
Having the customer switch out the module for first-level triage of what is causing the problem is a reasonable course of action.
PSA Case Studies
Before starting into these case studies, you need to be aware of some changes to SCSI support in version 4.0.
The ESX 4 VMkernel is now SCSI-3 compliant and is capable of using SCSI-3 specific commands & features.
Two new storage features are introduced: Target Port Group Support (TPGS) and Asymmetric Logical Unit Access (ALUA).
TPGS allows new storage devices to be discovered automatically and typically presents a single port to a server while handling load-balancing & failover at the back-end.
Since target ports can reside on different physical units, ALUA allows different levels of access to each LUN per target port, routing I/O to a particular port to achieve the best performance.
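A sketch of that port-selection idea, using SCSI ALUA terminology for the states (active/optimized vs active/non-optimized). The function is illustrative, not VMkernel code.

```python
def pick_alua_path(paths):
    """Prefer an active/optimized ("AO") port; fall back to
    active/non-optimized ("ANO"); return None if nothing is usable.

    paths: list of (runtime_name, state) tuples.
    """
    for preferred_state in ("AO", "ANO"):
        for name, state in paths:
            if state == preferred_state:
                return name
    return None
```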
PSA Case Study #1
Trying to present an iSCSI LUN from a NetApp FAS 250 filer to ESX 4 beta 2 resulted in no LUN discovery and the following /var/log/vmkernel errors:
ScsiScan: SCSILogPathInquiry:641: Path 'vmhba34:C0:T0:L0': Vendor: 'NETAPP ' Model: 'LUN ' Rev: '0.2 '
ScsiScan: SCSILogPathInquiry:642: Type: 0x0, ANSI rev: 4
ScsiClaimrule: SCSIAddClaimrulesSessionStart:291: Starting claimrules session. Key = 0x40404040.
NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
SATP_ALUA: satp_alua_initDevicePrivateData: Done with creation of dpd 0x41000e7c11a0.
WARNING: SATP_ALUA: satp_alua_getTargetPortInfo: Could not find relative target port ID for path "vmhba34:C0:T0:L0" - Not found (195887107)
NMP: nmp_SatpClaimPath: SAT "SATP_ALUA" could not add path "vmhba34:C0:T0:L0" for device "Unregistered device". Error Not found
WARNING: NMP: nmp_AddPathToDevice: The physical path "vmhba34:C0:T0:L0" for NMP device "Unregistered device" could not be claimed by SATP "SATP_ALUA". Not found
WARNING: NMP: nmp_DeviceAlloc: nmp_AddPathToDevice failed Not found (195887107).
WARNING: NMP: nmp_DeviceAlloc: Could not allocate NMP device.
WARNING: ScsiPath: SCSIClaimPath:3487: Plugin 'NMP' had an error (Not found) while claiming path 'vmhba34:C0:T0:L0'.Skipping the path.
ScsiClaimrule: SCSIClaimruleRunOnPath:734: Plugin NMP specified by claimrule 65535 was not able to claim path vmhba34:C0:T0:L0. Busy
ScsiClaimrule: SCSIClaimruleRun:809: Error claiming path vmhba34:C0:T0:L0. Busy.
Problem associating the SATP called SATP_ALUA with paths for this target
PSA Case Study #1 (ctd)
# esxcfg-mpath -l
.
iqn.1998-01.com.vmware:cs-tse-f117-525f2d10--
Runtime Name: vmhba34:C0:T0:L0
Device:
Device Display Name:
Adapter: vmhba34 Channel: 0 Target: 0 LUN: 0
Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-f117-525f2d10
Target Identifier:
Plugin: (unclaimed)
State: active
Transport: iscsi
Adapter Transport Details: Unavailable or path is unclaimed .
Target Transport Details: Unavailable or path is unclaimed .
Root Cause: The NetApp array in this case was an older model that did not support ALUA, but the only NetApp SATP shipped in the beta was for NetApp arrays that support ALUA. NMP therefore could not claim the LUNs from this array, because it expected them to be ALUA compatible.
PSA Case Study #1 (ctd)
# esxcli nmp satp listrules -s SATP_ALUA
Name Vendor Model Driver Options Claim Options Description
SATP_ALUA HSV101 tpgs_on EVA 3000 GL with ALUA
SATP_ALUA HSV111 tpgs_on EVA 5000 GL with ALUA
SATP_ALUA HSV200 tpgs_on EVA 4000/6000 XL with ALUA
SATP_ALUA HSV210 tpgs_on EVA 8000/8100 XL with ALUA
SATP_ALUA NETAPP tpgs_on NetApp with ALUA
SATP_ALUA HP MSA2012sa tpgs_on HP MSA A/A with ALUA
SATP_ALUA Intel Multi-Flex tpgs_on Intel Promise
# grep -i NETAPP /etc/vmware/esx.conf
/storage/plugin/NMP/config[SATP_ALUA]/rules[0004]/description = "NetApp with ALUA"
/storage/plugin/NMP/config[SATP_ALUA]/rules[0004]/vendor = "NETAPP"
# esxcli nmp satp deleteRule -V NETAPP -s SATP_ALUA
# esxcli nmp satp listrules -s SATP_ALUA
Name Vendor Model Driver Options Claim Options Description
SATP_ALUA HSV101 tpgs_on EVA 3000 GL with ALUA
SATP_ALUA HSV111 tpgs_on EVA 5000 GL with ALUA
SATP_ALUA HSV200 tpgs_on EVA 4000/6000 XL with ALUA
SATP_ALUA HSV210 tpgs_on EVA 8000/8100 XL with ALUA
SATP_ALUA HP MSA2012sa tpgs_on HP MSA A/A with ALUA
SATP_ALUA Intel Multi-Flex tpgs_on Intel Promise
# grep -i NETAPP /etc/vmware/esx.conf
#
PSA Case Study #1 (ctd)
Rescan the SAN for the Software iSCSI initiator
ScsiScan: SCSILogPathInquiry:641: Path 'vmhba34:C0:T0:L0': Vendor: 'NETAPP ' Model: 'LUN ' Rev: '0.2 '
ScsiScan: SCSILogPathInquiry:642: Type: 0x0, ANSI rev: 4
ScsiClaimrule: SCSIAddClaimrulesSessionStart:291: Starting claimrules session. Key = 0x40404040.
NMP: vmk_NmpPathGroupMovePath: Path "vmhba34:C0:T0:L0" state changed from "dead" to "active"
ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba34:C0:T0:L0'
NMP: nmp_DeviceUpdatePathStates: The PSP selected path "vmhba34:C0:T0:L0" to activate for NMP device "Unregistered device".
ScsiDevice: vmk_ScsiAllocateDevice:1571: Alloc'd device 0x41000d83e600
NMP: nmp_RegisterDevice: Register NMP device with primary uid 'naa.60a9800068704c6f54344b645a6d5876' and 1 total uids.
.
.
NMP: nmp_RegisterDevice: Registration of NMP device with primary uid 'naa.60a9800068704c6f54344b645a6d5876' and name of 'naa.60a9800068704c6f54344b645a6d5876' is completed successfully.
PSA Case Study #1 (ctd)
# esxcfg-mpath -l -d naa.60a9800068704c6f54344b645a6d5876
iqn.1998-01.com.vmware:cs-tse-f117-525f2d10-00023d000001,iqn.1992-08.com.netapp:sn.84228148,t,1-naa.60a9800068704c6f54344b645a6d5876
Runtime Name: vmhba34:C0:T0:L0
Device: naa.60a9800068704c6f54344b645a6d5876
Device Display Name: NETAPP iSCSI Disk (naa.60a9800068704c6f54344b645a6d5876)
Adapter: vmhba34 Channel: 0 Target: 0 LUN: 0
Adapter Identifier: iqn.1998-01.com.vmware:cs-tse-f117-525f2d10
Target Identifier: 00023d000001,iqn.1992-08.com.netapp:sn.84228148,t,1
Plugin: NMP
State: active
Transport: iscsi
Adapter Transport Details: iqn.1998-01.com.vmware:cs-tse-f117-525f2d10
Target Transport Details: IQN=iqn.1992-08.com.netapp:sn.84228148 Alias= Session=00023d000001 PortalTag=1
# esxcli nmp device list -d naa.60a9800068704c6f54344b645a6d5876
naa.60a9800068704c6f54344b645a6d5876
    Device Display Name: NETAPP iSCSI Disk (naa.60a9800068704c6f54344b645a6d5876)
    Storage Array Type: SATP_DEFAULT_AA
    Storage Array Type Device Config:
    Path Selection Policy: PSP_FIXED
    Path Selection Policy Device Config: {preferred=vmhba34:C0:T0:L0;current=vmhba34:C0:T0:L0}
    Working Paths: vmhba34:C0:T0:L0
Since the claim rule tying the NetApp to SATP_ALUA has been deleted, NMP instead associates the device with the default active/active SATP (SATP_DEFAULT_AA).
PSA Case Study #2
LUNs from the same NetApp filer are using different SATPs
naa.60a98000486e5351476f4b3670724b58
    Selected Paths : vmhba5:C0:T1:L5
    Storage Array Type: SATP_ALUA
    Storage Array Type Device Config: {implicit_support=on;explicit_support=off;explicit_allow=on;alua_followover=on;TPG_id=1;TPG_state=ANO;RTP_id=4;RTP_health
    Path Policy: PSP_MRU
    PSP Config String: Current Path=vmhba5:C0:T1:L5
naa.60a98000486e544c64344d3345425331
    Selected Paths : vmhba2:C0:T0:L50
    Storage Array Type: SATP_DEFAULT_AA
    Storage Array Type Device Config:
    Path Policy: PSP_FIXED
    PSP Config String: {preferred=vmhba2:C0:T0:L50;current=vmhba2:C0:T0:L50}
PSA Case Study #2 (ctd)
In this beta version of ESX, NetApp LUNs can be claimed by either SATP_ALUA or SATP_DEFAULT_AA depending on whether the NetApp Initiator Group (igroup) is configured for ALUA mode or not on the array.
Looking at the vmkernel logs, the NetApp is misconfigured - one storage array controller (target) has ALUA mode turned ON, and the other storage array controller (target) has it turned OFF.
Therefore, the device gets claimed by either ALUA or DEFAULT_AA depending on which path is discovered first (controller with ALUA or controller without ALUA).
For ALUA, we check that the TPGS (Target Port Group Support) bit in the standard SCSI Inquiry response is set; this is specified using the "tpgs_on" claim option.
PSA Case Study #2 (ctd)
In the vmkernel logs, the Compared claim opt 'tpgs_on' message tells us whether TPGS is on or off.
The next Plugin 'NMP' claimed path message tells us which path was being tested. You can see that "'tpgs_on'; result 0" (not set) is followed by vmhbaX:C0:T0:LY and "'tpgs_on'; result 1" (set) is followed by vmhbaX:C0:T1:LY.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.022 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.031 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.051 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L7'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.060 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.080 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L7'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.089 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.097 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.118 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L50'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.126 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.147 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L50'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.156 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.164 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.185 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L56'
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.194 cpu5:4110)NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
Nov 23 10:10:07 vmshe-hp380-11 vmkernel: 2:18:04:11.214 cpu5:4110)ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L56'
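The pairing described above can be checked mechanically. The following is a runnable sketch (plain awk over a trimmed copy of the log, not an ESX tool) that attaches each tpgs_on result to the path claimed immediately afterwards:

```shell
# Write a trimmed copy of the vmkernel messages above to a scratch file.
cat > /tmp/claim.log <<'EOF'
NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 0.
ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T0:L50'
NMP: nmp_SatpMatchClaimOptions: Compared claim opt 'tpgs_on'; result 1.
ScsiPath: SCSIClaimPath:3478: Plugin 'NMP' claimed path 'vmhba2:C0:T1:L50'
EOF

# Remember the last 'result N', then attach it to the next claimed path.
awk -F"'" '
  /Compared claim opt/ { split($0, a, "result "); r = substr(a[2], 1, 1); next }
  /claimed path/       { print $4 " tpgs_on=" r }
' /tmp/claim.log
# Prints:
#   vmhba2:C0:T0:L50 tpgs_on=0
#   vmhba2:C0:T1:L50 tpgs_on=1
```

The T0 path (controller without ALUA) shows result 0; the T1 path (controller with ALUA) shows result 1, matching the misconfiguration described in this case study.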
Lesson 1 Summary
ESX 4.0 has a new Pluggable Storage Architecture.
VMware provides a default Native Multipathing Module (NMP) for the PSA.
The NMP has sub-plugins for failover and load balancing.
Storage Array Type Plug-in (SATP) handles failover & Path Selection Plug-in (PSP) handles load-balancing.
The PSA enables storage array vendors and software vendors to develop their own modules specific to individual arrays.
Lesson 1 - Lab 1
Lab 1 involves examining the Pluggable Storage Architecture
View PSA, NMP, SATP & PSP information on the student host
Replace the Path Selection Plug-in (PSP)
Choose Fixed or MRU
Use the MASK_PATH plugin to mask LUNs from the host.
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK extend
Lesson 9 - Storage CLI enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
Windows Clustering Changes
In Windows Server 2003 Server Clustering, SCSI bus resets are used to break SCSI reservations so that another node can take control of the disks.
The problem with SCSI bus resets is that they impact other disks sharing the SCSI bus which do not hold a SCSI reservation.
In Windows Server 2008 Failover Clustering, SCSI bus resets are no longer used; persistent reservations are used instead.
This means that directly attached (parallel) SCSI storage is no longer supported for failover clustering in Windows 2008.
Serial Attached SCSI (SAS), Fibre Channel, and iSCSI are the only transports Microsoft supports for Windows 2008, but this does not necessarily mean that VMware will support all of these technologies for clustering Virtual Machines.
Other SCSI devices on the same bus as the reserved LUN are not impacted by SCSI reservation clearing & pre-empting operations.
ESX 4, SCSI-3 & MSCS
In order to support Microsoft Windows 2008 (Longhorn) Failover Clustering, two major requirements had to be met in ESX 4:
The passing through of SCSI-3 persistent reservations from the Virtual Machine to the underlying physical storage.
A virtual SAS controller in the Guest OS since direct attached storage is no longer supported.
You will need to modify the controller type by clicking Change Controller Type during VM creation. Choose LSI Logic SAS for Windows Server 2008.
The controller is set to LSI Logic Parallel for Windows 2000 Server and Windows Server 2003.
ESX 4.0 is still limited to 2 nodes per cluster.
A New Virtual SAS Controller
ESX 4 & SCSI-3
In ESX 3.5 and earlier, VMware Virtual Machines emulate the SCSI-2 disk reservation protocol only and do not support applications that use SCSI-3 persistent reservations.
Microsoft Windows 2008 requires SCSI-3 persistent reservation support for Clustering.
ESX 4.0 enables the passing through of SCSI-3 persistent reservations from the Virtual Machine to the underlying physical storage.
Support for SCSI-3 persistent reservations is only for MSCS with Windows 2008 – there is no other support case in VI4 for SCSI-3 persistent reservations.
One additional caveat is that VMFS still uses SCSI-2 style (reserve/release) reservations.
Persistent Reservations
Persistent Reserve refers to a set of SCSI-3 commands and command options which provide SCSI initiators with the ability to establish, pre-empt, query, and reset a reservation policy with a specified target device.
The persistent reservation mechanism allows the preservation of reservation operations across recovery actions.
Persistent reservations are not removed by hard reset, logical unit reset, Initiator/Target access loss or power loss (by default).
These replace the SCSI-2 Reserve/Release commands.
Persistent Reservations (ctd)
Before a persistent reservation may be established, the cluster node must register a reservation key for each Initiator/Target combination with the device server (SCSI controller).
The key is used in the PERSISTENT RESERVE IN command to:
identify which Initiator/Target combinations are registered
identify which Initiator/Target combination, if any, holds the persistent reservation.
The key is used in the PERSISTENT RESERVE OUT command to:
verify that the Initiator/Target combination is registered
reserve or release the device
specify which registrations or persistent reservation to pre-empt.
Persistent Reservations (ctd)
The Persistent Reservation held by a failing Initiator/Target combination may be pre-empted by another Initiator/Target combination as part of its recovery process.
Persistent Reservations are retained by the device server until released, pre-empted, or cleared.
Persistent Reservations
[Diagram: a two-node Microsoft Windows 2008 Failover Cluster attached to a shared storage device over the SAN]
Node 1 (primary) registers a key for its initiator/target combination on the array.
Node 2 (secondary) registers a key for its initiator/target combination on the array. Both nodes can now reserve, release or pre-empt another node’s reservation. Node 1 now places a SCSI reservation on the LUN and can start I/O.
Node 1 suffers a catastrophic failure, but its reservation is persistent. Since Node 2 is registered, it can use its key to determine who the lock owner is and clear or pre-empt the reservation (assume ownership of the lock). It no longer has to send SCSI Bus Resets, which could affect other disks on the bus.
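The scenario above can be sketched as a toy state machine. This is plain shell for illustration only — the node names and function names are invented, and no real SCSI commands are issued:

```shell
REGISTERED=""   # keys registered with the "device server"
OWNER=""        # current reservation holder

pr_register() {                 # each node registers a key for its I/T nexus
  REGISTERED="$REGISTERED $1"
}
pr_reserve() {                  # only a registered node may take the reservation
  case " $REGISTERED " in *" $1 "*) OWNER="$1" ;; esac
}
pr_preempt() {                  # ...and only a registered node may pre-empt it
  case " $REGISTERED " in *" $1 "*) OWNER="$1" ;; esac
}

pr_register node1
pr_register node2
pr_reserve  node1               # node1 holds the reservation and starts I/O
# node1 crashes; the reservation persists, so node2 pre-empts it
pr_preempt  node2               # no SCSI bus reset needed
echo "owner=$OWNER"             # prints: owner=node2
```

The point of the model: an unregistered initiator can neither reserve nor pre-empt, which is exactly why each cluster node must register its key before the failover dance can work.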
Windows 2008 Clustering Limitations (RC)
The following functionality is not currently supported:
Clustering on iSCSI or NFS disks.
Mixed environments. That is, configurations where one cluster node is running a different version of ESX than another cluster node.
Mixed HBA environments (QLogic and Emulex) on the same host.
Clustered virtual machines as part of VMware clusters (DRS or HA).
Use of MSCS in conjunction with VMware Fault Tolerance.
Migration with VMotion of clustered virtual machines.
In addition, vSphere MSCS setups have the following limitations:
You must use hardware version 7 with ESX 4.0.
With native multipathing (NMP), clustering is not supported when the path policy is set to round robin.
The boot disks of clustered virtual machines may be located on a SAN, but the ESX/ESXi host must boot from local storage.
Troubleshooting Persistent Reservations
At the time of writing there are no tools available on the ESX service console to query the Registrations and Persistent Reservations made by Windows 2008 cluster nodes.
There are some third party tools which can be run in the guest to query this information.
SCSI Command Utility – scu
SG driver utility – sg3_utils
There are PRs open for getting persistent reservations query options into vmkfstools (PR 342962) and getting sg3_utils into the COS (PR 358032).
The sg3_utils should be there at GA.
Enhancements to vmkfstools are probably K/L+6 at the earliest.
Lesson 2 Summary
A new SCSI-3 compliant SCSI Reservation mechanism known as Persistent Reservations supersedes the older SCSI-2 Reserve/Release mechanism.
Introduced on ESX 4.0 to support Microsoft Windows 2008 (Longhorn) Clustering.
A new Virtual SAS Controller has also been implemented in ESX 4 and this needs to be chosen as the controller type if deploying Microsoft Windows 2008 Failover Clustering in Virtual Machines.
VMFS continues to use SCSI-2 style (reserve/release) reservations.
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
iSCSI Enhancements
ESX 4 includes an updated iSCSI stack which offers improvements to both software iSCSI (initiator that runs at the ESX layer) and hardware iSCSI (a hardware-optimized iSCSI HBA).
For both software and hardware iSCSI, functionality (e.g. CHAP support, digest acceleration, etc.) and performance are improved.
Software iSCSI can now be configured to use host based multipathing if you have more than one physical network adapter.
In the new ESX 4.0 Software iSCSI stack, there is no longer any requirement to have a Service Console connection to communicate to an iSCSI target.
Software iSCSI Enhancements
iSCSI Advanced Settings
In particular, data integrity checks in the form of digests.
CHAP Parameters Settings
A user will be able to specify CHAP parameters as per-target CHAP and mutual per-target CHAP.
Inheritance model of parameters.
A global set of configuration parameters can be set on the initiator and propagated down to all targets.
Per target/discovery level configuration.
Configuration settings can now be set on a per target basis which means that a customer can uniquely configure parameters for each array discovered by the initiator.
Advanced Settings: Header & Data Digests
To protect the integrity of iSCSI headers and data, the iSCSI protocol defines error correction methods known as header digests and data digests.
These digests pertain to, respectively, the header and SCSI data being transferred between iSCSI initiators and targets, in both directions.
Header and data digests check the data integrity beyond the integrity checks that other networking layers provide, such as TCP and Ethernet, i.e. dropped packets or frames.
They are disabled by default in the Advanced Settings but may be enabled.
Enabling header and data digests does require additional processing for both the initiator and the target and can affect throughput performance and CPU usage overhead.
iSCSI Advanced Settings - Initiator
VMware recommends that you do not make any changes to the advanced iSCSI settings unless you are working with the VMware support team or otherwise have thorough information about the values to provide for the settings.
iSCSI Advanced Settings – Per Target
VMware recommends that you do not make any changes to the advanced iSCSI settings unless you are working with the VMware support team or otherwise have thorough information about the values to provide for the settings.
Inherit from parent implies that we can use the settings configured on the initiator, or we can override them here on a per-target basis.
Challenge Handshake Authentication Protocol (CHAP)
ESX/ESXi supports the Challenge Handshake Authentication Protocol (CHAP) that your iSCSI initiators can use for authentication purposes.
ESX/ESXi 4.0 supports unidirectional and bidirectional CHAP authentication for iSCSI. This differs from previous versions of ESX/ESXi, which supported unidirectional CHAP only.
In unidirectional CHAP authentication, the target authenticates the initiator, but the initiator does not authenticate the target.
In bidirectional, or mutual, CHAP authentication, an additional level of security enables the initiator to authenticate the target.
ESX/ESXi 4.0 also supports per-target CHAP authentication, which enables you to configure different credentials for each target to achieve greater target refinement.
CHAP Parameter Settings
Options are:
1. Prohibited
2. Discouraged
3. Preferred
4. Required
The target authenticates the initiator; the initiator authenticates the target.
Authentication Options
Required: The host requires successful CHAP authentication for the login to succeed. If CHAP authentication fails, the login fails.
Preferred: The host uses CHAP, but it is OK for the target not to authenticate.
Discouraged: The host prefers not to use CHAP but supports it, so if the target requires authentication, that is OK.
Prohibited: The host does not use CHAP. Any authentication request is failed.
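The four options reduce to a decision table: given the host setting and whether the target asks for CHAP, the login either uses CHAP, proceeds without it, or fails. A sketch in plain shell (the outcome labels `chap`/`nochap`/`fail` are descriptive, not product strings):

```shell
chap_login() {  # $1 = host policy, $2 = does the target ask for CHAP? (yes/no)
  case "$1:$2" in
    required:yes)    echo chap   ;;
    required:no)     echo fail   ;;  # host insists on CHAP authentication
    preferred:yes)   echo chap   ;;
    preferred:no)    echo nochap ;;  # OK for the target not to authenticate
    discouraged:yes) echo chap   ;;  # host yields if the target requires it
    discouraged:no)  echo nochap ;;
    prohibited:yes)  echo fail   ;;  # any authentication request is failed
    prohibited:no)   echo nochap ;;
  esac
}
chap_login required no     # prints: fail
chap_login discouraged no  # prints: nochap
```

Reading the table row by row is a quick way to predict which initiator/target policy combinations will produce login failures in vmkiscsid.log.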
Software iSCSI Multipathing – Port Binding
You can now create a port binding between a physical NIC and an iSCSI VMkernel port in ESX 4.0.
Using the “port binding" feature, users can map the multiple iSCSI VMkernel ports to different physical NICs. This will enable the software iSCSI initiator to use multiple physical NICs for I/O transfer.
Connecting the software iSCSI initiator to the VMkernel ports can only be done from the CLI using the esxcli swiscsi commands.
Host based multipathing can then manage the paths to the LUN.
In addition, the Round Robin path policy can be configured to simultaneously use more than one physical NIC for the iSCSI traffic to the iSCSI target.
Software iSCSI Multipathing – Port Binding (ctd)
# esxcfg-mpath -b -d naa.60a980006e424b396c5a4e31304b2f56
naa.60a980006e424b396c5a4e31304b2f56 : NETAPP iSCSI Disk (naa.60a980006e424b396c5a4e31304b2f56)
vmhba33:C1:T2:L10 LUN:10 state:active iscsi Adapter: iqn.1998-01.com.vmware:cs-tse-h33-21891206 Target: IQN=iqn.1992-08.com.netapp:sn.99921696 Alias= Session=00023d000001 PortalTag=1000
vmhba33:C0:T2:L10 LUN:10 state:active iscsi Adapter: iqn.1998-01.com.vmware:cs-tse-h33-21891206 Target: IQN=iqn.1992-08.com.netapp:sn.99921696 Alias= Session=00023d000002 PortalTag=1000
Software iSCSI Multipathing – Port Binding (ctd)
Note that IP routing is not supported with ‘port binding’ in ESX 4.0. Therefore you will need to create a VMkernel port for each subnet, and the target IP must be directly reachable through that VMkernel port.
This is the error you will get when trying to route to a target outside of your subnet.
Jan 9 15:30:33 vmkernel: 0:00:03:20.859 cpu5:7932)Tcpip_Vmk: 165: cannot honor socket binding with vmk1 (src_subnet=0xa154400, dst_subnet=0xa154800)
Jan 9 15:30:33 iscsid: cannot make connection to 10.21.74.242:3260 (101)
Jan 9 15:30:33 vmkernel: 0:00:03:20.859 cpu5:7932)Tcpip_Vmk: 165: cannot honor socket binding with vmk2 (src_subnet=0xa154400, dst_subnet=0xa154800)
Jan 9 15:30:33 iscsid: connection to discovery address 10.21.74.242 failed
Jan 9 15:30:33 iscsid: connection login retries (reopen_max) 5 exceeded
Jan 9 15:30:33 iscsid: cannot make connection to 10.21.74.242:3260 (101)
Jan 9 15:30:33 iscsid: connection to discovery address 10.21.74.242 failed
Jan 9 15:30:33 iscsid: connection login retries (reopen_max) 5 exceeded
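The src_subnet/dst_subnet values in the message above are hex-encoded network addresses. Decoding them (plain shell arithmetic, assuming the usual big-endian IPv4 byte order) makes the mismatch obvious — the VMkernel port and the discovery address sit on different subnets:

```shell
hex_to_ip() {
  n=$(( $1 ))   # shell arithmetic accepts 0x... constants
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}
hex_to_ip 0xa154400   # src_subnet of vmk1 -> prints: 10.21.68.0
hex_to_ip 0xa154800   # dst_subnet of target 10.21.74.242 -> prints: 10.21.72.0
```

The VMkernel ports live on 10.21.68.0 while the discovery address 10.21.74.242 is on 10.21.72.0, so the socket binding cannot be honored and every login attempt fails.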
Hardware iSCSI Limitations (RC)
Mutual CHAP is disabled.
Discovery is supported by IP address only (storage array name discovery not supported).
Running with the Hardware and Software iSCSI initiator enabled on the same host at the same time is not supported.
Troubleshooting iSCSI
Start with /var/log/vmkiscsid.log.
There is no longer an /etc/vmkiscsid.conf configuration file; all iSCSI configuration information is now held in a SQLite (http://www.sqlite.org/) database.
To see iSCSI configuration information, you can run:
"vmkiscsid --dump-db=/tmp/db.txt"
Useful for CHAP authentication and general login issues.
Also has useful information regarding each of the discovered targets.
The vmkiscsid -x option allows you to send ANY SQL command to the database. This should be used with extreme care.
vmkiscsid -x "select * from nodes;"
vmkiscsid -x "select * from nodes where \`node.name\`='iqn.2001-2003.com.equalogic';"
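For everyday troubleshooting, grepping the dump file is safer than issuing raw SQL. The dump content below is mocked up purely for illustration — the real layout of the --dump-db output may differ, and the second IQN is invented:

```shell
# Hypothetical dump content; the real field layout is an assumption.
cat > /tmp/db.txt <<'EOF'
node.name = 'iqn.1992-08.com.netapp:sn.84228148'
node.conn[0].address = '10.21.74.242'
node.session.auth.authmethod = 'CHAP'
node.name = 'iqn.2001-05.com.example:target2'
node.conn[0].address = '10.21.68.17'
EOF

# List just the discovered target names:
grep 'node.name' /tmp/db.txt | cut -d"'" -f2
```

The same pattern works for pulling out the CHAP auth method or portal addresses when chasing login failures.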
Lesson 3 Summary
Major performance improvements have been added to iSCSI functionality in ESX 4.0.
iSCSI improvements also include data integrity checks in the form of digests.
There is also support for bidirectional and per-target CHAP.
Port Binding support for software iSCSI, enabling multipathing.
No longer a requirement for a Service Console Network Interface.
Lesson 3 - Lab 2
Lab 2 involves configuring multipathing on the Software iSCSI initiator
Enable the Software iSCSI initiator
Modify the vSwitch to have 2 uplinks and 2 VMkernel ports
Use the esxcli swiscsi command to create port bindings
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
GUI Changes - Display Device Info
Note that there are no further references to vmhbaN:C:T:L names. Unique device identifiers such as the NAA id are now used.
GUI Changes - Display HBA Configuration Info
Again, notice the use of NAA ids rather than vmhbaN:C:T:L names.
GUI Changes - Display Path Info
Note the reference to the PSP & SATP. Note the (I/O) status designating the active path.
GUI Changes - Data Center Rescan
Degraded Status
If we detect fewer than 2 HBAs or fewer than 2 targets in the paths of the datastore, we mark the datastore multipathing status as “Partial/No Redundancy“ in the Storage Views.
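The rule above reduces to a couple of comparisons. A sketch in plain shell — the "Full Redundancy" label for the healthy case is an assumption, as the slide only names the degraded status:

```shell
redundancy_status() {  # $1 = HBAs in the datastore's paths, $2 = targets
  if [ "$1" -ge 2 ] && [ "$2" -ge 2 ]; then
    echo "Full Redundancy"          # assumed label for the healthy case
  else
    echo "Partial/No Redundancy"    # the status shown in Storage Views
  fi
}
redundancy_status 2 2   # prints: Full Redundancy
redundancy_status 1 2   # prints: Partial/No Redundancy
```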
Storage Administration
VI4 also provides new monitoring, reporting and alarm features for storage management.
This gives a vSphere administrator the ability to:
1. Manage access/permissions of datastores/folders
2. Have visibility of a Virtual Machine’s connectivity to the storage infrastructure
3. Account for disk space utilization
4. Provide notification in case of specific usage conditions
Datastore/Folders Permissions
Administrators now have the capability to organize datastores into folders and set permissions on a per folder or per datastore level.
An administrator can now block the creation of virtual machines or the creation of snapshots on a per datastore basis or through the use of folders.
To set a permission on an individual datastore, select the datastore and then the Permissions tab. Right clicking on this Permissions window will allow you to select the ‘Add Permission’ option.
To set a permission on a folder, select the folder, right click on the folder and then select the ‘Add Permission’ option.
Datastore/Folder Privileges (ctd)
New privileges associated with
Datastore Objects
Storage Reporting & Insight
Storage Reporting and Insight is a solution combining a backend web service contained in VMware VirtualCenter Management Webservices (called the Storage Management Service – SMS) and a vSphere Client UI plug-in, geared towards giving more insight into the storage infrastructure on two fronts:
Storage Connectivity:
Which SAN target does a datastore come from?
Are there redundant paths to the Virtual Machine’s storage?
Capacity Utilization:
How much snapshot space does a VM consume?
How much space on a datastore is used for snapshots?
The new Storage tab is available for all managed entities in VI4 (except networks).
SMS vSphere Storage Client Plugin
Change the view here. The default is to ‘Show all Virtual Machines’.
There is also a Storage View tab on each ESX managed by VC, as well as each VM.
SMS Architecture Overview
SMS computes capacity utilization and storage insight using data stored in the vCenter database.
No other out-of-band information is referenced currently.
Periodically fetches data from vCenter database
Less burden on vCenter.
Makes direct database calls.
Computes information and stores it in in-memory cache.
Front end vSphere Client plug-in uses Flash for storage reporting and C# (C Sharp) for storage maps.
Storage Views: Reports on Storage Utilization
A new reporting feature provides visibility into the relationship between vSphere entities and storage entities, e.g.
Datastore to VM or Host
VM/Host/Cluster to Datastore
VM to SCSI volume, path, adapter or target
The reports provided are searchable & customizable list views
Reports can be exported into files (e.g. “.csv”). Simply click on the Reports window and select ‘Export List…’. Select save type as CSV.
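Once exported, the CSV can be post-processed with ordinary tools. The column layout below is invented for illustration — the actual export columns depend on the selected view:

```shell
# Hypothetical export of a per-VM snapshot-space report.
cat > /tmp/report.csv <<'EOF'
VM,Snapshot Space (MB)
vm01,2048
vm02,512
vm03,1024
EOF

# Total snapshot consumption across the exported rows (skip the header):
awk -F, 'NR > 1 { total += $2 } END { print total " MB" }' /tmp/report.csv
# prints: 3584 MB
```

This is handy for answering the capacity questions on the previous slides (e.g. how much datastore space snapshots consume in aggregate) without clicking through each VM.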
Storage Views: Maps For Storage Connectivity
This is an easy way to view how many paths a VM has to its storage, and what targets it can see. It can also assist in troubleshooting by highlighting problem entities.
Storage View: Reports (ctd)
Volume capacity info Snapshot capacity info
Searchable & customizable list views
Datastore Monitoring & Alarms
vSphere introduces new datastore and VM-specific alarms/alerts on storage events:
New datastore alarms:
Datastore disk usage %
Datastore Disk Over allocation %
Datastore connection state to all hosts
New VM alarms:
VM Total size on disk (GB)
VM Snapshot size (GB)
Customers can now track snapshot usage.
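The over-allocation alarm is what makes thin provisioning trackable. One plausible reading of the metric — the exact formula the alarm uses is an assumption here — is provisioned space as a percentage of datastore capacity:

```shell
# Assumed formula: (total provisioned / capacity) * 100, integer arithmetic.
overallocation_pct() {  # $1 = total provisioned GB, $2 = datastore capacity GB
  echo $(( $1 * 100 / $2 ))
}
overallocation_pct 750 500   # thin disks promising 750 GB on 500 GB -> prints: 150
```

Anything over 100% means the thin disks could collectively outgrow the datastore, which is exactly the condition an administrator would want an alarm threshold on.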
New Storage Alarms
New Datastore-specific Alarms. New VM-specific Alarms.
This alarm allows the tracking of thin-provisioned disks.
This alarm will trigger if a datastore becomes unavailable to the host.
This alarm will trigger if a snapshot delta file becomes too large.
Troubleshooting SMS
Log level is read from:
C:\Program Files\VMware\Infrastructure\tomcat\webapps\sms\WEB-INF\classes\log4j.properties
Log level for RC is DEBUG.
#SMS configuration
log4j.logger.com.vmware.vim.sms=DEBUG
Possible log levels in increasing order of verbosity: FATAL, ERROR, WARN, INFO, DEBUG, and TRACE.
SMS logs information to:
C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\Logs\sms.log
Log files are rolling.
The sms.log stores the latest log messages. Older log files are named sms.log.<number>, where a bigger number represents an older log file; i.e. sms.log.1 stores more recent logs than sms.log.2.
Maximum size of individual log file is 1 MB.
Known Issues
Symptoms: The Storage views are not populated. The sms.log file reports an exception trying to connect to the SQL Server instance.
com.microsoft.sqlserver.jdbc.SQLServerException: The connection to the named instance has failed. Error: java.net.SocketTimeoutException: Receive timed out.
Root Cause: vCenter database is not reachable over the network by SMS after vCenter upgrade.
Fix: Enable TCP/IP through the SQL Server Configuration Manager. For each active IP address, ensure that:
Active = Yes
TCP Dynamic Ports = 0
Then restart the SQL Server instance service.
Alternatively, if the previous vCenter installation was running a default SQL Server Express instance, you can manually uninstall the older SQL Server Express instance and reinstall vCenter.
Lesson 4 Summary
VI4 has been greatly enhanced to manage storage entities in a way that we’ve become familiar with for other vSphere entities.
Storage can now be managed via ‘folders’ and permissions can now be set on a per folder or per datastore basis.
The new Storage Plug-in and Storage Management Service allow administrators to view storage connectivity and capacity usage of storage and snapshots.
Data Center wide rescans of storage can now be initiated from VI4.
VI4 introduces new storage related alarms for storage usage.
Lesson 4 - Lab 3
Enable the Storage Plug-in.
Look at Data Center Storage View
Look at Cluster Storage View
Look at the ESX Storage View
Look at the VM Storage View
Create a Datastore folder
Add a datastore to the newly created folder
Add event/alarm for capacity
Trigger permission issue
Trigger capacity event/alarm
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
Traditional Snapshot Detection
When an ESX 3.x server finds a VMFS-3 LUN, it compares the SCSI_DiskID information returned from the storage array with the SCSI_DiskID information stored in the LVM Header.
If the two IDs don’t match, then by default the VMFS-3 volume will not be mounted and will thus be inaccessible.
A VMFS volume on ESX 3.x could be detected as a snapshot for a number of reasons:
LUN id changed
SCSI version supported by array changed (firmware upgrade)
Identifier type changed – Unit Serial Number vs NAA id
New Snapshot Detection Mechanism
When trying to determine if a device is a snapshot, ESX 4.0 uses a globally unique identifier to identify each LUN, typically the NAA (Network Addressing Authority) ID.
NAA ids are unique and are persistent across reboots.
There are many different globally unique identifiers. If the LUN does not support any of these globally unique identifiers, ESX will fall back to the Serial number + LUN id used in ESX 3.0.
SCSI_DiskId Structure
The internal VMkernel structure SCSI_DiskId is populated with information about a LUN.
This is stored in the metadata header of a VMFS volume.
If the LUN does have a globally unique (NAA) id, the field SCSI_DiskId.data.uid in the SCSI_DiskId structure will hold it.
If the NAA id in the SCSI_DiskId.data.uid stored in the metadata does not match the NAA id returned by the LUN, ESX knows the LUN is a snapshot.
For older arrays that do not support NAA ids, the earlier algorithm is used, where we compare other fields in the SCSI_DiskId structure to detect whether a LUN is a snapshot or not.
Snapshot Log Messages
8:00:45:51.975 cpu4:81258)ScsiPath: 3685: Plugin 'NMP' claimed path 'vmhba33:C0:T1:L2'
8:00:45:51.975 cpu4:81258)ScsiPath: 3685: Plugin 'NMP' claimed path 'vmhba33:C0:T0:L2'
8:00:45:51.977 cpu2:81258)VMWARE SCSI Id: Id for vmhba33:C0:T0:L2
0x60 0x06 0x01 0x60 0x1d 0x31 0x1f 0x00 0xfc 0xa3 0xea 0x50 0x1b 0xed 0xdd 0x11 0x52 0x41 0x49 0x44 0x20 0x35
8:00:45:51.978 cpu2:81258)VMWARE SCSI Id: Id for vmhba33:C0:T1:L2
0x60 0x06 0x01 0x60 0x1d 0x31 0x1f 0x00 0xfc 0xa3 0xea 0x50 0x1b 0xed 0xdd 0x11 0x52 0x41 0x49 0x44 0x20 0x35
8:00:45:52.002 cpu2:81258)LVM: 7125: Device naa.600601601d311f00fca3ea501beddd11:1 detected to be a snapshot:
8:00:45:52.002 cpu2:81258)LVM: 7132: queried disk ID: <type 2, len 22, lun 2, devType 0, scsi 0, h(id) 3817547080305476947>
8:00:45:52.002 cpu2:81258)LVM: 7139: on-disk disk ID: <type 2, len 22, lun 1, devType 0, scsi 0, h(id) 6335084141271340065>
8:00:45:52.006 cpu2:81258)ScsiDevice: 1756: Successfully registered device "naa.600601601d311f00fca3ea501beddd11" from plugin "
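The decision in the LVM messages can be reproduced with a few lines of shell — again just a reading of the log text, not an ESX tool: extract the two h(id) values and compare them.

```shell
# A trimmed copy of the two LVM lines from the log above.
cat > /tmp/lvm.log <<'EOF'
LVM: 7132: queried disk ID: <type 2, len 22, lun 2, devType 0, scsi 0, h(id) 3817547080305476947>
LVM: 7139: on-disk disk ID: <type 2, len 22, lun 1, devType 0, scsi 0, h(id) 6335084141271340065>
EOF

# Pull out the hash of the ID returned by the LUN vs the one stored on disk.
queried=$(sed -n 's/.*queried disk ID.*h(id) \([0-9]*\)>.*/\1/p' /tmp/lvm.log)
ondisk=$(sed -n 's/.*on-disk disk ID.*h(id) \([0-9]*\)>.*/\1/p' /tmp/lvm.log)
if [ "$queried" != "$ondisk" ]; then
  echo "snapshot detected"   # prints: snapshot detected
fi
```

Note the lun field also differs (lun 2 queried vs lun 1 on disk), which is the visible clue in this particular log that the same data is being presented at a different LUN number.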
Resignature & Force-Mount
We have a new naming convention in ESX 4.
“Resignature” is equivalent to EnableResignature = 1 in ESX 3.x.
“Force-Mount” is equivalent to DisallowSnapshotLUN = 0 in ESX 3.x.
The advanced configuration options EnableResignature and DisallowSnapshotLUN have been replaced in ESX 4 with a new CLI utility called esxcfg-volume (vicfg-volume for ESXi).
However, it is no longer necessary to handle snapshots via the CLI: Resignature and Force-Mount now have full GUI support, and VC performs VMFS rescans on all hosts after a resignature operation.
Historically, EnableResignature and DisallowSnapshotLUN were server-wide settings that applied to all volumes on an ESX host. The new Resignature and Force-Mount operations are volume-specific, so they offer much greater granularity in the handling of snapshots.
Persistent Or Non-Persistent Mounts
If you use the GUI to force-mount a VMFS volume, it becomes a persistent mount which remains in place through reboots of the ESX host. VC will not allow this volume to be resignatured.
If you use the CLI to force-mount a VMFS volume, you can choose whether it persists or not through reboots.
Through the GUI, the Add Storage Wizard now displays the VMFS label. Therefore, if a device is not mounted but has a label associated with it, you can assume it is a snapshot or, to use ESX 4 terminology, a Volume Copy.
Mounting A Snapshot
Original volume is still presented to the ESX.
Snapshot – notice that the volume label is the same as the original volume.
Snapshot Mount Options
Keep Existing Signature – this is a force-mount operation, similar to disabling DisallowSnapshotLUN in ESX 3.x. The new datastore keeps the original UUID saved in the file system header.
If the original volume is already online, this option will not succeed, and a ‘Cannot change the host configuration’ message is displayed when resolving the VMFS volumes.
Assign a new Signature – this is a resignature operation, similar to enabling EnableResignature in ESX 3.x. The new datastore has a new UUID saved in the file system header.
Format the disk – destroys the data on the disk and creates a new VMFS volume on it.
New CLI Command: esxcfg-volume
There is a new CLI command in ESX 4 for resignaturing VMFS snapshots. Note the difference between ‘-m’ and ‘-M’:
# esxcfg-volume
esxcfg-volume <options>
-l|--list List all volumes which have been
detected as snapshots/replicas.
-m|--mount <VMFS UUID/label> Mount a snapshot/replica volume,
if its original copy is not
online.
-u|--umount <VMFS UUID/label> Umount a snapshot/replica volume.
-r|--resignature <VMFS UUID/label> Resignature a snapshot/replica
volume.
-M|--persistent-mount <VMFS UUID/label> Mount a snapshot/replica volume
persistently, if its original
copy is not online.
-h|--help Show this message.
esxcfg-volume (ctd)
The difference between a mount and a persistent mount is that a persistent mount will be maintained through reboots.
ESX manages this by adding entries for force mounts into the /etc/vmware/esx.conf.
A typical set of entries for a force mount looks like:
/fs/vmfs[48d247dd-7971f45b-5ee4-0019993032e1]/forceMountedLvm/forceMount = "true"
/fs/vmfs[48d247dd-7971f45b-5ee4-0019993032e1]/forceMountedLvm/lvmName = "48d247da-b18fd17c-1da1-0019993032e1"
/fs/vmfs[48d247dd-7971f45b-5ee4-0019993032e1]/forceMountedLvm/readOnly = "false"
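To see which volumes an ESX host is force-mounting, the esx.conf entries can simply be grepped. The sketch below runs against a local sample copy of the three entries above; on a real host you would point grep at /etc/vmware/esx.conf itself:

```shell
# Write a sample of the force-mount entries to a local file, then count them.
cat > esx.conf.sample <<'EOF'
/fs/vmfs[48d247dd-7971f45b-5ee4-0019993032e1]/forceMountedLvm/forceMount = "true"
/fs/vmfs[48d247dd-7971f45b-5ee4-0019993032e1]/forceMountedLvm/lvmName = "48d247da-b18fd17c-1da1-0019993032e1"
/fs/vmfs[48d247dd-7971f45b-5ee4-0019993032e1]/forceMountedLvm/readOnly = "false"
EOF
count=$(grep -c forceMountedLvm esx.conf.sample)
echo "force-mount related entries: $count"
rm -f esx.conf.sample
```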
Mount With the Original Volume Still Online
/var/log # esxcfg-volume -l
VMFS3 UUID/label: 496f202f-3ff43d2e-7efe-001f29595f9d/Shared_VMFS_For_FT_VMs
Can mount: No (the original volume is still online)
Can resignature: Yes
Extent name: naa.600601601d311f00fca3ea501beddd11:1 range: 0 - 20223 (MB)
/var/log # esxcfg-volume -m 496f202f-3ff43d2e-7efe-001f29595f9d
Mounting volume 496f202f-3ff43d2e-7efe-001f29595f9d
Error: Unable to mount this VMFS3 volume due to the original volume is still online
esxcfg-volume (ctd)
In this next example, a clone of a VMFS LUN is presented back to the same ESX server. We cannot use either the mount or persistent-mount option, since the original LUN is already presented to the host, so we will have to resignature:
# esxcfg-volume -l
VMFS3 UUID/label: 48d247dd-7971f45b-5ee4-0019993032e1/cormac_grow_vol
Can mount: Yes
Can resignature: No
Extent name: naa.6006016043201700f30570ed09f6da11:1 range: 0 - 15103 (MB)
esxcfg-volume (ctd)
# esxcfg-volume -r 48d247dd-7971f45b-5ee4-0019993032e1
Resignaturing volume 48d247dd-7971f45b-5ee4-0019993032e1
# vdf
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdg2 5044188 1595804 3192148 34% /
/dev/sdd1 248895 50780 185265 22% /boot
.
.
/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1
15466496 5183488 10283008 33% /vmfs/volumes/cormac_grow_vol
/vmfs/volumes/48d39951-19a5b934-67c3-0019993032e1
15466496 5183488 10283008 33% /vmfs/volumes/snap-397419fa-cormac_grow_vol
Warning – there is no vdf command in ESXi. However the df command reports on VMFS filesystems in ESXi.
Lesson 5 Summary
VI4 has a new, improved method for handling snapshots.
There is now GUI support for handling snapshots.
There is also a new CLI command for handling snapshots: esxcfg-volume (vicfg-volume for ESXi).
Snapshot handling is now done on a per volume basis rather than a system wide basis.
Snapshots can be made to mount persistently through reboots without resignaturing.
Lesson 5 - Lab 4
Snapshot Handling in VI4
Create a VMFS volume on an already presented LUN
Change the presentation LUN Id.
Observe ESX 4.0 handling the LUN Id change.
Create a snapshot LUN on the storage array.
Present the snapshot LUN to the ESX.
Observe the VMkernel messages.
Force mount the LUN from the CLI/GUI.
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
Storage VMotion Introduction
What are the use cases for Storage VMotion?
Moving VMs from one VMFS to another because we want to upgrade the source VMFS.
Evacuating physical storage that is about to be retired.
Rebalancing datastores due to space considerations.
Rebalancing datastores due to I/O load.
Moving VMs to tiered storage with different service levels due to changing business requirements for that VM.
Upgrade VMotion
ESX 3.0.1 introduced the concept of Upgrade VMotion – migrating the memory and storage of a Virtual Machine from VMFS-2 to VMFS-3 volumes.
Traditionally, upgrading a shared VMFS-2 volume to VMFS-3 incurred downtime for all connected hosts.
Upgrade VMotion involved creating a VMFS-3 volume and migrating the VMs from a shared VMFS-2 volume to the new VMFS-3 volume without incurring downtime.
Data transfer occurs entirely within the ESX 3.0.1 host (at disk-to-disk speeds), so there are no network connectivity issues.
Storage VMotion
Storage VMotion was introduced in ESX 3.5. It facilitates the migration of a running VM to new storage:
VM stays on the same host
VM disks may be individually placed
Storage type independent
Migration does not disturb the VM
No downtime
Transparent to the Guest OS and applications
Minimal performance impact
Storage VMotion Enhancements In VI4
The following enhancements have been made to the VI4 version of Storage VMotion:
GUI Support.
Leverages new features of VI4, including Fast Suspend/Resume and Changed Block Tracking.
Supports moving VMDKs from Thick to Thin formats and vice versa.
Ability to migrate RDMs to VMDKs.
Ability to migrate RDMs to RDMs.
Support for FC, iSCSI & NAS.
Storage VMotion no longer requires 2 x memory.
No requirement to create a VMotion interface for Storage VMotion.
Ability to move an individual disk without moving the VM’s home.
VI4 Storage VMotion In Action (source to destination)
1. Copy the VM home to the new location.
2. Start changed block tracking.
3. Pre-copy the disk to the destination (multiple iterations).
4. “Fast suspend/resume” the VM so it starts running on its new home and disks; copy all remaining disk blocks.
5. Delete the original VM home and disks.
New Features
Fast Suspend/Resume of VMs
This provides the ability to transition from the source VM to the destination VM reliably and with a very fast switching time. It is only necessary when migrating the .vmx file.
Changed Block Tracking
Very much like how memory is handled with standard VMotion, except that a bitmap of changed disk blocks is used rather than a bitmap of changed memory pages.
This means Storage VMotion no longer needs to snapshot the original VM and commit the snapshot to the destination VM, so the operation completes much faster.
The disk copy goes through multiple iterations; each time, the number of changed disk blocks shrinks, until eventually all disk blocks have been copied and we have a complete copy of the disk at the destination.
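The convergence of the iterative pre-copy can be illustrated with a toy calculation (the block counts and the 10% re-dirty rate are made-up numbers for illustration, not VMware internals):

```shell
# Each pass copies the blocks dirtied during the previous pass; the changed
# set shrinks until it is small enough to copy during the final switchover.
changed=100000   # blocks dirty at the start of the first pass
pass=0
while [ "$changed" -gt 100 ]; do
  pass=$((pass + 1))
  changed=$((changed / 10))   # assume ~10% of copied blocks get dirtied again
done
echo "converged after $pass passes with $changed blocks left to copy"
```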
SVMotion Internals
SVMotion in ESX 4.0 uses Fast Suspend/Resume along with Changed Block Tracking for VM relocation.
Step 1: Enable Changed Block Tracking. This creates a memory bitmap which keeps track of dirty blocks – references to all blocks which are changed by the guest OS.
(Diagram: guest reads and writes flow through the VMkernel to the vmdk on the source VMFS.)
SVMotion Internals (ctd)
Step 2: DiskLib (Data Mover) functionality is used to clone the disk. This uses the VMkernel Data Mover to create the disk on the destination VMFS.
(Diagram: DiskLib clones the vmdk via the VMkernel Data Mover from the source VMFS to a new vmdk on the destination VMFS, while guest reads and writes continue against the source.)
SVMotion Internals (ctd)
Step 3: Pre-copy the changed blocks, writing them to the destination VMDK over multiple iterations, until the remaining set of changed blocks is sufficiently small. That final set is merged with the destination disk, and no further pre-copy takes place.
(Diagram: CBT is queried to extract change-tracking information; the changed blocks are written from the source vmdk to the destination vmdk, ending with the final block.)
SVMotion Internals (ctd)
Step 4 (Fast Suspend/Resume): Create a destination VM.
Step 5: Transfer memory and memory reservations.
Step 6: Suspend the source VM and resume the destination VM.
These steps are only carried out if the migration involves the VM’s .vmx file.
(Diagram: the destination VM is created with no memory, then receives the memory reservation and transferred memory; the source VM is suspended and removed, and the destination VM resumes on the vmdk on the destination VMFS.)
Storage VMotion – GUI Support
Storage VMotion is still supported via the VI CLI 4.0 as well as the API, so customers wishing to use those methods can continue to do so.
The Change both host and datastore option is only available for powered-off VMs.
For a non-passthru RDM, you can select to convert it to either Thin Provisioned or Thick when converting it to a VMDK, or you can leave it as a non-passthru RDM.
Storage VMotion – CLI (ctd)
# svmotion --interactive
Entering interactive mode. All other options and environment variables will be ignored.
Enter the VirtualCenter service url you wish to connect to (e.g. https://myvc.mycorp.com/sdk, or just myvc.mycorp.com): VC-Linked-Mode.vi40.vmware.com
Enter your username: Administrator
Enter your password: ********
Attempting to connect to https://VC-Linked-Mode.vi40.vmware.com/sdk.
Connected to server.
Enter the name of the datacenter: Embedded-ESX40
Enter the datastore path of the virtual machine (e.g. [datastore1] myvm/myvm.vmx): [CLAR_L52] W2K3SP2/W2K3SP2.vmx
Enter the name of the destination datastore: CLAR_L53
You can also move disks independently of the virtual machine. If you want the disks to stay with the virtual machine, then skip this step.
Would you like to individually place the disks (yes/no)? no
Performing Storage VMotion.
0% |----------------------------------------------------------------------------------------------------| 100%
##########
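The same relocation can be expressed non-interactively. The sketch below only builds and echoes the command line it would run; the colon syntax appends the destination datastore to the VM's config path, as documented for the VI CLI, but check svmotion --help on your installation before relying on it:

```shell
# Build (but do not execute) the equivalent non-interactive invocation.
cmd="svmotion --url=https://VC-Linked-Mode.vi40.vmware.com/sdk \
--username=Administrator --password='***' \
--datacenter='Embedded-ESX40' \
--vm='[CLAR_L52] W2K3SP2/W2K3SP2.vmx:CLAR_L53'"
echo "$cmd"
```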
Limitations
The migration of Virtual Machines which have snapshots will not be supported at GA.
Currently the plan is to have this in K/L U2, a future release.
The migration of Virtual Machines to a different host and a different datastore simultaneously is not yet supported.
No firm date for support of this feature yet.
Troubleshooting Storage VMotion
To assist in troubleshooting Storage VMotion, the source-side vmware.log is copied to the destination as vmware-0.log.
To investigate destination power-on failures, run a ‘tail’ command against the destination vmware.log file.
The proc node /proc/vmware/migration/history continues to exist in ESX 4 and provides very useful information on Storage VMotion operations as well as standard VMotion operations.
Storage VMotion Timeouts
There are also a number of tunable timeout values:
Downtime timeout
Failure: “Source detected that destination failed to resume.” Update fsr.maxSwitchoverSeconds (default 100 seconds) in the VM’s .vmx file.
May be observed on Virtual Machines that have lots of virtual disks.
Data timeout
Failure: “Timed out waiting for migration data.” Update migration.dataTimeout (default 60 seconds) in the VM’s .vmx file.
May be observed when migrating from NFS to NFS on slow networks.
Lesson 6 Summary
Storage VMotion in VI4 no longer uses VM snapshots.
It uses new features like Fast Suspend/Resume of VMs and Changed Block Tracking.
Storage VMotion in VI4 supports different storage technologies, e.g. FC, iSCSI, NAS.
Storage VMotion in VI4 supports format conversions during migration, e.g. thick, thin, RDM to VMDK.
Previous limitations have been removed:
Doubling of VM’s memory resources.
VMotion interface requirement.
Requirement to move the VM’s home datastore.
Lesson 6 – Lab 5
Storage VMotion
Migrate a VM from one datastore to a different datastore using the new GUI enhancements.
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
Thin Provisioning Introduction
VMware thin provisioning enables Virtual Machines to utilize storage space on an as-needed basis, further reducing the cost of storage for virtual environments.
Thin provisioning virtualizes storage capacity.
Thin provisioning allows users to optimally but safely utilize available storage space by using advanced concepts such as over-allocation/over-committing of storage.
The VM thinks it has access to a large amount of storage, but the actual physical footprint is much smaller.
It is allocated/expanded on-demand by the VMFS3 driver if and when the guest OS needs it.
Alarms and reports that specifically track allocation versus current usage of storage capacity allow storage administrators to optimize the allocation of storage for virtual environments.
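The over-commitment arithmetic is worth making concrete. The figures below are invented for illustration:

```shell
# Ten VMs, each given a 500 GB thin disk that currently holds ~100 GB of data,
# sharing a single 2048 GB datastore.
vms=10
provisioned=$((vms * 500))   # GB the guests believe they own
used=$((vms * 100))          # GB the VMFS3 driver has actually allocated
capacity=2048                # GB physically available in the datastore
echo "provisioned ${provisioned} GB, used ${used} GB of ${capacity} GB"
```

The datastore is over-committed (5000 GB promised against 2048 GB real), which is exactly why the allocation-versus-usage alarms mentioned above matter.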
Thin Provisioning Introduction (ctd)
Thin Virtual Disks
Create VM/Reconfigure offers a choice to create thin disks.
Ability to convert disks to thin/thick during disk copy in clone/relocate/Storage VMotion.
Ability to inflate thin disks to thick.
Thin Provisioning Benefits
Reduces disk space usage and increases storage utilization.
Cascading impacts: performance, backup/recovery.
Control capacity through on-demand provisioning.
Problem Addressed By Thin Provisioning
Example: If a 500 GB VMDK is allocated to an application with only 100 GB of actual data, the other 400 GB has no data stored on it.
That unused capacity is still dedicated to that application and no other application can use it.
This means that the unused 400 GB is wasted storage.
Which means that it is also wasted money.
And even though all of the storage capacity may eventually be used, it could take years to do so.
This is a major problem when managing storage capacity and is often referred to as stranded storage, or allocated-but-unused storage.
Thin Provisioning prevents stranded storage scenarios.
Layers Of Storage Indirection
LUN Provisioned at Array
VMFS Volume/Datastore Provisioned for ESX
Virtual Disk Provisioned for VM
Thin Provisioning Options
(Diagram: thin provisioning can be applied at two layers – as a thin virtual disk at the VMFS layer and as a thin provisioned LUN within the array – so the capacity provisioned at each layer, e.g. 500 GB, can far exceed the space actually consumed, e.g. 40 GB or 15 GB.)
Thin Provisioning Operations
You can choose to deploy a thin provisioned disk during the following operations:
1. Virtual Machine Creation
2. Clone to Template
3. Clone Virtual Machine
4. Migrate Virtual Machine
Thin Provisioning & NFS
vSphere will now automatically allocate a thin provisioned disk if it discovers that the underlying storage is NFS.
Troubleshooting With ‘du’ & ‘stat’ Commands
To print the number of disk blocks allocated (in 512-byte units):
# stat /vmfs/volumes//*-flat.vmdk
Multiply that number by 512 bytes to get the amount of space actually utilized by the file.
# ls -l /vmfs/volumes/48d39951-19a5b934-67c3-0019993032e1/cormac_test_vm
-rw------- 1 root root 4294967296 Sep 29 15:18 cormac_test_vm-flat.vmdk
-rw------- 1 root root 422 Sep 29 15:18 cormac_test_vm.vmdk
-rw------- 1 root root 0 Sep 29 15:18 cormac_test_vm.vmsd
-rwxr-xr-x 1 root root 1697 Sep 29 15:18 cormac_test_vm.vmx
-rw------- 1 root root 269 Sep 29 15:18 cormac_test_vm.vmxf
# cd /vmfs/volumes/48d39951-19a5b934-67c3-0019993032e1/cormac_test_vm
# stat cormac_test_vm-flat.vmdk
File: "cormac_test_vm-flat.vmdk"
Size: 4294967296 Blocks: 0 IO Block: 131072 regular file
Device: eh/14d Inode: 25171716 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2008-09-29 15:18:37.000000000
Modify: 2008-09-29 15:18:37.000000000
Change: 2008-09-29 15:18:37.000000000
#
Alternatively, use:
# du -sh /vmfs/volumes//*-flat.vmdk
Warning – there is no du command in ESXi, but there is a stat command.
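The same Size-versus-Blocks behaviour can be reproduced on any Linux box by letting a sparse file stand in for a thin disk (GNU coreutils stat; the Service Console output above shows the same Blocks field, though ESXi's busybox stat differs):

```shell
# Create a sparse 1 GB file: large apparent size, almost nothing allocated.
truncate -s 1G thin-demo-flat.vmdk
apparent=$(stat -c %s thin-demo-flat.vmdk)   # logical size in bytes
blocks=$(stat -c %b thin-demo-flat.vmdk)     # allocated 512-byte blocks
actual=$((blocks * 512))                     # bytes really consumed on disk
echo "apparent=${apparent} bytes, actual=${actual} bytes"
rm -f thin-demo-flat.vmdk
```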
Lesson 7 Summary
VI4 introduces complete support for thin disks.
This enables virtualization of storage.
Thin provisioning is available for many Virtual Machine operations, including creation, cloning and migration.
Lesson 7 – Lab 6
Thin Provisioning
Create a VM with thin provisioning
Observe the attributes of the VM’s disk files
Inflate the disk from thin to thick
Observe the new attributes of the VM’s disk files
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
Supported Disk Growth/Shrink Operations
VI4 introduces the following growth/shrink operations:
Grow VMFS volumes: yes
Grow RDM volumes: yes
Grow *.vmdk: yes
Shrink VMFS volumes: no
Shrink RDM volumes: yes
Shrink *.vmdk: no
Volume Grow / Hot VMDK Extend
Volume Grow
VI4 allows dynamic expansion of a volume partition by adding capacity to a VMFS without disrupting running Virtual Machines.
Once the LUN backing the datastore has been grown (typically through an array management utility), Volume Grow expands the VMFS partition on the expanded LUN.
Historically, the only way to grow a VMFS volume was to use the extent-based approach. Volume Grow offers a different method of capacity growth.
The newly available space appears as a larger VMFS volume along with an associated grow event in vCenter.
Hot VMDK Extend
Hot extend is supported for VMFS flat virtual disks in persistent mode and without any Virtual Machine snapshots.
Used in conjunction with the new Volume Grow capability, the user has maximum flexibility in managing growing capacity in VI4.
Comparison: Volume Grow & Add Extent

                                             Volume Grow                      Add Extent
Must power-off VMs                           No                               No
Can be done on newly-provisioned LUN         No                               Yes
Can be done on existing array-expanded LUN   Yes                              Yes (but not allowed through GUI)
Limits                                       An extent can be grown any       A datastore can have up to 32
                                             number of times, up to 2TB.      extents, each up to 2TB.
Results in creation of new partition         No                               Yes
VM availability impact                       None, if datastore has only      Introduces dependency on
                                             one extent.                      first extent.
Volume Grow GUI Enhancements
Here I am choosing the same device on which the VMFS is installed – there is currently 4GB free.
This option selects to expand the VMFS using free space on the current device.
Notice that the current extent capacity is 1GB.
Volume Grow Messages - /var/log/vmkernel
Sep 18 14:04:09 cs-tse-f116 vmkernel: 2:21:59:52.522 cpu7:4110)LVM: ExpandVolume:6172: dev <naa.60060160432017003461b060f9f6da11:1> spaceToAdd <0>
Sep 18 14:04:09 cs-tse-f116 vmkernel: 2:21:59:52.522 cpu7:4110)LVM: ExpandVolume:6202: Using all available space (5368709120).
Sep 18 14:04:09 cs-tse-f116 vmkernel: 2:21:59:52.840 cpu7:4110)LVM: AddVolumeCapacity:6113: Successfully added space (0) on device naa.60060160432017003461b060f9f6da11:1 to volume 48d247da-b18fd17c-1da1-0019993032e1
VMFS Grow - Expansion Options
(Diagram: Dynamic LUN Expansion applies to the LUN provisioned at the array, VMFS Volume Grow to the VMFS volume/datastore provisioned for ESX, and VMDK Hot Extend to the virtual disk provisioned for the VM.)
Hot VMDK Extend
Hot VMDK Extend – vmware.log
Sep 18 14:18:58.085: vmx| DISKLIB-VMFS : "/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA-flat.vmdk" : open successful (16) size = 2791729152, hd = 311315. Type 3
Sep 18 14:18:58.085: vmx| DISKLIB-DSCPTR: Opened [0]: "FalconStor-NSSVA-flat.vmdk" (0x10)
Sep 18 14:18:58.086: vmx| DISKLIB-LINK : Opened '/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA.vmdk' (0x10): vmfs, 5452596 sectors / 2.6 GB.
Sep 18 14:18:58.086: vmx| DISKLIB-LIB : Opened "/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA.vmdk" (flags 0x10).
Sep 18 14:18:58.086: vmx| DISKLIB-LIB : Growing disk '/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA.vmdk' : createType = vmfs
Sep 18 14:18:58.087: vmx| DISKLIB-LIB : Growing disk '/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA.vmdk' : capacity = 5452596 sectors - 2.6 GB
Sep 18 14:18:58.087: vmx| DISKLIB-LIB : Growing disk '/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA.vmdk' : new capacity = 7549748 sectors - 3.6 GB
Sep 18 14:18:58.464: vmx| DISKLIB-LINK : "/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA.vmdk.dfgshkgrw-tmp" : creation successful.
Sep 18 14:18:58.680: vmx| DISKLIB-VMFS : "/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA-flat.vmdk" : closed.
Sep 18 14:18:58.684: vmx| DISKLIB-VMFS : "/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA-flat.vmdk" : open successful (25) size = 3865470976, hd = 0. Type 3
Sep 18 14:18:59.387: vmx| DISKLIB-DDB : "geometry.cylinders" = "469" (was "339")
Sep 18 14:18:59.618: vmx| DISKLIB-VMFS : "/vmfs/volumes/48d247dd-7971f45b-5ee4-0019993032e1/FalconStor-NSSVA/FalconStor-NSSVA-flat.vmdk" : closed.
Lesson 8 - Summary
VI4 introduces two new techniques for dynamic expansion of storage:
For VMFS volumes, there is the Volume Grow technique.
For Virtual Machine disk files, there is the VMDK Hot Extend technique.
Lesson 8 – Lab 7
(a) Volume Grow
Using an FC array, e.g. Clariion, grow the size of an underlying VMFS volume LUN.
Rescan the SAN.
Grow the VMFS volume to the new capacity – on the fly.
(b) VMDK Hot Extend
Use the Hot Extend mechanism to increase the size of a VMDK
From within the VM, dynamically grow the Guest OS file system to use this newly allocated disk space.
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
ESX 4.0 CLI
There have been a number of new storage commands introduced with ESX 4.0 as well as enhancements to the more traditional commands.
Some of these we have already observed in action:
esxcli
esxcfg-mpath / vicfg-mpath
esxcfg-volume / vicfg-volume
esxcfg-scsidevs / vicfg-scsidevs
esxcfg-rescan / vicfg-rescan
esxcfg-module / vicfg-module
vmkfstools
This topic will look at these commands in more detail.
New/Updated CLI Commands (1): esxcfg-scsidevs
The esxcfg-vmhbadevs command has been replaced by the esxcfg-scsidevs command.
To display the old VMware Legacy identifiers (vml), use:
# esxcfg-scsidevs -u
To display Service Console devices:
# esxcfg-scsidevs -c
To display all logical devices on this host:
# esxcfg-scsidevs -l
To show the relationship between COS native devices (/dev/sd) and vmhba devices:
# esxcfg-scsidevs -m
The VI CLI 4.0 has an equivalent vicfg-scsidevs for ESXi.
esxcfg-scsidevs (ctd)
Sample output of esxcfg-scsidevs -l:
naa.600601604320170080d407794f10dd11
Device Type: Direct-Access
Size: 8192 MB
Display Name: DGC Fibre Channel Disk (naa.600601604320170080d407794f10dd11)
Plugin: NMP
Console Device: /dev/sdb
Devfs Path: /vmfs/devices/disks/naa.600601604320170080d407794f10dd11
Vendor: DGC Model: RAID 5 Revis: 0224
SCSI Level: 4 Is Pseudo: false Status: on
Is RDM Capable: true Is Removable: false
Is Local: false
Other Names:
vml.0200000000600601604320170080d407794f10dd11524149442035
Note that this is one of the few CLI commands which will report the LUN size.
New/Updated CLI Commands (2): esxcfg-rescan
You now have the ability to rescan based on whether devices were added or devices were removed.
You can also rescan the current paths and not try to discover new ones.
# esxcfg-rescan -h
esxcfg-rescan <options> [adapter]
-a|--add Scan for only newly added devices.
-d|--delete Scan for only deleted devices.
-u|--update Scan existing paths only and update their state.
-h|--help Display this message.
The VI CLI 4.0 has an equivalent vicfg-rescan command for ESXi.
New/Updated CLI Commands (3): vmkfstools
The vmkfstools command exists in both the Service Console and the VI CLI 4.0.
Grow a VMFS:
vmkfstools -G
Inflate a VMDK from thin to thick:
vmkfstools -j
Import a thick VMDK to thin:
vmkfstools -i <src> -d thin
Import a thin VMDK to thick:
vmkfstools -i <src thin disk> -d zeroedthick
A host may have crashed during a volume operation
Oct 31 16:53:35 bs-pse-i143 vmkernel: 0:00:30:29.564 cpu3:4109)VC: RescanVolumes:656: Open volume 48c693ef-c7f30e18-6073-001a6467a6de (f530) will persist across rescan
Oct 31 16:53:35 bs-pse-i143 vmkernel: 0:00:30:29.610 cpu3:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (9 tries left)
Oct 31 16:53:36 bs-pse-i143 vmkernel: 0:00:30:30.630 cpu3:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (8 tries left)
Oct 31 16:53:37 bs-pse-i143 vmkernel: 0:00:30:31.650 cpu3:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (7 tries left)
Oct 31 16:53:38 bs-pse-i143 vmkernel: 0:00:30:32.670 cpu3:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (6 tries left)
Oct 31 16:53:39 bs-pse-i143 vmkernel: 0:00:30:33.690 cpu3:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (5 tries left)
Oct 31 16:53:40 bs-pse-i143 vmkernel: 0:00:30:34.710 cpu2:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (4 tries left)
Oct 31 16:53:41 bs-pse-i143 vmkernel: 0:00:30:35.730 cpu2:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (3 tries left)
Oct 31 16:53:42 bs-pse-i143 vmkernel: 0:00:30:36.750 cpu2:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (2 tries left)
Oct 31 16:53:43 bs-pse-i143 vmkernel: 0:00:30:37.770 cpu2:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (1 tries left)
Oct 31 16:53:44 bs-pse-i143 vmkernel: 0:00:30:38.790 cpu2:4109)LVM: OpenDevice:3723: Device <(mpx.vmhba1:C0:T1:L0:1, 3573783040), 47e8b8f9-d92da758-86c4-001a6467a6de> locked by 48e90aa5-6bccdad8-d017-001a6467a6dc at 1225385007187365 (0 tries left)
Oct 31 16:53:45 bs-pse-i143 vmkernel: 0:00:30:39.810 cpu2:4109)WARNING: LVM: OpenDevice:3777: Device mpx.vmhba1:C0:T1:L0:1 still locked. A host may have crashed during a volume operation. See vmkfstools -B command.
Oct 31 16:53:45 bs-pse-i143 vmkernel: 0:00:30:39.826 cpu2:4109)LVM: ProbeDeviceInt:5697: mpx.vmhba1:C0:T1:L0:1 => Lock was not free
A host may have crashed during a volume operation
# vdf
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc2 5044188 1696196 3091756 36% /
/dev/sda1 248895 50761 185284 22% /boot
/vmfs/devices 1288913559 0 1288913559 0% /vmfs/devices
# vmkfstools -B /vmfs/devices/disks/mpx.vmhba1\:C0\:T1\:L0:1
VMware ESX Question:
LVM lock on device mpx.vmhba1:C0:T1:L0:1 will be forcibly broken. Please consult vmkfstools or ESX documentation to understand the consequences of this.
Please ensure that multiple servers aren't accessing this device.
Continue to break lock?
0) Yes
1) No
Please choose a number [0-1]: 0
Successfully broke LVM device lock for /vmfs/devices/disks/mpx.vmhba1:C0:T1:L0:1
# esxcfg-rescan vmhba1
# vdf
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc2 5044188 1696408 3091544 36% /
/dev/sda1 248895 50761 185284 22% /boot
/vmfs/devices 1288913559 0 1288913559 0% /vmfs/devices
/vmfs/volumes/48c693ef-c7f30e18-6073-001a6467a6de
142606336 6316032 136290304 4% /vmfs/volumes/cos143
#
Lesson 9 - Summary
ESX/ESXi 4.0 introduces changes to some familiar CLI commands
esxcfg-mpath / vicfg-mpath
esxcfg-rescan / vicfg-rescan
vmkfstools
esxcfg-module / vicfg-module
ESX/ESXi 4.0 also introduces some new CLI commands
esxcli
esxcfg-volume / vicfg-volume
esxcfg-scsidevs / vicfg-scsidevs
ESX 4.0 deprecates some CLI commands
esxcfg-vmhbadevs
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
Paravirtualization
Paravirtualization is an enhancement of virtualization technology in which a Guest OS has some awareness that it is running inside a virtual machine rather than on physical hardware.
The Guest OS is tailored to run on top of the virtual machine monitor (VMM).
At its most basic level, paravirtualization eliminates the need to trap privileged instructions as it uses hypercalls to request that the underlying hypervisor execute those privileged instructions.
Handling unexpected or unallowable conditions via trapping can be time-consuming and can impact performance.
By removing this reliance on trapping, paravirtualization minimizes overhead and optimizes the performance of the Guest OS.
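The difference can be sketched with a toy model (purely illustrative, not real VMM code): the trap path must first decode the faulting instruction before emulating it, while a hypercall names the requested operation up front:

```python
class ToyHypervisor:
    """Toy model contrasting trap-and-emulate with hypercall dispatch."""

    def __init__(self):
        self.steps = 0  # count of work items the hypervisor performs

    def _decode(self, raw_instruction):
        self.steps += 1          # extra work only the trap path pays
        return raw_instruction.split()[0]

    def _emulate(self, op):
        self.steps += 1
        return f"emulated {op}"

    def handle_trap(self, raw_instruction):
        # Full virtualization: the CPU traps on a privileged instruction;
        # the hypervisor must decode it, then emulate it.
        return self._emulate(self._decode(raw_instruction))

    def hypercall(self, op):
        # Paravirtualization: the guest explicitly requests the operation,
        # so no trap delivery or instruction decoding is needed.
        return self._emulate(op)
```

In this model a trapped privileged instruction costs two hypervisor steps; the equivalent hypercall costs one.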
Paravirtualized SCSI driver
Our current I/O stack (under Windows) looks like this:
1. Application
2. Guest OS
3. Guest Device Driver for the virtual device
(specifically, LSI Logic driver)
4. ESX layer (e.g. VMM, virtual device, VMFS)
5. Physical Device Driver
(e.g. Qlogic or Emulex for FC, LSI for on-motherboard SCSI, etc.)
At layers 3 and 5, these device drivers are outside of VMware's control.
Adding our own paravirtualized device driver eliminates layer 3.
Paravirtualized SCSI driver (ctd)
• The purpose of a paravirtualized SCSI driver is to improve the CPU efficiency and the I/O latency of storage operations for an application running in a VM.
• There is a paravirtual SCSI driver for Linux & Windows
Paravirtualized SCSI Limitations
Paravirtual SCSI adapters will be supported on the following guest operating systems:
Windows 2008
Windows 2003
Red Hat Enterprise Linux (RHEL) 5
The following features are not supported with Paravirtual SCSI adapters:
Boot disks
Record/Replay
Fault Tolerance
MSCS Clustering
Lesson 10 - Summary
ESX/ESXi 4.0 introduces a new paravirtualized SCSI device driver.
The new paravirtualized SCSI device driver improves I/O performance of the applications running in the Guest OS.
It is listed as not supported in the ESX/ESXi 4.0 RC release notes, but may well have some form of support at GA.
Module 3 Lessons
Lesson 1 - Pluggable Storage Architecture
Lesson 2 - SCSI-3 & MSCS Support
Lesson 3 - iSCSI Enhancements
Lesson 4 - Storage Administration & Reporting
Lesson 5 - Snapshot Volumes & Resignaturing
Lesson 6 - Storage VMotion
Lesson 7 - Thin Provisioning
Lesson 8 - Volume Grow / Hot VMDK Extend
Lesson 9 - Storage CLI Enhancements
Lesson 10 – Paravirtualized SCSI Driver
Lesson 11 – Service Console Storage
COS File System (ctd)
The new root/boot disk layout of ESX 4.0 may confuse many customers. For instance, this ESX host has a single local disk.
# fdisk -lu /dev/cciss/c0d0
Disk /dev/cciss/c0d0: 146.7 GB, 146778685440 bytes
255 heads, 63 sectors/track, 17844 cylinders, total 286677120 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/cciss/c0d0p1 63 514079 257008+ 83 Linux
/dev/cciss/c0d0p2 514080 738989 112455 fc Unknown
/dev/cciss/c0d0p3 738990 286663859 142962435 5 Extended
/dev/cciss/c0d0p5 739053 286663859 142962403+ fb Unknown
Notice that there is only a single Linux partition – this is for /boot. The rest of the disk is taken up by vmkcore and VMFS. So where is the Service Console’s root (/) partition?
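The block counts in the fdisk listing above can be checked by hand: fdisk reports sizes in 1K blocks, i.e. 512-byte sectors divided by two, and a trailing `+` marks a half-block remainder. A quick worked check (plain Python arithmetic, nothing ESX-specific):

```python
def fdisk_blocks(start, end, sector_size=512):
    """Reproduce fdisk's 'Blocks' column from start/end sector numbers."""
    sectors = end - start + 1
    kb, remainder = divmod(sectors * sector_size, 1024)
    # fdisk appends '+' when the size is not a whole number of 1K blocks
    return f"{kb}+" if remainder else str(kb)

# /dev/cciss/c0d0p1 above spans sectors 63..514079
print(fdisk_blocks(63, 514079))         # 257008+
# /dev/cciss/c0d0p5 spans sectors 739053..286663859
print(fdisk_blocks(739053, 286663859))  # 142962403+
```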
COS File System (ctd)
# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 5036316 1634040 3146444 35% /
/dev/cciss/c0d0p1 248911 50781 185279 22% /boot
A df command output reveals a /dev/sda device on which the Service Console's root partition is mounted. How is this possible if there is only a single local disk? Where has this /dev/sda come from?
The /dev/sda device can be thought of as a virtual storage device presented to the Service Console in much the same way that a virtual disk is presented to a Virtual Machine; it appears as a local volume.
COS File System (ctd)
How can I tell what devices are in fact virtual disks?
# vsd -l
vsa0:0:0 /dev/sda
Format is vsaC:T:L
How can I tell which virtual machine disk file is being used to back this virtual disk?
# vsd -g
/vmfs/volumes/48ca2965-7d4a93bc-7228-001a4bbe2f02/esxconsole-48ca28be-fd63-23c8-ba1c-001a4bbe2f00/esxconsole.vmdk
The root file system is always backed by a vmdk with a name of /vmfs/volumes/*/esxconsole-*/esxconsole.vmdk
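Since the backing vmdk always follows this naming pattern, a script can identify it with a simple glob-style match. Here is a hypothetical helper (the function name and the use of `fnmatch` are my own illustration, not a VMware tool):

```python
import fnmatch

# The naming pattern described above for the COS root-backing disk.
COS_VMDK_PATTERN = "/vmfs/volumes/*/esxconsole-*/esxconsole.vmdk"

def is_cos_backing(path):
    """Return True if path matches the esxconsole.vmdk naming pattern."""
    return fnmatch.fnmatch(path, COS_VMDK_PATTERN)

print(is_cos_backing(
    "/vmfs/volumes/48ca2965-7d4a93bc-7228-001a4bbe2f02/"
    "esxconsole-48ca28be-fd63-23c8-ba1c-001a4bbe2f00/esxconsole.vmdk"
))  # True
print(is_cos_backing("/vmfs/volumes/ds1/myvm/myvm.vmdk"))  # False
```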
COS File Systems (ctd)
# fdisk -lu
Disk /dev/cciss/c0d0: 146.7 GB, 146778685440 bytes
255 heads, 32 sectors/track, 35132 cylinders, total 286677120 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/cciss/c0d0p1 32 514079 257024 83 Linux
/dev/cciss/c0d0p2 514080 734399 110160 fc Unknown
/dev/cciss/c0d0p3 734400 286677119 142971360 5 Extended
/dev/cciss/c0d0p5 734432 286677119 142971344 fb Unknown
Disk /dev/sda: 6084 MB, 6084886528 bytes
255 heads, 63 sectors/track, 739 cylinders, total 11884544 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/sda1 63 1638629 819283+ 82 Linux swap / Solaris
/dev/sda2 1638630 11872034 5116702+ 83 Linux
Lesson 11 - Summary
ESX/ESXi 4.0 introduces some changes to the Service Console Storage subsystem.
The concept of virtual disks for the Service Console is introduced.
Miscellaneous Storage Features In VI4
Storage General
The number of LUNs that can be presented to the ESX 4.0 server is still 256.
VMFS
The maximum extent volume size in VI4 is still 2TB.
Maximum number of extents is still 32, so maximum volume size is still 64TB.
We are still using VMFS3, not VMFS4 (although the version has increased to 3.33).
There is still no file system checker.
iSCSI Enhancements
10 GbE iSCSI Initiator – iSCSI over a 10GbE interface is supported. First introduced in ESX/ESXi 3.5 U2 and extended back to ESX/ESXi 3.5 U1.
Miscellaneous (ctd)
NFS Enhancements
IPv6 support (experimental)
Support for up to 64 NFS volumes (the old limit was 32)
10 GbE NFS Support – NFS over a 10GbE interface is supported. First introduced in ESX/ESXi 3.5 u2
FC Enhancements
Support for 8Gb Fibre Channel First introduced in ESX/ESXi 3.5 u2
Support for FC over Ethernet (FCoE)
Guest OS Enhancements
A new SCSI hardware device for the Guest OS and a paravirtualized SCSI driver to improve the performance of storage operations.
Miscellaneous (ctd)
ESX 3.x boot-time LUN selection – which sd device represents an iSCSI disk and which represents an FC disk?
Miscellaneous (ctd)
ESX 4.0 boot-time LUN selection. Hopefully this will address incorrect LUN selections during install/upgrade.