-
Allscripts Enterprise
This page contains Allscripts proprietary information and is not
to be duplicated or disclosed to unauthorized persons. 1
Allscripts Enterprise
VMware Best Practices
Production Database Server
Last Updated 1:00 PM, January 7, 2013
OVERVIEW
Virtualization of the Allscripts Enterprise EHR Production
Database Server is
supported provided that a predictable, well configured
environment can be verified
and maintained. To that end, Allscripts only supports production
database
virtualization if the below best practices and configuration
guidelines for optimal
performance are followed.
The configuration settings below, especially the hardware
options, should be taken as recommendations for best overall
performance, not absolute requirements, and need to be balanced
against available physical assets and technical resources.
SOFTWARE
To meet the technical & performance requirements of the
Enterprise EHR and give
the customers the best user experience possible, Allscripts
supports virtualization of a production database server environment
only if VMware vSphere 5.0 or higher is
used.
HARDWARE
CPU
Use processors that support Hardware-Assisted Virtualization,
specifically:
o CPU: VT-x or AMD-V
o Memory: Intel EPT or AMD RVI
o I/O: VT-d or AMD-Vi (optional)
Networking
Use NICs that support the following options:
o Checksum offload
o TCP Segmentation Offload (TSO)
o Ability to handle 64-bit DMA Addresses
o Ability to handle multiple Scatter Gather elements per TX frame
o Jumbo Frames (JF)
o Large Receive Offload (LRO)
If using 10GB NICs:
o NetQueue
o Single-Port NICs should use PCIe x8 (or higher) or PCI-X 266
bus architecture
o Dual-Port NICs should use PCIe x16 (or higher) bus
architecture
Storage
o Use hardware that supports VMware vStorage APIs for
Array Integration (VAAI)
o Use fully redundant Storage Network (NICs, HBAs, Switches,
Front-End Storage Ports, etc.)
o Enable Read/Write Caching on Storage
Server BIOS Settings:
o Use latest version available
o Enable Turbo Boost
o Enable Hyper-Threading
o Disable Node Interleaving (Enable NUMA)
o Enable ALL Hardware-Assisted Virtualization Features (see CPU
section above)
o Disable Cache Prefetching Mechanisms
o Disable unused hardware (see Recommendations section below)
HOST CONFIGURATION
Disconnect and/or disable ALL unused and unnecessary system
devices, including:
o Floppy Drives
o COM Ports
o LPT Ports
o CD-ROM Drives
o USB Adapters
o Network Interfaces
o Storage Controllers
NOTE: Disabling some devices can be complicated and may cause
other problems, so thorough testing of specific changes is
recommended.
Use separate virtual switches and physical network adapters for
host management
(VMkernel) and Virtual Machine networks.
Use a single vSwitch to optimize internal communication between
Enterprise EHR
VMs.
Figure 1.1: ESXi Networking
Virtualization technology requires resource overhead to manage
the VMs; therefore, leave at least 4 GB of RAM for the physical
host.
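As a rough illustration of the 4 GB guideline, the following sketch checks whether a planned set of VM memory sizes still leaves the reserve for the host. The function names and the sample sizes are hypothetical, and the sketch ignores per-VM overhead memory, which a real capacity plan should also account for.

```python
from typing import Iterable

HOST_RESERVED_GB = 4  # minimum RAM left for the physical host, per the guideline


def allocatable_vm_memory_gb(host_ram_gb: float,
                             reserved_gb: float = HOST_RESERVED_GB) -> float:
    """Return the RAM (in GB) that may be assigned to VMs after the host reserve."""
    if host_ram_gb <= reserved_gb:
        raise ValueError("host RAM does not exceed the host reserve")
    return host_ram_gb - reserved_gb


def fits_on_host(vm_memory_gb: Iterable[float], host_ram_gb: float) -> bool:
    """True when the summed VM memory leaves at least the reserve for the host."""
    return sum(vm_memory_gb) <= allocatable_vm_memory_gb(host_ram_gb)
```

For example, `fits_on_host([64, 60], 128)` holds, while adding one more gigabyte to either VM would cut into the host reserve.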
For hosts supporting Enterprise EHR VMs, set the Power Policy
Option to High
Performance.
Figure 1.2: ESXi Power Management Settings
For best performance and availability, the use of Host Clusters
with HA and DRS is recommended. For DRS, use at least the
Partially Automated setting. For production-level systems, be
cautious about using the Fully Automated setting, as it may cause
undesired migrations of the VMs.
Figure 1.3: Cluster Settings
Place the VMs in a Resource Pool with appropriate resource
reservations.
VMs should be located on optimized shared storage. The
optimizations include using multiple HBAs, high-speed disks, and
high-speed, uncongested data networks.
The settings of the HBAs in your ESXi hosts may need to be
adjusted to optimize
their performance. In general, the default settings should be
used unless changes
are recommended by the documentation for your specific SAN
storage and switches.
VMware recommends changing the Disk.SchedNumReqOutstanding
Setting on your
ESXi hosts to match the Maximum Queue Depth of the HBAs. QLogic
has a default
Queue Depth of 32 and Emulex uses 30. Reference VMware knowledge
base article
1267. Evaluating your specific environment is recommended. Based
on the results
of your testing, you may see a benefit by increasing the values
to 64.
Figure 1.4: ESXi Software Advanced Settings
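The queue-depth matching described above can be sketched as a small lookup. The default depths are the ones quoted in the text; the function names are hypothetical, and the `esxcfg-advcfg` command string is an assumption about how the setting is applied from the host shell, so verify it against your ESXi build before use.

```python
# Default HBA queue depths quoted in the text above.
DEFAULT_HBA_QUEUE_DEPTH = {"qlogic": 32, "emulex": 30}


def sched_num_req_outstanding(hba_vendor, tuned_queue_depth=None):
    """Return the value for Disk.SchedNumReqOutstanding.

    The setting should match the HBA's maximum queue depth. If the HBA queue
    depth has been tuned (for example to 64 after testing), pass it explicitly.
    """
    if tuned_queue_depth is not None:
        return tuned_queue_depth
    return DEFAULT_HBA_QUEUE_DEPTH[hba_vendor.lower()]


def advcfg_command(value):
    # Assumption: the host accepts this esxcfg-advcfg form for the setting.
    return f"esxcfg-advcfg -s {value} /Disk/SchedNumReqOutstanding"
```

The sketch only builds the command text; it does not connect to a host or change any setting.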
VIRTUAL MACHINE CONFIGURATION
Virtual machines (VMs) must meet normal Enterprise EHR
configuration standards: OS level, Service Packs, Hot Fixes,
Application versions, etc.
The latest version of VMware Tools must be installed in the VMs.
The VMware Tools
package provides optimized Device Drivers and management
features that improve
performance and reliability.
Disconnect and/or disable ALL unused and unnecessary system
devices, including:
o Floppy Drives
o COM Ports
o LPT Ports
o CD-ROM Drives
o USB Adapters
Figure 1.6: Virtual Machine BIOS
VMs should use only version 8 virtual hardware; specifically:
VMXNET 3 network
adapter and Paravirtual SCSI Controller.
Figure 1.7: Virtual Machine Hardware
For I/O intensive VMs, especially SQL Servers, Allscripts
recommends spreading the
disk I/O across 3 or 4 Paravirtual SCSI controllers (see above
screenshot).
When VMs with more than eight vCPUs are used, Virtual NUMA is
enabled, so make sure that the total vCPU count of the VM is a
multiple of the cores per NUMA node on the physical server.
(NOTE: Some multi-core processors have NUMA
node sizes
that are different than the number of cores per socket. For
example, some 12-core
processors have two six-core NUMA nodes per processor.)
Figure 1.8: Virtual NUMA Settings
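The vCPU sizing rule above reduces to a divisibility check. This is a minimal sketch with hypothetical names; the cores-per-NUMA-node value must come from the physical server, because it may differ from the cores-per-socket count.

```python
VNUMA_VCPU_THRESHOLD = 8  # the rule applies to VMs with more than eight vCPUs


def vcpu_count_is_numa_aligned(vcpus, cores_per_numa_node):
    """True when the vCPU count satisfies the Virtual NUMA sizing rule."""
    if vcpus <= VNUMA_VCPU_THRESHOLD:
        return True  # the guideline only applies above the threshold
    return vcpus % cores_per_numa_node == 0
```

On a host with six-core NUMA nodes, a 12-vCPU VM passes the check and a 10-vCPU VM does not.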
In systems with under-committed resources, the ESXi CPU
scheduler spreads load across all sockets by default (if NUMA is
disabled). For VMs that exhibit significant data sharing between
vCPUs (that is, they benefit from a shared cache), you can force
the virtual CPUs to always be scheduled on the same socket. To do
so, add the following option to the VM's .vmx configuration
file: sched.cpu.vsmpConsolidate=TRUE. If NUMA is enabled, the CPU
scheduler restricts the CPUs to the same socket.
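As an illustration of the .vmx change, the sketch below sets an option in the text of a configuration file, replacing an existing entry or appending a new one. The helper name is hypothetical, and the `key = "value"` line format is an assumption based on how .vmx files are commonly written. Edit the file only while the VM is powered off, and keep a backup.

```python
def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with key set to value (replace if present, else append)."""
    new_line = f'{key} = "{value}"'
    lines = vmx_text.splitlines()
    for i, line in enumerate(lines):
        name = line.split("=", 1)[0].strip()
        if name.lower() == key.lower():
            lines[i] = new_line  # overwrite the existing entry
            break
    else:
        lines.append(new_line)  # no existing entry was found
    return "\n".join(lines) + "\n"
```

Applying the function twice with the same arguments leaves the text unchanged, so it is safe to re-run.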
Enterprise EHR Virtual Machines should always be configured using
Thick Provisioned Eager-zeroed disks.
Figure 1.9: Virtual Disk Thick Provision Eager Zeroed
GUEST OS SETTINGS
The following settings are recommended for optimal
performance:
Disable Screensavers and Windows animation in ALL VMs.
Disable IE Enhanced Security Configuration
Disable User Account Control
Disable Write Debugging Information on System Failure
Set Power Plan to High Performance
Disable the following Scheduled Tasks:
\Microsoft\Windows\Defrag\ScheduledDefrag
\Microsoft\Windows\Registry\RegIdleBackup
\Microsoft\Windows\Time Synchronization\SynchronizeTime
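One possible way to script the scheduled-task changes listed above is to generate the corresponding Windows `schtasks` commands. The sketch below only builds the command strings and does not run them; the helper name is hypothetical, and the commands should be run from an elevated prompt in the guest after testing.

```python
# Task paths taken from the list above.
TASKS_TO_DISABLE = [
    r"\Microsoft\Windows\Defrag\ScheduledDefrag",
    r"\Microsoft\Windows\Registry\RegIdleBackup",
    r"\Microsoft\Windows\Time Synchronization\SynchronizeTime",
]


def disable_task_command(task_path):
    """Build the schtasks command line that disables one scheduled task."""
    return f'schtasks /Change /TN "{task_path}" /Disable'


commands = [disable_task_command(path) for path in TASKS_TO_DISABLE]
```

The task path is quoted because one of the paths contains a space.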
STORAGE
The storage requirements and design for the SQL Database Server
used for Enterprise EHR Applications have numerous caveats that
depend on the unique characteristics of a given customer's
environment. First, a brief overview of the storage architecture
supported by VMware is needed.
The operating system, applications and user data of a virtual
machine are kept in one or more virtual SCSI disks. These virtual
disk files (or VMDKs) are typically maintained in a VMFS
datastore connected to a physical storage subsystem: Direct
Attached Storage in a host, Fibre Channel SAN, iSCSI SAN, or NAS.
VMware also supports the use of Raw Device Mapping (RDM), which
allows the VM to have direct access to a LUN on the storage
(Fibre Channel or iSCSI only).
In vSphere 5.0, a VMFS datastore can be a maximum of 64TB in
size but any single
VMDK file can only be 2TB minus 512 bytes. RDMs in physical
compatibility mode
can be up to 64TB in size.
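The size limits quoted above can be written out as constants to check which disk type can hold a given volume. This is a minimal sketch with hypothetical names; it encodes only the vSphere 5.0 figures stated in this document.

```python
TB = 2**40  # bytes per terabyte (binary)

MAX_VMDK_BYTES = 2 * TB - 512       # single VMDK file limit in vSphere 5.0
MAX_PHYSICAL_RDM_BYTES = 64 * TB    # RDM in physical compatibility mode


def storage_options_for(volume_bytes):
    """Return the disk types that can hold a volume of the given size."""
    options = []
    if volume_bytes <= MAX_VMDK_BYTES:
        options.append("vmdk")
    if volume_bytes <= MAX_PHYSICAL_RDM_BYTES:
        options.append("physical-rdm")
    return options
```

A volume of exactly 2 TB already exceeds the VMDK limit by 512 bytes, so only a physical-mode RDM remains as an option.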
For overall best performance, Allscripts recommends using only
RDMs (physical compatibility mode) for the LUNs storing the
Enterprise EHR Application SQL data for the following reasons:
o RDMs allow the use of SAN-based snapshots and/or copies.
o RDMs are required if you are leveraging Microsoft Failover
Clustering, which needs shared volumes.
o Individual RDMs can be up to 64TB in size, so the growth of your
database environment can be easily accommodated.
o RDMs are easier to use for migrations from physical to virtual
systems.
VMOTION
As stated by VMware:
Consider using a 10GbE vMotion network. Using a 10GbE network in
place of a 1GbE network for vMotion will result in significant
improvements in vMotion
performance. When using very large virtual machines (for
example, 64GB or
more), consider using multiple 10GbE network adaptors for
vMotion to further
improve vMotion performance.
When configuring resource pools, plan to leave at least 10% of
the CPU capacity unreserved. CPU reservations that fully commit the
capacity of the cluster can
prevent DRS from migrating virtual machines between hosts.
When using the multiple-network adaptor feature, configure all
the vMotion vmnics under one vSwitch and create one vMotion vmknic
for each vmnic. In the
vmknic properties, configure each vmknic to leverage a different
vmnic as its
active vmnic, with the rest marked as standby. This way, if any
of the vMotion
vmnics become disconnected or fail, vMotion will transparently
switch over to one
of the standby vmnics. When all your vmnics are functional,
though, each vmknic
will route traffic over its assigned, dedicated vmnic.
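The active/standby layout described above can be sketched as a simple mapping: one vmknic per vmnic, each with its own vmnic active and all the others on standby. The vmknic labels generated here are hypothetical placeholders, and the sketch only produces a plan; it does not configure a vSwitch.

```python
def vmotion_teaming_plan(vmnics):
    """Map one vMotion vmknic to each vmnic: own vmnic active, the rest standby."""
    plan = {}
    for index, active in enumerate(vmnics):
        standby = [nic for nic in vmnics if nic != active]
        # Hypothetical vmknic label; use your own naming convention.
        plan[f"vmotion-vmk{index}"] = {"active": active, "standby": standby}
    return plan
```

With two uplinks, `vmotion_teaming_plan(["vmnic2", "vmnic3"])` yields two vmknics whose active and standby adapters mirror each other.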
TROUBLESHOOTING
Virtual Machine
A couple of basic items to validate for a VM are that VMware
Tools is installed and running and that the VM is part of an HA
cluster.
Figure 1.10: Virtual Machine General Settings
The summary screen of a VM gives a basic summary of its
performance, including CPU and Memory usage. If Consumed Host CPU
or Active Guest Memory is sustained at a level close to the VM's
configured quantity, then further investigation is warranted.
Figure 1.11: Virtual Machine Resource Consumption
The Resource Allocation Tab is a graphical view of the VM's
resource utilization.
Figure 1.12: Virtual Machine Resource Consumption
The Performance Tab of a VM provides real-time and historical
data about the usage of all resources: CPU, Memory, Disk and
Network. The default view provides a 1 Day Summary, which gives
you a good overview of the VM's health. At the bottom of the
default view is a 1 Day Summary of the host the VM resides on,
for comparison. You should develop baseline numbers for each
type of server that you manage so that you can better identify
abnormal values.
NOTE: Unlike on physical servers, high CPU utilization (70% -
80%) in a virtual server is normal and desired. High Memory
utilization is not an issue as long as it is not causing the
Guest OS to page the memory contents. You have to balance good
resource utilization with good performance.
ESXi Host
The summary screen of a host gives a basic summary of its
performance.
Figure 1.13: ESXi Host Summary Tab
The Performance tab of a host is a good starting point for
reviewing system
performance given a desired time range.
Figure 1.14: ESXi Host Performance Tab
By using the Hardware Status Tab, the host's physical resources
can be reviewed for issues.
Figure 1.15: ESXi Host Hardware Status Tab
DRS and/or HA Cluster
At the cluster level, the Hosts tab gives a good summary view of
the resource
consumption.
Figure 1.16: vSphere Cluster Hosts Tab
If a DRS Cluster is set to Fully Automated, monitor the value of
Total Migrations using vMotion. A high number may indicate a
performance hit due to atypically high
numbers of VM migrations because the cluster is attempting to
balance its resources.
Figure 1.17: vSphere Cluster Summary Tab
Datastores
The Datastores tab of the vSphere Datacenter provides an
overview of space capacity and consumption. VMware's best
practice recommendation is to limit each datastore to 80%
utilization.
Figure 1.18: vSphere Datacenter Datastores Tab
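The 80% datastore guideline can be checked with a short calculation once capacity and free-space figures have been read from the Datastores tab. The function names and the sample datastores in the usage note are hypothetical.

```python
DATASTORE_UTILIZATION_LIMIT = 0.80  # recommended ceiling per datastore


def datastore_utilization(capacity_gb, free_gb):
    """Fraction of the datastore that is currently consumed."""
    return (capacity_gb - free_gb) / capacity_gb


def datastores_over_limit(datastores, limit=DATASTORE_UTILIZATION_LIMIT):
    """Return the names of datastores above the limit.

    datastores maps a name to a (capacity_gb, free_gb) tuple.
    """
    return sorted(
        name
        for name, (capacity_gb, free_gb) in datastores.items()
        if datastore_utilization(capacity_gb, free_gb) > limit
    )
```

For instance, a 1000 GB datastore with 150 GB free is at 85% and would be reported, while one with 300 GB free would not.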
NOTE: For further troubleshooting guidance, please refer to
VMware's website and documentation.
ADVANCED TROUBLESHOOTING
Check for Resource Pool CPU Saturation
Select a Resource Pool; Use the Summary Tab to determine the CPU
limit:
Figure 1.19: Performance Troubleshooting Resource Pool CPU
Limit
Select the Performance Tab; Select the Advanced option; Switch
view to CPU; Select
Usage in MHz Counter; Select all CPU objects.
Figure 1.20: Performance Troubleshooting Resource Pool CPU
Saturation
Compare the Usage in MHz value to the CPU Limit setting on the
Resource Pool. If the values are close, the pool may be
experiencing CPU saturation, and additional resources should be
allocated to the pool.
If the performance problem is specific to a VM in the Resource
Pool, use that VM in
the following steps. If not, repeat the steps for all the VMs in
the Resource Pool.
Select the VM; Select the Performance Tab; Select the Advanced
option; Switch view
to CPU; Select Usage in MHz Counter for the VM object.
If the Average value is greater than 85% and peaks above 90-95%,
then CPU
Saturation is an issue.
Figure 1.21: Performance Troubleshooting VM CPU Ready
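The VM-level saturation test above can be expressed as a check over exported Usage in MHz samples. This sketch assumes the percentages are measured against the VM's configured CPU capacity in MHz, which the text does not state explicitly, and it uses 90% (the lower end of the quoted 90-95% range) as the peak threshold. All names are hypothetical.

```python
def cpu_saturated(usage_mhz_samples, capacity_mhz,
                  avg_threshold=85.0, peak_threshold=90.0):
    """True when average usage exceeds 85% and the peak exceeds 90% of capacity."""
    percents = [100.0 * sample / capacity_mhz for sample in usage_mhz_samples]
    average = sum(percents) / len(percents)
    return average > avg_threshold and max(percents) > peak_threshold
```

Both conditions must hold: a VM with a single short spike but a low average is not flagged, and neither is a VM that runs steadily just above 85% without peaking.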
Check for an Overloaded Storage Device
Select a Host; Select the Performance Tab; Select the Advanced
option; Switch view
to Disk; Select Commands Terminated Counter; Select all
Datastore objects.
Any value other than zero indicates an issue with the storage
device.
Figure 1.22: Performance Troubleshooting Disk Saturation
Using ESXTOP
ESXTOP is a command line utility used to get real-time
performance statistics of a
given ESXi Host. You must connect to the host using SSH (use
PuTTY or other SSH
friendly client); however, with VMware ESXi 5.0 or later, SSH is
disabled by default
and must be manually enabled.
Select a host; Select the Configuration Tab; Under Software,
Select Security Profile;
Under Services, Select Properties; Select SSH, Select the
Options Button; Under
Services Commands, Select Start to enable SSH; Select OK.
Figure 1.23: ESXi Host Configuration Enabling SSH
Once SSH has been started, you can connect to the host and run
ESXTOP.
ESXTOP CPU
NOTE: All ESXTOP Key commands are case sensitive.
The starting screen for ESXTOP is the CPU utilization panel. You
can press V to show only VMs instead of all processes.
Figure 1.24: ESXTOP CPU Utilization
Examine the PCPU UTIL(%) line for an unequal load across
processor cores, with some at saturation and some remaining near
idle. This may indicate that applications within the VM are not
utilizing all of the cores provided to them.
Examine the %RDY field for the percentage of time that a virtual
machine was ready
but could not get scheduled to run on a physical CPU. This value
should remain
below 5%. Anything greater indicates a problem at the host level
- such as not
enough resources available.
Examine the %USED field for the percentage of physical CPU
resources used by a
vCPU. If the physical CPUs are running near or at full capacity
then ensure that the
CPU utilization per vCPU is less than 80%.
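The two CPU thresholds above can be applied to values read from the ESXTOP CPU panel. This is a minimal sketch: the function name and finding labels are hypothetical, the values are assumed to be per-vCPU percentages entered by hand, and whether the host is near capacity is left to the operator's judgment.

```python
RDY_LIMIT_PERCENT = 5.0    # %RDY should remain below this value
USED_LIMIT_PERCENT = 80.0  # per-vCPU %USED ceiling when the host is near capacity


def review_cpu_sample(rdy_percent, used_percent, host_near_capacity=False):
    """Return a list of findings for one vCPU sample."""
    findings = []
    if rdy_percent >= RDY_LIMIT_PERCENT:
        findings.append("high-ready")
    if host_near_capacity and used_percent >= USED_LIMIT_PERCENT:
        findings.append("high-used")
    return findings
```

An empty list means neither guideline is violated for that sample.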
ESXTOP Memory
To access the Memory utilization panel press m. You can press V
to show only VMs instead of all processes.
Figure 1.25: ESXTOP Memory Utilization
Examine the MEMSZ field for the amount of physical memory
allocated to the VM.
Examine the SZTGT field for the amount of memory the ESXi
VMkernel wants to
allocate to the VM.
Examine the SWCUR field for the amount of memory in Megabytes
currently being
swapped. This value should always be zero to maintain optimal
performance.
ESXTOP Network
To access the network utilization panel press n.
Figure 1.26: ESXTOP Network Utilization
Examine the %DRPTX and %DRPRX fields, which indicate dropped
packets. If high dropped-packet values are consistent, thoroughly
review the network configuration of all VMs and especially the
hosts.
ESXTOP Storage Adapters
To access the storage adapter utilization panel press d. Press f
to add fields; Press j to add Error Stats.
Figure 1.27: ESXTOP Storage Adapter Utilization
Ideally, the DAVG/cmd (device latency) and GAVG/cmd (VM latency)
fields should be 5ms or less; values greater than 20ms may
indicate a bottleneck at the switch or SAN. The KAVG/cmd field
should always be zero; high values indicate an issue with a
device driver and/or with device queue depth. Examine the FCMDs/s
field for any failed commands, which may indicate queue
saturation or hardware issues.
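The latency guidance above can be summarized as a small classifier over values read from the storage adapter panel. The function name and the finding texts are hypothetical. The sketch treats any non-zero KAVG/cmd as worth a look, which is a strict reading of "should always be zero".

```python
def review_adapter_latency(davg_ms, gavg_ms, kavg_ms, failed_cmds_per_s=0.0):
    """Return findings for one storage adapter sample (latencies in ms)."""
    findings = []
    worst = max(davg_ms, gavg_ms)
    if worst > 20:
        findings.append("possible switch or SAN bottleneck")
    elif worst > 5:
        findings.append("latency above the 5 ms target")
    if kavg_ms > 0:
        findings.append("check device driver and queue depth")
    if failed_cmds_per_s > 0:
        findings.append("failed commands: possible queue saturation or hardware issue")
    return findings
```

Values between 5 ms and 20 ms are reported separately from values above 20 ms, since the text treats the former as less than ideal and the latter as a likely bottleneck.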
ESXTOP SCSI Queue Depth
To access the disk device utilization panel press u.
Figure 1.28: ESXTOP Disk Device Utilization
The ACTV field is the current commands in queue; a metric of
less than 20 is
excellent. The QUED field is commands waiting to process; any
value over zero is
unhealthy.
ESXTOP Virtual Machine Storage
To access the virtual machine storage utilization panel press v.
Figure 1.29: ESXTOP Virtual Machine Storage Utilization
Examine the LAT/rd & LAT/wr fields for values greater than
5ms which may indicate
a disk configuration issue.
RESOURCES
VMware Web Site:
http://www.vmware.com
VMware vSphere 5.0 Documentation:
http://www.vmware.com/support/pubs/vsphere-esxi-vcenter-server-pubs.html
Performance Best Practices for VMware vSphere 5.0:
http://www.vmware.com/resources/techresources/10220
VMware vSphere vMotion Architecture, Performance and Best
Practices in VMware vSphere 5:
http://www.vmware.com/files/pdf/vmotion-perf-vsphere5.pdf
VMware vSphere 5.0 Troubleshooting Guide:
http://pubs.vmware.com/vsphere-50/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-501-troubleshooting-guide.pdf