Virtual Systems Monitoring and Capacity Planning – An Update
Demand Technology Software, Inc.
Demand Technology Windows Symposium CMG – 12/05/2005
Phil Henninge
Demand Technology Software, 1020 Eighth Avenue South, Suite 6, Naples, FL 34102
phone: (239) 261-8945 fax: (239) 261-5456
e-mail: [email protected]
http://www.demandtech.com
At the system level we look at the system resources:
- CPU utilization
- Memory utilization (memory consumption and paging)
- Disk utilization
- Network utilization (NIC traffic and topology)
At the software level we look at specific objects:
- Process (what are the VMware-specific and Microsoft-specific processes)
- Network Interface (what virtual network adapters are defined)
- Other performance objects
Virtual Processors Object (Virtual PC): one instance for each virtual machine
- Guest External Interrupts: number of virtual interrupts delivered to the guest OS.
- Host-to-VMM Context Switches: number of context switches between Windows and the guest (VMM) context.
- Cumulative Guest Run Time: the number of microseconds the guest processor has run on a host processor. With the default scaling, the graph represents guest run time percentage.
- VMM Exceptions: number of processor exceptions handled by the VMM.
HALT/Idle Loop Measurement Anomaly
When a machine is idle, its operating system will either issue a HALT instruction or repeatedly execute an idle loop of NOP instructions.
- Idle loop is the default for most server machines
- Idle loop is a function contained in hal.dll
When a virtual machine executes an idle loop, it is actively executing instructions which run on the host machine’s physical processor. Thus performance tools in the guest machine will show inactivity, while the host machine will appear fully utilized.
A virtual machine running a Windows operating system with the wrong HAL (Hardware Abstraction Layer) installed will spin in its idle loop instead of issuing a HALT when there is nothing else to do.
Sizing destination servers requires first understanding the performance of the applications running on the source servers.
The VM host machine must contain sufficient capacity (processor, memory, disk and network) to handle the peak loads of guest machines:
- accumulate measurement data over long-term periods that include seasonal peaks
- compute Peak:Average ratios and understand when peak periods occur to ensure they do not overlap on the same host
- compute 90-95th percentiles
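The peak statistics listed above can be sketched in a few lines of Python. This is only an illustration: the sample values are made up, the nearest-rank percentile method is one of several common definitions, and treating "at or above the 90th percentile" as a peak period is an assumption made here, not something the slides specify.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def peak_to_average(samples):
    """Peak:Average ratio of a series of utilization samples."""
    return max(samples) / (sum(samples) / len(samples))

def peak_hours(samples, pct=90):
    """Indexes of samples at or above the pct-th percentile (assumed definition of a peak period)."""
    threshold = percentile(samples, pct)
    return {hour for hour, value in enumerate(samples) if value >= threshold}

# Hypothetical % Processor Time samples, one per interval, for two source servers.
server_a = [12, 15, 11, 14, 40, 85, 90, 60, 30, 18, 13, 12]
server_b = [70, 80, 75, 20, 15, 10, 12, 11, 14, 16, 18, 60]

print("Peak:Average A:", round(peak_to_average(server_a), 2))
print("90th/95th percentile A:", percentile(server_a, 90), percentile(server_a, 95))
print("Overlapping peak intervals:", peak_hours(server_a) & peak_hours(server_b))
```

An empty overlap set suggests the two servers peak at different times and are better candidates for sharing a host than two servers whose peak intervals coincide.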
The processor requirements of a source server should not exceed the processor capacity available to a virtual machine on the destination server. Normalize based on MHz:
CPU requirements = number of CPUs x CPU speed x CPU utilization
The % Processor Time for all virtual machines running on a destination server should be less than 90% of the available CPU capacity; 10% is reserved for the host OS and for I/O on behalf of the virtual machines.
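As a rough worked example of the MHz normalization and the 90% ceiling, the sketch below sums the normalized demand of several source servers and compares it with a candidate host. The server figures, the function names, and the choice to express utilization as a fraction between 0 and 1 are all assumptions made for illustration.

```python
def cpu_requirement_mhz(num_cpus, cpu_speed_mhz, utilization):
    """CPU requirements = number of CPUs x CPU speed x CPU utilization (utilization as 0.0-1.0)."""
    return num_cpus * cpu_speed_mhz * utilization

def fits_on_host(guest_demands_mhz, host_cpus, host_speed_mhz, usable_fraction=0.90):
    """True if total guest demand stays below the usable share of host CPU capacity."""
    usable_capacity = host_cpus * host_speed_mhz * usable_fraction
    return sum(guest_demands_mhz) < usable_capacity

# Hypothetical source servers: (number of CPUs, MHz per CPU, peak utilization).
sources = [(2, 1000, 0.30), (1, 2000, 0.50), (4, 500, 0.25)]
demands = [cpu_requirement_mhz(*server) for server in sources]

print("Demand per source server (MHz):", demands)
print("Total demand (MHz):", sum(demands))
print("Fits on a 2 x 3000 MHz host:", fits_on_host(demands, 2, 3000))
```

The utilization fed into the formula should come from the peak or high-percentile measurements rather than the average, since the host has to absorb the peaks.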
The total amount configured for all virtual machines cannot exceed the size of physical RAM.
Guest Memory = sizeof(RAM) – Available Bytes (95th percentile)
Every virtual machine requires an additional 32 MB of physical memory
The host operating system requires exclusive use of at least 384 MB of memory.
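The memory rules above can be combined into a simple sizing check. This is a sketch under stated assumptions: the guest figures are invented, the function names are hypothetical, and adding the per-VM overhead and the host reservation into a single total is one reading of the slides rather than a documented formula.

```python
VM_OVERHEAD_MB = 32      # additional physical memory required per virtual machine
HOST_RESERVED_MB = 384   # minimum memory reserved exclusively for the host OS

def guest_memory_mb(ram_mb, available_mb):
    """Guest Memory = sizeof(RAM) - Available Bytes, using the measured Available Bytes statistic."""
    return ram_mb - available_mb

def host_memory_needed_mb(guest_sizes_mb):
    """Guest memory plus per-VM overhead plus the host OS reservation."""
    return sum(size + VM_OVERHEAD_MB for size in guest_sizes_mb) + HOST_RESERVED_MB

# Hypothetical source servers: (installed RAM in MB, Available Bytes statistic in MB).
sources = [(2048, 1200), (1024, 300), (4096, 2500)]
guests = [guest_memory_mb(ram, available) for ram, available in sources]
needed = host_memory_needed_mb(guests)

print("Guest memory sizes (MB):", guests)
print("Host memory needed (MB):", needed)
print("Fits in a 4096 MB host:", needed <= 4096)
```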
The following are best practices for performance optimization on virtual hard disks:
- Use a hard disk solution that allows fast access, such as a locally-attached SCSI hard disk, RAID, or SAN.
- Put each virtual hard disk on a dedicated volume, SCSI hard disk, RAID, or SAN disk. It is easiest to put virtual hard disks together with their associated virtual machine configuration files on a RAID or SAN because this keeps everything in one place.
- Reduce disk fragmentation. As a dynamically expanding virtual hard disk increases in size, it becomes increasingly fragmented. You can defragment the host operating system to make the virtual hard disk more contiguous. If disk performance is important, consider doing this. Fixed size virtual hard disks are allocated a contiguous block of reserved space on the physical hard disk. Therefore, there is no overhead created by the growing disk.
- Compact the virtual hard disks to create more physical disk space.
Windows keeps track of time by counting timer interrupts or timer ticks. When the operating system starts up, it reads the current time to the nearest second from the computer's battery-backed (CMOS) real time clock or queries a network time server to obtain a more precise time.
To update the time from that point on, the operating system sets up one of the computer's hardware timekeeping devices to interrupt periodically at a known rate (say, 100-200 times per second).
This timekeeping mechanism is known either as the periodic interrupt or, in Windows, the quantum.
Using a hardware interrupt to track time leads to problems in the guest virtual machine: at the moment a virtual machine should generate a timer interrupt, it may not actually be running. In fact, the virtual machine may not get a chance to run again until it has accumulated a backlog of many timer interrupts.
Timer interrupts queued up for a single timer device cause a scalability issue as more and more virtual machines are run on the same physical machine.
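To get a feel for the size of the backlog, the arithmetic can be sketched as follows. The 100 Hz tick rate is taken from the range quoted earlier; the 250 ms wait per virtual machine and the function names are hypothetical values chosen only to make the numbers concrete.

```python
def backlog_ticks(descheduled_ms, tick_rate_hz):
    """Whole timer interrupts that come due while a virtual machine is not running."""
    return descheduled_ms * tick_rate_hz // 1000

def clock_lag_ms(pending_ticks, tick_rate_hz):
    """How far the guest clock trails real time until the pending ticks are delivered."""
    return pending_ticks * 1000 / tick_rate_hz

TICK_RATE_HZ = 100                 # assumed guest timer rate (interrupts per second)
waits_ms = [250, 250, 250, 250]    # hypothetical time each of four guests spent waiting to run

per_vm = [backlog_ticks(wait, TICK_RATE_HZ) for wait in waits_ms]

print("Pending ticks per VM:", per_vm)
print("Pending ticks across the host:", sum(per_vm))
print("Guest clock lag (ms):", clock_lag_ms(per_vm[0], TICK_RATE_HZ))
```

The total grows with both the number of guests and the time each one waits for a physical processor, which is why the backlog becomes a scalability concern as more virtual machines share a host.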
What does this mean for Windows analysts?
“Microsoft Windows has an additional time measurement feature accessed through the QueryPerformanceCounter system call. This name is a misnomer, since the call never accesses the CPU's performance counter registers. Instead, it reads one of the timer devices that have a counter, allowing time measurement with a finer granularity than the interrupt-counting system time of day clock.”
Description - This counter type shows the active time of a component as a percentage of the total elapsed time of the sample interval. It measures time in units of 100 nanoseconds. Counters of this type are designed to measure the activity of one component at a time.
Formula - (N1 - N0) / (D1 - D0) x 100, where the denominator (D) represents the total elapsed time of the sample interval, and the numerator (N) represents the portions of the sample interval during which the monitored components were active.
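The formula can be applied to two raw samples as in the sketch below. The description appears to match the Windows counter type PERF_100NSEC_TIMER, but that name is an inference and does not appear in this transcript. The sample values are invented, and the second calculation, where the guest's elapsed-time denominator undercounts real time, is an illustrative assumption about how a lagging guest clock could distort the result.

```python
TICKS_PER_SECOND = 10_000_000  # one tick = 100 nanoseconds

def percent_active(n0, n1, d0, d1):
    """(N1 - N0) / (D1 - D0) x 100, with all values in 100 ns units."""
    elapsed = d1 - d0
    if elapsed <= 0:
        raise ValueError("the second sample must be later than the first")
    return (n1 - n0) / elapsed * 100

# Hypothetical samples taken one second apart; the component was active for 0.25 s.
busy = percent_active(n0=0, n1=2_500_000, d0=0, d1=TICKS_PER_SECOND)
print("Percent active:", busy)

# Same activity, but the guest's clock advanced only 0.8 s during the interval.
skewed = percent_active(n0=0, n1=2_500_000, d0=0, d1=8_000_000)
print("Percent active with a lagging clock:", skewed)
```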
Let's look at another counter: PERF_PRECISION_100NS_TIMER
Description - This counter type shows a value that consists of two counter values: the count of the elapsed time of the event being monitored, and the "clock" time from a private timer in the same units. It measures time in 100 nanosecond units. This counter type differs from other counter timers in that the clock tick value accompanies the counter value eliminating any possible difference due to latency from the function call. Precision counter types are used when standard system timers are not precise enough for accurate readings.
Formula - (N1 - N0) / (D1 - D0), where the numerator (N) represents the counter value, and the denominator (D) is the value of the private timer. The private timer has the same frequency as the 100 nanosecond timer.
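A small sketch of the precision-timer calculation follows. The point it illustrates is that each sample carries its own timer value, so the counter and the clock are always read as a pair. The PrecisionSample type, the function name, and the sample values are hypothetical; the result is returned as a fraction, and scaling it by 100 for display as a percentage is an assumption.

```python
from typing import NamedTuple

class PrecisionSample(NamedTuple):
    counter: int  # N: accumulated active time, in 100 ns units
    timer: int    # D: private timer value captured with the counter, in 100 ns units

def precision_ratio(first, second):
    """(N1 - N0) / (D1 - D0) using the private timer stored in each sample."""
    elapsed = second.timer - first.timer
    if elapsed <= 0:
        raise ValueError("the private timer did not advance between samples")
    return (second.counter - first.counter) / elapsed

# Hypothetical pair of samples: 0.3 s of activity over 1 s of private-timer time.
first = PrecisionSample(counter=1_000_000, timer=50_000_000)
second = PrecisionSample(counter=4_000_000, timer=60_000_000)

ratio = precision_ratio(first, second)
print("Active fraction:", ratio)
print("As a percentage:", ratio * 100)
```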