Virtualization CS623 11/8/2006 Caution: Still a new topic for me as well. Note: these slides draw, sometimes verbatim, on the papers cited on the next slide.
Page 1: Slides on Virtualization and Xen.

Virtualization

CS623

11/8/2006

Caution: Still a new topic for me as well.

Note: these slides draw, sometimes verbatim, on the papers cited on the next slide.

Page 2: Slides on Virtualization and Xen.

Xen and the Art of Virtualization

Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield, SOSP 2003.

Page 3: Slides on Virtualization and Xen.

Overview

• What is virtualization?

• Why would you do it?

• Why is it important?

Page 4: Slides on Virtualization and Xen.

What is virtualization?

• Used to present the illusion of many smaller virtual machines (VMs), each running a separate operating system instance.

– Run multiple XPs on XP.

– Run Linux, Solaris, XP on XP.

– Run XP on Linux.

– Etc.

Page 6: Slides on Virtualization and Xen.

Why would you do it?

• Single user: You have an XP box and you want to run Linux.

• Single user: You don’t trust security of one OS or of some applications and you’d like to “wall it off.”

• Miroslav Ponec: Run Linux on a new laptop in VMware to avoid driver problems.

• Enterprise manager: You have lots of boxes that sit idle a lot of the time. If you multiplex them you can save on hardware and other costs.

• Shared grid/resource: host multiple (untrusted?) applications and servers on a shared machine.

Page 7: Slides on Virtualization and Xen.

Why VMs?

• Number of ways to build a system to host multiple applications and servers on a shared machine.

– Deploy hosts running standard OS and allow users to install files and start processes, with protection between processes provided by standard OS techniques.

• System administration is challenging due to complex configuration interactions.

• No adequate support for performance isolation: scheduling priority, memory demand, network traffic and disk accesses of one process impact the performance of others.

– Possible solution: retrofit support for performance isolation to the operating system.

• Hard to ensure that all resource usage is accounted to the correct process: buffer cache and page replacement algorithms create complex interactions.

Page 8: Slides on Virtualization and Xen.

Flashback!

• Heck, we had trouble getting the Operating System to properly account for resource usage of different threads in one server.

Page 9: Slides on Virtualization and Xen.

VM approach

• Multiplex physical resources at the granularity of an entire operating system and provide performance isolation.

• Price paid:

– More heavyweight in terms of initialization and resource consumption.

– Xen: for a target of up to 100 hosted OS instances, the price is worth paying. Enhanced flexibility, avoids configuration interactions (e.g. Windows registry).

Page 10: Slides on Virtualization and Xen.

A little hype from the media …

Page 11: Slides on Virtualization and Xen.

eweek on Virtualization 11/25/05

• IT departments are doing this to try to find "ways to use the newest in technology (processors, storage, memory, communications, and software) to improve: the application environment by increasing performance; optimizing processor utilization through workload management, scalability and reliability; increasing organizational efficiency by reducing costs of hardware, software and staff; and reducing both the number and the impact of system outages regardless of the underlying reason," said Kusnetzky.

• At a recent Gartner Symposium/ITxpo, Gartner Inc. vice president John Enck called virtualization a "megatrend."

• "We see virtualization being extremely important across all server types" and "virtualization is the best tool you have right now in the market to increase efficiency and drive up the utilization of your servers," said Enck.

• What all this boils down to is that virtualization should make today's more powerful computers more productive while simultaneously making them easier and cheaper to manage.

• The trick is how to make this happen.

Page 12: Slides on Virtualization and Xen.

ComputerWorld, 11/21/2005

• NOVEMBER 21, 2005 (IDG NEWS SERVICE) - A recent survey of 100 IT executives predicts that IT spending will decrease slightly in 2006 as more businesses worry about global economic conditions, but security software and enterprise IT upgrades remain top concerns.

• Macroeconomic factors such as high oil prices and a devastating hurricane season in the U.S. have caused 40% of the executives surveyed by Goldman, Sachs & Co. to consider reducing their 2006 IT budgets, according to survey results released Friday. Most executives, 52%, believe IT spending will be unchanged in 2006.

• Security software has been a long-running priority among the executives on Goldman’s survey panel, and nothing has changed that mind-set based on the current results. Spending on antivirus products has eased up after a flurry of activity, but CIOs continue to focus on improving security in areas like identity management and regulatory compliance, the survey said.

• Other enterprise software priorities include enterprise resource management and customer relationship management systems, with CIOs upgrading those two categories to top priorities. When Goldman polled its panel in April, ERP and CRM software were considered only medium priorities.

• Among enterprise software vendors, VMware Inc. and SAP AG were the two most cited companies receiving a larger percentage of the respondents’ IT budgets. Virtualization technologies are a hot topic this year as Intel Corp. and Advanced Micro Devices Inc. prepare chips that improve the performance of virtualization software. Respondents listed Novell Inc. and Computer Associates International Inc. as receiving less of their IT budgets.

Page 13: Slides on Virtualization and Xen.

ZDNet Blog 11/14/05

• “With virtual machines of the desktop sort that VW5 enables, PC users can literally carve their desktop and notebook systems into completely separate instances of Windows that run side-by-side with each other as though the other instances don't exist. In other words, if some process in one tries some sort of security exploit like a buffer overflow, it can't get to the others any more than a buffer overflow could affect another computer across the network. It can only get to whatever is running in that instance or "partition of Windows." The idea of partitioning systems in this way makes it possible to dedicate partitions to specific activities. For example, you can do all your Web browsing in one partition while you run your corporate applications in another and your personal applications like Quicken in a third and never the three shall meet. I'm a Firefox user. But for those Web sites that require Internet Explorer (which I'm always nervous about using), I just run it in a separate partition. Using a virtual machine for just one application is like driving on a completely empty road with airbags.”

Page 14: Slides on Virtualization and Xen.

More …

• Intel has announced the arrival of the first desktop chips to include its hardware-based virtualization technology known as VT (codenamed Vanderpool). This could very well signal a new era in desktop/notebook computing and I would think long and hard before buying a new system that doesn't include this new and worthwhile technology.

Page 15: Slides on Virtualization and Xen.

• So, why is the Intel announcement so significant? Until Intel started releasing its VT technology (it first debuted in the company's recently announced Paxville XEON server chips), companies like SWSoft, VMWare, and Microsoft had to do a lot of the virtual machine heavy lifting in their software.  Without any hardware assistance the likes of which VT provides, it takes far more in the way of physical resources (processor, memory) to launch and run virtual machines than it does if those instantiations can be activated through hardware.  While such technologies make it easier for competing virtual machine software solutions like Xen to get in the virtual machine game, Raghu Raghuram, VMware's senior director of strategy and marketing,  told me earlier this year that his company welcomes innovations like VT because end users will get better performance and his company can focus its attention on adding value in higher layers of the virtualization stack such as management.   VMWare is wasting no time in rolling out its support for Intel's VT technology.  According to a press release on its Web site, VT support is being beta tested in version 5.5 of VMWare Workstation, which the company expects to release by the end of the year.

Page 16: Slides on Virtualization and Xen.

Diane Greene, President, VMware

• To start out, why don't you describe what your company does? VMware produces virtualization software. What that means is we take a physical x86-based system and we provide the multiple isolated, movable partitions that you can run operating systems with their applications in. In terms of what the customer gets, they get a way to drive utilization from, say, 15 percent, on up to 85 percent. They get very cost-effective ways to do disaster recovery, high availability, provisioning--all sorts of system-level services.

• Pick a typical customer. What's their life before and after VMware? What changes? A typical customer has got widely proliferated x86 machines, and depending on the power of the server, they can get a 10-to-1, 4-to-1 reduction in the number of servers they need. Or they can stop that proliferation and contain it better. And beforehand, to bring a new service online you have to go order the machine, install it in the server room, get it network-connected, make sure the power is there--it can be a multi-month process. Post-VMware, all they do is keep pre-built images of different software services like SQL Server, and when someone needs that service, they just find some excess capacity somewhere and deploy it.

• So what's the penalty? Why doesn't everybody do this? Actually, what we were finding is that for people who use it, it's become the default way that they run their x86 workloads.

Page 17: Slides on Virtualization and Xen.

OK, I’m convinced

• So what do we do?

• First let’s think about high-level challenges and approaches.

Page 18: Slides on Virtualization and Xen.

High-level Challenges

• VMs must be isolated from each other: it is not acceptable for execution of one to adversely affect performance of the other.

– Have to think about what this really means.

• Support variety of OSs.

• Performance overhead introduced by virtualization should be small.

Page 19: Slides on Virtualization and Xen.

Approaches

• Full Virtualization:

– Virtual hardware exposed is functionally identical to the underlying machine.

• Allows unmodified operating systems to be hosted.

• Seems like this is what VMware supports.

Page 20: Slides on Virtualization and Xen.

Drawbacks of Full Virtualization

• Especially on x86 architecture:

• Support for full virtualization was never part of the x86 design, e.g. certain supervisor instructions would need to be handled by the VMM for correct virtualization, but executing them with insufficient privilege fails silently as opposed to causing a nice trap.

• Virtualizing the x86 MMU is also a challenge.

– VMware ESX Server dynamically rewrites portions of the hosted machine code to insert traps wherever VMM intervention might be required. This is applied to the entire guest OS kernel, since all non-trapping privileged instructions must be caught and handled.

– ESX maintains shadow versions of system structures such as page tables and keeps them consistent with the virtual tables by trapping every update attempt, which is costly for update-intensive operations such as creating a new application process.

Page 21: Slides on Virtualization and Xen.

More arguments against Full Virtualization

• Sometimes it is desirable for hosted OS to see real as well as virtual resources:

– Providing both real and virtual time allows a guest OS to better support time-sensitive tasks and to correctly handle TCP timeouts and RTT estimates.

– Exposing real machine addresses allows a guest OS to improve performance by using superpages or page coloring.

Page 22: Slides on Virtualization and Xen.

Xen Approach: Paravirtualization

• Present a virtual machine abstraction that is similar but not identical to the underlying hardware.

– Requires modifications to the guest OS.

– No changes to the application binary interface (ABI), so no modifications needed to applications.

Page 23: Slides on Virtualization and Xen.

Xen Design Principles

1. Support for unmodified application binaries is essential.

2. Need to support full multi-application operating systems.

3. Paravirtualization is necessary to obtain high performance and strong resource isolation on uncooperative machine architectures such as x86.

4. Even on cooperative machine architectures, completely hiding the effects of resource virtualization from guest OSes risks both correctness and performance.

Page 24: Slides on Virtualization and Xen.

Terminology

• Guest OS: one of the OSs that Xen can host.

• Domain: running virtual machine within which a guest OS executes.

• Xen itself is called the hypervisor since it operates at a higher privilege level than the supervisor code of the guest operating systems that it hosts.

Page 25: Slides on Virtualization and Xen.

Xen’s Paravirtualized (x86) Interface

• Need to discuss:

– Memory management

– CPU

– Device I/O

Page 26: Slides on Virtualization and Xen.

Memory Management

• Hardest part.

• Easier if:

– The architecture provides a software-managed TLB, as these can be easily virtualized.

– The TLB is tagged: an address-space identifier tag is associated with each TLB entry, which allows the hypervisor and each guest OS to coexist efficiently in separate address spaces, with no need to flush the entire TLB when transferring execution.

Page 27: Slides on Virtualization and Xen.

(What’s a TLB?)

• Short for translation look-aside buffer: a small, fast cache inside the processor’s memory management unit that holds recently used translations from a program’s virtual addresses to the corresponding addresses in physical memory. The TLB speeds up execution because a hit returns the translation immediately, without walking the page tables in memory.

Page 28: Slides on Virtualization and Xen.

• Unfortunately x86 does not have a software-managed TLB: TLB misses are serviced automatically by the processor by walking the page table structure in hardware.

• Thus to achieve best possible performance, all valid page translations for the current address space should be present in the hardware-accessible page table.

• Moreover, because the TLB is not tagged, address space switches require a complete TLB flush.
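
To make the cost of an untagged TLB concrete, here is a minimal Python sketch. The `TLB` class, its `tagged` flag, and the toy page tables are illustrative assumptions and do not model any real x86 part. The only point is that an untagged TLB has to be emptied on every address-space switch, while a tagged one can keep entries for several address spaces at once.

```python
class TLB:
    """Toy translation look-aside buffer (hypothetical model, unbounded size)."""

    def __init__(self, tagged):
        self.tagged = tagged
        self.entries = {}      # (tag, virtual page) -> physical frame
        self.asid = None       # current address-space identifier
        self.misses = 0

    def switch(self, asid):
        # Without tags, entries from the old address space must be discarded.
        if not self.tagged and asid != self.asid:
            self.entries.clear()
        self.asid = asid

    def translate(self, vpage, page_table):
        # An untagged TLB keys entries by page only (tag fixed to None).
        key = (self.asid if self.tagged else None, vpage)
        if key not in self.entries:
            self.misses += 1                      # miss: walk the page table
            self.entries[key] = page_table[vpage]
        return self.entries[key]


def run(tagged):
    """Bounce three times between a guest and the hypervisor, count misses."""
    tables = {"guest": {0: 7, 1: 8}, "hypervisor": {0: 3}}
    tlb = TLB(tagged)
    for _ in range(3):
        for asid, pages in (("guest", (0, 1)), ("hypervisor", (0,))):
            tlb.switch(asid)
            for page in pages:
                tlb.translate(page, tables[asid])
    return tlb.misses
```

In this toy workload the tagged TLB misses only on first touch of each page, while the untagged one misses again after every switch.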

Page 29: Slides on Virtualization and Xen.

• Given these limitations, two decisions:

– Guest OS has direct read access to hardware page tables, but updates are batched and validated by the hypervisor.

– Xen exists in a 64MB section at the top of every address space, thus avoiding a TLB flush when entering and leaving the hypervisor.
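
The first decision can be sketched as follows. This is a minimal Python illustration, not Xen's interface: the class names, `apply_batch`, and the single validation rule (a domain may only map frames it owns) are assumptions chosen to show the shape of "read directly, write through a validated batch".

```python
class Hypervisor:
    def __init__(self, frame_owner):
        self.frame_owner = frame_owner   # machine frame -> owning domain id
        self.page_tables = {}            # domain id -> {virtual page: frame}

    def apply_batch(self, dom, updates):
        """Validate and apply a batch of (vpage, frame) updates in one call.

        Returns the updates that were rejected.
        """
        table = self.page_tables.setdefault(dom, {})
        rejected = []
        for vpage, frame in updates:
            if self.frame_owner.get(frame) == dom:   # assumed validation rule
                table[vpage] = frame
            else:
                rejected.append((vpage, frame))
        return rejected


class Guest:
    def __init__(self, dom, hypervisor):
        self.dom = dom
        self.hv = hypervisor
        self.pending = []                # updates queued but not yet applied

    def read_mapping(self, vpage):
        # Reads go straight to the page table, with no hypervisor call.
        return self.hv.page_tables.get(self.dom, {}).get(vpage)

    def queue_update(self, vpage, frame):
        self.pending.append((vpage, frame))

    def flush(self):
        # One entry into the hypervisor covers the whole batch.
        rejected = self.hv.apply_batch(self.dom, self.pending)
        self.pending = []
        return rejected
```

Batching amortizes the cost of entering the hypervisor over many updates, which matters for update-heavy operations such as process creation.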

Page 30: Slides on Virtualization and Xen.

CPU

• OS is no longer the most privileged entity in the system. Guest OS must run at a lower privilege level than Xen.

– x86 has 4 privilege levels (rings 0 to 3), and rings 1 and 2 are typically unused, so this is OK: Xen runs in ring 0 and the guest OS in ring 1.

– Guest OS can’t execute privileged instructions, but is still protected from applications, which run at privilege level 3.

– Privileged instructions are “paravirtualized” by requiring them to be validated and executed within Xen.

Page 31: Slides on Virtualization and Xen.

CPU

• Exceptions (e.g. memory faults, software traps): Guest OS must register a descriptor table for exception handlers with Xen.

– Handlers are usually the same as on real x86 hardware. The page fault handler would normally read the faulting address from a privileged register (CR2), so this needs a workaround.

– Only two types of exceptions are frequent enough to affect performance: system calls and page faults.

• System calls: Guest OS may install a “fast” handler for system calls, allowing direct calls from an application into its guest OS and avoiding indirecting through Xen on every call.

• Page faults: can’t use the same technique, since only code executing in ring 0 can read the faulting address from register CR2.
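
The difference between the two paths can be sketched in Python. The `Xen` class, `register_table`, and `raise_exception` are hypothetical names used only for illustration. The vector numbers are the usual x86 ones (14 for page fault, 0x80 for the Linux system call interrupt). Passing the faulting address as an argument is how this sketch models the workaround for CR2.

```python
PAGE_FAULT, SYSCALL = 14, 0x80   # x86 vector numbers


class Xen:
    def __init__(self):
        self.handlers = {}        # vector -> guest handler
        self.fast_syscall = None  # handler that may be entered directly
        self.bounces = 0          # exceptions that were routed through Xen

    def register_table(self, table):
        """Guest registers its exception handlers with the hypervisor."""
        self.handlers = dict(table)
        # Only the system call vector gets the fast path in this sketch.
        self.fast_syscall = self.handlers.get(SYSCALL)

    def raise_exception(self, vector, cr2=None):
        """Model the hardware delivering an exception."""
        if vector == SYSCALL and self.fast_syscall is not None:
            return self.fast_syscall()          # straight into the guest OS
        self.bounces += 1                       # everything else goes via Xen
        if vector == PAGE_FAULT:
            # Xen (ring 0) reads CR2 and hands the address to the guest.
            return self.handlers[vector](cr2)
        return self.handlers[vector]()
```

A system call never increments `bounces`, while a page fault always does, which mirrors why only system calls get the fast handler.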

Page 32: Slides on Virtualization and Xen.

CPU

• Hardware interrupts replaced with a lightweight event system.

• Each guest OS has a timer interface and is aware of both ‘real’ and ‘virtual’ time.

Page 33: Slides on Virtualization and Xen.

Device I/O

• Xen exposes a set of clean and simple device abstractions.

– Efficient

– Allows protection and isolation

• I/O Data transferred to and from each domain via Xen, using shared-memory asynchronous buffer rings.

• Lightweight event delivery mechanism used for sending asynchronous notifications to a domain.

Page 34: Slides on Virtualization and Xen.

Control and Management

• “Separate policy from mechanism”

– Keep the hypervisor out of policy decisions as much as possible.

• Hypervisor provides only basic control operations.

– Exported through an interface accessible only from authorized domains.

• A domain that is permitted to use the control interface is created at boot time. This domain (Domain0) is responsible for hosting application-level management software.

– Control interface allows creation and termination of other domains and control of their scheduling parameters, physical memory allocation and the access they are given to the machine’s physical disks and network devices.

• Control interface exported to a suite of application-level management software running in Domain0.

– Tools allow creation and destruction of domains, setting of network filters and routing rules, and creation and deletion of virtual network interfaces and virtual block devices.
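
As a rough illustration of "mechanism in the hypervisor, policy in Domain0", here is a minimal Python sketch. `ControlInterface` and its methods are hypothetical and far simpler than the real control interface; the sketch only shows that the operations exist in the hypervisor while the right to call them is restricted to authorized domains.

```python
class ControlInterface:
    def __init__(self):
        self.domains = {0: {"memory_kb": 0}}   # Domain0 exists from boot
        self.authorized = {0}                  # domains allowed to call in
        self.next_id = 1

    def _check(self, caller):
        if caller not in self.authorized:
            raise PermissionError(
                f"domain {caller} may not use the control interface")

    def create_domain(self, caller, memory_kb):
        self._check(caller)
        dom_id = self.next_id
        self.next_id += 1
        self.domains[dom_id] = {"memory_kb": memory_kb}
        return dom_id

    def destroy_domain(self, caller, dom_id):
        self._check(caller)
        del self.domains[dom_id]
```

Decisions such as when to create a domain or how much memory to give it would live in the management tools running in Domain0, not in this class.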

Page 36: Slides on Virtualization and Xen.

Cost of Porting

• Linux: about 1.36% of the x86 code base had to be modified or added for the port.

Page 37: Slides on Virtualization and Xen.

Detailed Design

• Control Transfer

• Data Transfer

• Subsystem Virtualization

Page 38: Slides on Virtualization and Xen.

Control Transfer

• Synchronous calls from a domain to Xen made using a hypercall.

– Domain can perform a synchronous software trap into the hypervisor to do a privileged operation.

• Notifications delivered to domains from Xen using asynchronous event mechanism.

– Small number of events, e.g. new data received, virtual disk request has been completed.
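
The two directions of control transfer can be sketched in Python. The names here (`Xen.hypercall`, `send_event`, the `get_time` entry and the event strings) are made up for illustration. The point is the asymmetry: a hypercall returns its result to the caller immediately, while an event is only marked pending and is handled when the domain next processes its events.

```python
class Domain:
    def __init__(self):
        self.pending = set()   # events posted by Xen, not yet handled
        self.handled = []      # events the guest has processed, in order

    def handle_pending(self):
        # Stand-in for the guest's event callback.
        for event in sorted(self.pending):
            self.handled.append(event)
        self.pending.clear()


class Xen:
    def __init__(self):
        # Hypothetical hypercall table with a single entry.
        self.hypercalls = {"get_time": lambda: 42}

    def hypercall(self, name, *args):
        # Synchronous: the caller gets the result before continuing.
        return self.hypercalls[name](*args)

    def send_event(self, domain, event):
        # Asynchronous: just mark the event pending. Posting the same
        # event twice before it is handled leaves a single pending entry.
        domain.pending.add(event)
```

Keeping pending events in a set echoes the "small number of events" on the slide: an event says that something of a given type happened, not how many times.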

Page 39: Slides on Virtualization and Xen.

Data Transfer: I/O Rings

The presence of a hypervisor means there is an additional protection domain between guest OSes and I/O devices, so it is crucial that a data transfer mechanism be provided that allows data to move vertically through the system with as little overhead as possible.

Two main factors have shaped the design of the I/O-transfer mechanism: resource management and event notification. For resource accountability, attempt to minimize the work required to demultiplex data to a specific domain when an interrupt is received from a device. The overhead of managing buffers is carried out later, where computation may be accounted to the appropriate domain. Similarly, memory committed to device I/O is provided by the relevant domains wherever possible to prevent the crosstalk inherent in shared buffer pools; I/O buffers are protected during data transfer by pinning the underlying page frames within Xen.
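
A ring of this kind can be sketched in Python as a fixed array with producer and consumer counters. The `IORing` class and its four counters are an illustrative assumption about the layout, and the sketch returns responses in request order for simplicity, whereas the slides later note that responses may come back out of order.

```python
class IORing:
    """Fixed-size ring shared by a guest (requests) and Xen (responses)."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0   # advanced by the guest when it queues a request
        self.req_cons = 0   # advanced by Xen when it takes a request
        self.rsp_prod = 0   # advanced by Xen when it writes a response
        self.rsp_cons = 0   # advanced by the guest when it reads a response

    def put_request(self, req):
        # A slot is busy until the guest has consumed its response.
        if self.req_prod - self.rsp_cons >= self.size:
            return False                      # ring full
        self.slots[self.req_prod % self.size] = req
        self.req_prod += 1
        return True

    def service(self, handler):
        """Xen side: consume outstanding requests, write responses in place."""
        while self.req_cons < self.req_prod:
            i = self.req_cons % self.size
            self.slots[i] = handler(self.slots[i])
            self.req_cons += 1
            self.rsp_prod += 1

    def get_responses(self):
        """Guest side: collect every response produced so far."""
        out = []
        while self.rsp_cons < self.rsp_prod:
            out.append(self.slots[self.rsp_cons % self.size])
            self.rsp_cons += 1
        return out
```

Because the guest supplies the slots and only counters are exchanged, several requests can be queued before the other side is notified, which keeps the per-request overhead low.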

Page 42: Slides on Virtualization and Xen.

Subsystem: CPU Scheduling

• Uses Borrowed Virtual Time (BVT) algorithm.

– Has low-latency wakeup of a domain when it receives an event.

– Fast dispatch important to minimize effect of virtualization on OS subsystems that need to run in a timely fashion, e.g. TCP relies on timely delivery of acknowledgements to estimate round-trip times.

– BVT uses virtual-time warping, which temporarily violates ideal “fair sharing” to favor recently-woken domains.
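
The warping idea can be sketched in a few lines of Python. This is a simplified reading of BVT, not Xen's scheduler: the `Dom` class, the `warp` amounts, and the rule that warping ends after one time slice are assumptions. Each domain accumulates virtual time as it runs, the scheduler picks the domain with the smallest effective virtual time, and a warped domain has its effective time temporarily reduced.

```python
class Dom:
    def __init__(self, name, weight=1, warp=0):
        self.name = name
        self.weight = weight    # larger weight means a larger CPU share
        self.warp = warp        # how far it may jump ahead while warped
        self.avt = 0            # actual virtual time consumed so far
        self.warped = False     # set when the domain wakes on an event

    @property
    def evt(self):
        """Effective virtual time used for scheduling decisions."""
        return self.avt - (self.warp if self.warped else 0)


def pick_next(domains):
    # Run the domain that is furthest behind in effective virtual time.
    return min(domains, key=lambda d: d.evt)


def run(dom, time_units):
    # Virtual time advances more slowly for heavier-weighted domains.
    dom.avt += time_units / dom.weight
    dom.warped = False          # simplification: warp lasts one slice
```

For example, a domain that has already used more CPU than its peer would normally wait, but setting `warped` on wakeup lets it run first once, after which it pays back the borrowed time.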

Page 43: Slides on Virtualization and Xen.

Subsystem: Time and Timers

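• Xen provides guest OSes with notions of:

– Real time: advances continuously, counted from machine boot

– Virtual time: advances only while the domain is executing

– Wall-clock time: an offset added to real time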

• Each guest OS can program a pair of alarm timers, one for real time and one for virtual time.

• Timeouts delivered using Xen’s event mechanism.
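
The three notions of time and the pair of alarm timers can be sketched in Python. `DomainClock` and its fields are hypothetical, and time is counted in abstract units. The sketch assumes virtual time advances only while the domain is running and that a fired timer is delivered as an event and then disarmed.

```python
class DomainClock:
    def __init__(self, wall_offset):
        self.real = 0                 # always advances
        self.virtual = 0              # advances only while the domain runs
        self.wall_offset = wall_offset
        self.real_alarm = None        # deadline in real time, or None
        self.virtual_alarm = None     # deadline in virtual time, or None
        self.events = []              # timeouts delivered as events

    @property
    def wall_clock(self):
        return self.real + self.wall_offset

    def tick(self, elapsed, running):
        self.real += elapsed
        if running:
            self.virtual += elapsed
        if self.real_alarm is not None and self.real >= self.real_alarm:
            self.events.append("real_timer")
            self.real_alarm = None
        if (self.virtual_alarm is not None
                and self.virtual >= self.virtual_alarm):
            self.events.append("virtual_timer")
            self.virtual_alarm = None
```

With both alarms set to the same value, the real-time alarm fires first whenever the domain has been descheduled for part of the interval.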

Page 44: Slides on Virtualization and Xen.

Virtual Address Translation

• Xen tries to virtualize this with as little overhead as possible.

– Harder due to x86’s use of hardware page tables.

– VMware: provide each guest OS with a virtual page table, not visible to the memory management unit. Hypervisor responsible for trapping accesses to the virtual page table, validating updates, and propagating changes back and forth between it and the MMU-visible “shadow” page table.

• Full virtualization forces use of shadow page tables; Xen is not so constrained.

• Xen only involved in page table updates to prevent guest OSes from making unacceptable changes.

• Approach: Register guest OS page tables directly with MMU, and restrict guest OSes to read-only access.

Page 45: Slides on Virtualization and Xen.

Physical Memory

• Initial memory allocation or reservation for each domain is specified at the time of its creation. Memory statically partitioned between domains, providing strong isolation.

• Maximum-allowable reservation also specified: if memory pressure in a domain increases, it may then attempt to claim additional memory pages from Xen, up to the limit.

• If a domain wants to save resources, can release pages back to Xen.
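
The reservation rules above can be sketched in Python. `MemoryManager` and its method names are hypothetical, and memory is counted in abstract pages. The sketch assumes a claim is granted only up to the domain's maximum reservation and only from pages that are currently free.

```python
class MemoryManager:
    def __init__(self, total_pages):
        self.free = total_pages
        self.domains = {}     # name -> [current pages, maximum pages]

    def create(self, name, initial, maximum):
        if initial > self.free or initial > maximum:
            raise ValueError("cannot satisfy initial reservation")
        self.free -= initial
        self.domains[name] = [initial, maximum]

    def claim(self, name, pages):
        """Domain asks for more pages; returns how many were granted."""
        current, maximum = self.domains[name]
        granted = max(0, min(pages, maximum - current, self.free))
        self.domains[name][0] += granted
        self.free -= granted
        return granted

    def release(self, name, pages):
        """Domain hands pages back; returns how many were returned."""
        returned = min(pages, self.domains[name][0])
        self.domains[name][0] -= returned
        self.free += returned
        return returned
```

A claim can therefore fall short for two different reasons: the machine has no free pages left, or the domain has reached its own limit.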

Page 46: Slides on Virtualization and Xen.

• XenoLinux implements a balloon driver, which adjusts a domain’s memory usage by passing memory pages back and forth between Xen and XenoLinux’s page allocator.

• Although the Linux MM routines could be modified directly, the balloon driver makes adjustments by using existing OS functions, thus simplifying the Linux porting effort.

• Paravirtualization could be used to extend the capabilities of this driver: e.g. out-of-memory handling mechanism in the guest OS can be modified to automatically alleviate memory pressure by requesting more memory from Xen.

Page 47: Slides on Virtualization and Xen.

Network

• Xen provides abstraction of virtual firewall-router where each domain has 1 or more network interfaces.

• Rules for transmit/receive/whatever.

Page 48: Slides on Virtualization and Xen.

Disk

• Only Domain0 has direct access to physical disks.

• All other domains access disk through abstraction of virtual block devices (VBDs).

• Domain0 manages the VBDs, which keeps mechanisms in Xen very simple.

• VBD comprises a list of extents with associated ownership and access control information.

• Guest OS disk scheduling algorithm will reorder requests prior to queueing them on the ring in an attempt to reduce response time or to supply differentiated service.

• Xen has more complete knowledge of actual disk layout, so we support reordering within Xen, and responses may come back out of order.

• Xen services batches of requests from competing domains in a simple round-robin fashion; these are then passed to a standard elevator scheduler before reaching disk hardware. Domains can pass down reorder barriers to prevent reordering.
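
The last bullet can be sketched in Python as two stages. The function names, the batch size, and the particular elevator variant (one upward sweep from the current head position, then the remaining requests in ascending order) are assumptions for illustration, and reorder barriers are not modeled.

```python
from collections import deque


def elevator(batch, head=0):
    """Order (domain, sector) requests: sweep upward from head, then wrap."""
    ahead = sorted((r for r in batch if r[1] >= head), key=lambda r: r[1])
    behind = sorted((r for r in batch if r[1] < head), key=lambda r: r[1])
    return ahead + behind


def schedule(queues, batch_size=2, head=0):
    """Take a batch from each domain in turn, then elevator-sort each round."""
    queues = {dom: deque(sectors) for dom, sectors in queues.items()}
    issued = []
    while any(queues.values()):
        round_ = []
        for dom, q in queues.items():            # round-robin over domains
            for _ in range(min(batch_size, len(q))):
                round_.append((dom, q.popleft()))
        for req in elevator(round_, head):
            issued.append(req)
            head = req[1]                        # head follows the last request
    return issued
```

Round-robin batching bounds how long any one domain can monopolize the disk, while the elevator pass within each round recovers some of the seek efficiency that strict per-domain ordering would lose.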
