Page 1: SystemVM2.ppt

Chapter 8: System Virtual Machines

2005.11.9

Dong In Shin

Distributed Computing System Laboratory
Seoul National Univ.

System VMs

Page 2: SystemVM2.ppt

Contents

1. Performance Enhancement of System VMs
2. Case Study: VMware Virtual Platform
3. Case Study: The Intel VT-x Technology
4. Case Study: Xen

Page 3: SystemVM2.ppt

Performance Enhancement of System Virtual Machines

Page 4: SystemVM2.ppt

Reasons for Performance Degradation

- Setup
- Emulation: some guest instructions need to be emulated (usually via interpretation) by the VMM.
- Interrupt handling
- State saving
- Bookkeeping (e.g., the accounting of time charged to a user)
- Time elongation

Page 5: SystemVM2.ppt

Instruction Emulation Assists

The VMM emulates a privileged instruction using a routine whose operation depends on whether the virtual machine is supposed to be executing in system mode or in user mode. A hardware assist checks the state and performs the actions.
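The mode-dependent emulation routine described above can be sketched as follows (an illustrative Python model, not from the text; the instruction choice and state layout are assumptions):

```python
# Toy model of mode-dependent privileged-instruction emulation: the VMM's
# routine for a privileged instruction (here, "disable interrupts") acts on
# the *virtual* CPU state, and behaves differently depending on whether the
# virtual machine is logically in system mode or user mode.

def emulate_cli(vcpu):
    """Emulate 'disable interrupts' against the virtual CPU state."""
    if vcpu["mode"] == "system":
        vcpu["interrupts_enabled"] = False   # honor it for the virtual CPU
        return "done"
    else:
        # A guest user-mode program may not disable interrupts; the VMM
        # reflects a protection fault back into the guest instead.
        return "deliver_protection_fault"

vcpu = {"mode": "system", "interrupts_enabled": True}
print(emulate_cli(vcpu), vcpu["interrupts_enabled"])

vcpu_user = {"mode": "user", "interrupts_enabled": True}
print(emulate_cli(vcpu_user))
```

A hardware assist would perform exactly this mode check and state update without a trap into VMM software.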

Page 6: SystemVM2.ppt

Virtual Machine Monitor Assists

- Context switch: using hardware to save and restore registers.
- Decoding of privileged instructions: hardware assists, such as decoding the privileged instructions.
- Virtual interval timer: decrementing the virtual counter by some amount estimated by the VMM from the amount that the real timer decrements.
- Adding to the instruction set: a number of new instructions that are not part of the ISA of the machine.

Page 7: SystemVM2.ppt

Improving Performance of the Guest System

Non-paged mode: the guest OS disables dynamic address translation and defines its real address space to be as large as the largest virtual address space. Page frames are mapped to fixed real pages.
- The guest OS no longer has to exercise demand paging.
- No double paging.
- No potential conflict in paging decisions by the guest OS and the VMM.

Page 8: SystemVM2.ppt

Double Paging

Two independent layers of paging interact and perform poorly:
- The guest OS incorrectly believes a page to be in physical memory (green/gold pages).
- The VMM believes an unneeded page is still in use (teal pages).
- The guest evicts a page despite available physical memory (red pages).
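The cost of the two independent layers can be illustrated with a toy simulation (a deliberately simplified sketch, not from the text): each layer runs its own LRU policy over the same access stream, so every miss is paid twice.

```python
# Two independent LRU paging layers over the same accesses: the guest pages
# its virtual memory over "guest real" memory, and the VMM independently
# pages guest real memory over host memory. Neither sees the other's
# decisions, so the same miss pattern triggers I/O in both layers.

class Pager:
    """One LRU paging layer with a fixed number of frames."""
    def __init__(self, frames):
        self.frames = frames
        self.resident = []   # pages currently held, LRU order (front = oldest)
        self.io_ops = 0      # swap-ins + swap-outs performed by this layer

    def touch(self, page):
        if page in self.resident:
            self.resident.remove(page)       # hit: refresh LRU position
        else:
            self.io_ops += 1                 # miss: swap-in from backing store
            if len(self.resident) == self.frames:
                self.resident.pop(0)         # evict the LRU page...
                self.io_ops += 1             # ...at the cost of a swap-out
        self.resident.append(page)

guest = Pager(frames=2)   # guest OS paging layer
vmm = Pager(frames=2)     # VMM paging layer

# Each guest memory access is seen by both layers independently.
for page in [1, 2, 3, 1, 2, 3]:
    guest.touch(page)
    vmm.touch(page)

print("guest I/O ops:", guest.io_ops)
print("vmm I/O ops:", vmm.io_ops)
```

The total I/O is twice what a single coordinated layer would pay, which is why the non-paged-mode and ballooning techniques in this chapter try to collapse or coordinate the two layers.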

Page 9: SystemVM2.ppt

Pseudo-Page-Fault Handling

A page fault in a VM system is either a page fault in some VM's page table or a page fault in the VMM's page table.

Pseudo-page-fault handling process:
- The VMM initiates a page-in operation from the backing store and triggers a guest "pseudo page fault".
- The guest OS suspends the faulting user process; the VMM does not suspend the guest.
- On completion of the page-in operation, the VMM calls the guest pseudo-page-fault handler again, and the guest OS handler wakes up the blocked user process.
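The protocol can be sketched as a small event sequence (illustrative Python; the class and method names are my own, not part of any real interface):

```python
# Pseudo-page-fault sketch: on a VMM-level page fault, the VMM notifies the
# guest instead of stalling it. The guest blocks only the faulting process
# and keeps running others; a second notification wakes the process once
# the VMM's page-in completes.

class Guest:
    def __init__(self, processes):
        self.runnable = list(processes)
        self.blocked = {}            # page number -> process waiting on it
        self.log = []

    def pseudo_page_fault(self, proc, page):
        # Guest OS suspends only the faulting process.
        self.runnable.remove(proc)
        self.blocked[page] = proc
        self.log.append(f"{proc} blocked on page {page}")

    def page_in_complete(self, page):
        # Completion notification: wake the blocked process.
        proc = self.blocked.pop(page)
        self.runnable.append(proc)
        self.log.append(f"{proc} resumed")

    def run_one(self):
        if self.runnable:
            self.log.append(f"ran {self.runnable[0]}")

guest = Guest(["A", "B"])
guest.pseudo_page_fault("A", page=7)  # VMM starts the page-in, notifies guest
guest.run_one()                       # guest keeps running process B meanwhile
guest.page_in_complete(7)             # VMM signals completion; A wakes up
print(guest.log)
```

The payoff is the middle step: without the pseudo fault, the whole virtual machine would have been suspended for the duration of the page-in.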

Page 10: SystemVM2.ppt

The Others

- Spool files: without any special mechanism, the VMM must intercept the I/O commands and decipher that the virtual machines are simultaneously attempting to send a job to the I/O devices. Handshaking allows the VMM to pick up the spool file and merge it into its own buffer.
- Inter-virtual-machine communication: communication between two physical machines involves processing message packets through several layers on the sender and receiver sides. This process can be streamlined, simplified, and made faster if the two machines are virtual machines on the same host platform.

Page 11: SystemVM2.ppt

Specialized Systems

- Virtual-equals-real (V=R) virtual machine: the host address space representing the guest real memory is mapped one-to-one to the host real memory address space.
- Shadow-table bypass assist: the guest page tables can point directly to physical addresses if the dynamic address translation hardware is allowed to manipulate the guest page tables.
- Preferred-machine assist: allows a guest OS to operate in system mode rather than user mode.
- Segment sharing: sharing the code segments of the operating system among the virtual machines, provided the operating system code is written in a reentrant manner.

Page 12: SystemVM2.ppt

Generalized Support for Virtual Machines

Interpretive Execution Facility (IEF): the processor directly executes most of the functions of the virtual machine in hardware; an extreme case of a VM assist.

Interpretive execution entry and exit:
- Entry via Start Interpretive Execution (SIE): the software gives up control to the hardware IEF, and the processor enters interpretive execution mode.
- Exit on a host interrupt, or on an interception:
  - unsupported hardware instructions
  - an exception during the execution of an interpreted instruction
  - some special cases

Page 13: SystemVM2.ppt

Interpretive Execution Entry and Exit

[Diagram: VMM software enters interpretive execution mode via SIE; an exit for interception returns to the VMM for emulation; an exit for a host interrupt goes to the host interrupt handler.]

Page 14: SystemVM2.ppt

Full Virtualization Versus Paravirtualization

Full virtualization
- Provides a total abstraction of the underlying physical system and creates a complete virtual system in which the guest operating system can execute.
- No modification is required in the guest OS or application; neither is aware of the virtualized environment.
- Advantages: streamlines the migration of applications and workloads between different physical systems; complete isolation of different applications, which makes this approach highly secure.
- Disadvantages: performance penalty.
- Examples: Microsoft Virtual Server and VMware ESX Server.

Page 15: SystemVM2.ppt

Full Virtualization Versus Paravirtualization

Paravirtualization
- A virtualization technique that presents a software interface to virtual machines that is similar but not identical to that of the underlying hardware.
- Requires modifications to the guest OSes running on the VMs; the guest OSes are aware that they are executing on a VM.
- Advantages: near-native performance.
- Disadvantages: some limitations, including several insecurities such as guest OS cache data, unauthenticated connections, and so forth.
- Example: the Xen system.

Page 16: SystemVM2.ppt

Case Study: VMware Virtual Platform

Page 17: SystemVM2.ppt

VMware Virtual Platform

- A popular virtual machine infrastructure for IA-32-based PCs and servers.
- An example of a hosted virtual machine system. VMware ESX Server is the native virtualization architecture product; this chapter is limited to the hosted system, VMware GSX Server (VMware 2001).
- Challenges:
  - The IA-32 environment is difficult to virtualize efficiently.
  - The openness of the system architecture.
  - Easy installation.

Page 18: SystemVM2.ppt

VMware's Hosted Virtual Machine Model

Page 19: SystemVM2.ppt

Processor Virtualization

Critical instructions in the Intel IA-32 architecture are not efficiently virtualizable:
- Protection system references: reference the storage protection system, memory system, or address relocation system (e.g., mov ax, cs).
- Sensitive register instructions: read or change resource-related registers and memory locations (e.g., POPF).

Problem: sensitive instructions executed in user mode do not execute as expected unless the instruction is emulated.

Solution: the VM monitor substitutes the instruction with another set of instructions and emulates the action of the original code.
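The substitution idea can be sketched in miniature (a toy model, not VMware's actual binary translator; the instruction names and virtual-CPU layout are assumptions):

```python
# Toy substitution pass: scan guest code for sensitive instructions and
# replace each with a call into a VMM emulation routine that applies the
# operation to the *virtual* CPU state. POPF is the classic example: in
# real user mode it silently drops changes to the interrupt flag, so it
# must be emulated to behave as the guest kernel expects.

SENSITIVE = {"popf"}

def emulate_popf(vcpu):
    # Apply the popped flags word to the virtual CPU in full, including
    # the interrupt-flag bits that user-mode POPF would silently ignore.
    vcpu["flags"] = vcpu["stack"].pop()

def translate(code):
    """Replace sensitive instructions with VMM-call thunks."""
    out = []
    for insn in code:
        if insn in SENSITIVE:
            out.append(("call_vmm", insn))   # emulate in the monitor
        else:
            out.append(("direct", insn))     # safe to execute natively
    return out

translated = translate(["mov", "popf", "add"])
print(translated)

# Executing the translated stream routes the sensitive case to the VMM.
vcpu = {"flags": 0, "stack": [0x202]}
for kind, insn in translated:
    if kind == "call_vmm" and insn == "popf":
        emulate_popf(vcpu)
print(hex(vcpu["flags"]))
```

Innocuous instructions run directly, so the translation cost is paid only on the sensitive minority.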

Page 20: SystemVM2.ppt

Input/Output Virtualization

- The PC platform supports many more devices and device types than any other platform.
- Emulation in the VMMonitor: converting the IN and OUT I/O instructions to new I/O operations; requires some knowledge of the device interfaces.
- New capability for devices through an abstraction layer: the VMApp can insert a layer of abstraction above the physical device.
  - Advantage: reduces performance losses due to virtualization (e.g., a virtual Ethernet switch between a virtual NIC and a physical NIC).

Page 21: SystemVM2.ppt

Using the Services of the Host Operating System

- The request is converted into a host OS call.
- Advantages:
  - No limitations on the VMM's access to the host OS's I/O features.
  - The ability to run performance-critical applications.

Page 22: SystemVM2.ppt

Memory Virtualization

- Paging requests of the guest OS are not directly intercepted by the VMM but converted into disk reads/writes; the VMMonitor translates them into requests on the host OS through the VMApp.
- Page replacement policy of the host OS: the host could replace critical pages of the VM system while competing with other host applications. The VMDriver pins the critical pages of the virtual memory system.

Page 23: SystemVM2.ppt

VMware ESX Server

- Native VM: a thin software layer designed to multiplex hardware resources among virtual machines, providing higher I/O performance and complete control over resource management.
- Full virtualization: for servers running multiple instances of unmodified operating systems.

Page 24: SystemVM2.ppt

Page Replacement Issues

- Problem of double paging: unintended interactions between the native memory-management policies of the guest operating systems and the host system.
- Ballooning:
  - Reclaims the pages considered least valuable by the operating system running in a virtual machine.
  - A small balloon module is loaded into the guest OS as a pseudo-device driver or kernel service.
  - The module communicates with the ESX Server via a private channel.

Page 25: SystemVM2.ppt

Ballooning in VMware ESX Server

- Inflating a balloon (when the server wants to reclaim memory):
  - The driver allocates pinned physical pages within the VM.
  - The increased memory pressure causes the guest OS to reclaim space to satisfy the driver's allocation request.
  - The driver communicates the physical page number of each allocated page to the ESX Server.
- Deflating: frees up memory for general use within the guest OS.
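The inflate/deflate cycle above can be sketched as follows (a simplified model with assumed names, not ESX code; eviction under memory pressure is elided):

```python
# Ballooning sketch: the host cannot tell which guest pages matter, so it
# asks an in-guest balloon driver to pin pages. The guest's own replacement
# policy gives up its least valuable pages to satisfy the allocation, and
# the host repurposes the frames whose numbers the driver reports back.

class GuestOS:
    def __init__(self, total_pages):
        self.free = total_pages      # pages on the guest free list
        self.ballooned = []          # pinned pages handed to the host

    def balloon_inflate(self, n, first_page_number):
        """Pin n pages and report their page numbers to the hypervisor."""
        reclaimed = []
        for i in range(n):
            if self.free == 0:
                break                # a real guest would evict pages here
            self.free -= 1           # balloon takes a page off the free list
            page = first_page_number + i
            self.ballooned.append(page)
            reclaimed.append(page)   # host may now repurpose these frames
        return reclaimed

    def balloon_deflate(self, n):
        """Release n pinned pages back to the guest."""
        for _ in range(min(n, len(self.ballooned))):
            self.ballooned.pop()
            self.free += 1

guest = GuestOS(total_pages=8)
host_reclaimed = guest.balloon_inflate(3, first_page_number=100)
print(host_reclaimed)                # page numbers the host can reuse
guest.balloon_deflate(2)
print(guest.free)
```

The design point is that the *guest's* policy, not the host's, picks the victims, which avoids the double-paging pathologies described earlier.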

Page 26: SystemVM2.ppt

Virtualizing I/O Devices on VMware Workstation

- Supported virtual devices: PS/2 keyboard, PS/2 mouse, floppy drive, IDE controllers with ATA disks and ATAPI CD-ROMs, a SoundBlaster 16 sound card, serial and parallel ports, virtual BusLogic SCSI controllers, AMD PCnet Ethernet adapters, and an SVGA video controller.
- Procedure: intercept I/O operations issued by the guest OS (the IA-32 IN and OUT instructions) and emulate them in either the VMM or the VMApp.
- Drawbacks: virtualizing I/O devices can incur overhead from world switches between the VMM and the host, and from handling the privileged instructions used to communicate with the hardware.

Page 27: SystemVM2.ppt

Case Study: The Intel VT-x (Vanderpool) Technology

Page 28: SystemVM2.ppt

Overview

VT-x (Vanderpool) technology for IA-32 processors enhances the performance of VM implementations through hardware enhancements of the processor.

Main feature: the new VMX mode of operation (VMX root/non-root operation).
- VMX root operation: fully privileged, intended for the VM monitor; adds new VMX instructions.
- VMX non-root operation: not fully privileged, intended for guest software; reduces guest software privilege without relying on rings.
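The transitions between these modes can be modeled as a tiny state machine (an illustrative sketch of the instructions named in this section; real VT-x semantics are far richer):

```python
# State machine over the VMX transitions: VMXON enters root operation,
# VMLAUNCH/VMRESUME enter a guest (non-root), a VM exit returns control to
# the VMM in root operation, and VMXOFF leaves VMX operation entirely.

TRANSITIONS = {
    ("legacy",   "vmxon"):    "root",      # enter VMX root operation
    ("root",     "vmlaunch"): "non-root",  # first entry into a guest
    ("root",     "vmresume"): "non-root",  # subsequent guest entries
    ("non-root", "vmexit"):   "root",      # guest event returns to the VMM
    ("root",     "vmxoff"):   "legacy",    # leave VMX operation
}

def run(events, state="legacy"):
    """Return the sequence of processor modes visited."""
    trace = [state]
    for ev in events:
        state = TRANSITIONS[(state, ev)]
        trace.append(state)
    return trace

print(run(["vmxon", "vmlaunch", "vmexit", "vmresume", "vmexit", "vmxoff"]))
```

Note that every path into a guest goes through root operation, which is what lets the VMM interpose on each entry and exit.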

Page 29: SystemVM2.ppt

Technological Overview

[Diagram: timeline of VMX transitions. The processor enters root mode (VMM) from regular mode via vmxon; vmlaunch starts VM1 or VM2 in non-root operation; each VM exit returns control to the VMM in root mode; vmresume re-enters a VM; vmxoff returns to regular mode.]

Page 30: SystemVM2.ppt

VT-x Operation

[Diagram: IA-32 operation offers rings 0 through 3. Under VT-x, the VMM runs in VMX root operation with rings 0-3, while VM 1 through VM n each run in VMX non-root operation with their own rings 0-3. VMXON/VMLAUNCH/VMRESUME enter a guest; a VM exit returns to root operation; each VM has its own control structure (VMCS1 ... VMCSn).]

Page 31: SystemVM2.ppt

Capabilities of the Technology

- A key aspect: the elimination of the need to run all guest code in user mode.
- Maintenance of state information:
  - A major source of overhead in a software-based solution.
  - A hardware technique allows all of the state-holding data elements to be mapped to their native structures.
  - VMCS (Virtual Machine Control Structure): the hardware implementation takes over the tasks of loading and unloading the state from its physical locations.

Page 32: SystemVM2.ppt

Virtual Machine Control Structure (VMCS)

- Control structures in memory; only one VMCS is active per virtual processor at any given time.
- VMCS payload:
  - VM-execution, VM-exit, and VM-entry controls
  - Guest and host state
  - VM-exit information fields

Page 33: SystemVM2.ppt

Case Study: Xen Virtualization

Page 34: SystemVM2.ppt

Xen Design Principles

- Support for unmodified application binaries is essential.
- Supporting full multi-application operating systems is important.
- Paravirtualization is necessary to obtain high performance and strong resource isolation.

Page 35: SystemVM2.ppt

Xen Features

- Secure isolation between VMs
- Resource control and QoS
- Only the guest kernel needs to be ported; all user-level apps and libraries run unmodified (Linux 2.4/2.6, NetBSD, FreeBSD, WinXP)
- Execution performance is close to native
- Live migration of VMs between Xen nodes

Page 36: SystemVM2.ppt


Xen 3.0 Architecture

Page 37: SystemVM2.ppt

Xen Paravirtualization

- Arch Xen/x86: replace privileged instructions with Xen hypercalls.
- Hypercalls: notifications are delivered to domains from Xen using an asynchronous event mechanism.
- Modify the OS to understand the virtualized environment:
  - Wall-clock time vs. virtual processor time (Xen provides both types of alarm timer)
  - Expose real resource availability
- Xen hypervisor: an additional protection domain between guest OSes and I/O devices.
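The hypercall-plus-asynchronous-event pattern can be sketched as follows (assumed names throughout; this is not Xen's real hypercall ABI):

```python
# Hypercall sketch: instead of executing a privileged instruction, the
# paravirtualized guest calls into the hypervisor, which validates the
# request, performs it, and later delivers results to the domain through
# an asynchronous event queue.

class Hypervisor:
    def __init__(self):
        self.event_queue = []        # asynchronous events back to domains
        self.table = {
            "set_timer": self.set_timer,   # one illustrative hypercall
        }

    def hypercall(self, domain, name, *args):
        """Dispatch a hypercall on behalf of a domain."""
        return self.table[name](domain, *args)

    def set_timer(self, domain, deadline):
        # Validate, arm the timer, and queue an event for later delivery.
        self.event_queue.append((domain, "timer", deadline))
        return 0                     # syscall-style success status

xen = Hypervisor()
status = xen.hypercall("dom1", "set_timer", 42)
print(status, xen.event_queue)
```

The synchronous call returns immediately with a status; the actual notification reaches the domain later through the event mechanism, mirroring the split described above.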

Page 38: SystemVM2.ppt

x86 Processor Virtualization

- Xen runs in ring 0 (most privileged); rings 1-2 are for the guest OS, ring 3 for user space.
- Xen lives in the top 64MB of the linear address space; segmentation is used to protect Xen, as switching page tables is too slow on standard x86.
- Hypercalls jump to Xen in ring 0; a guest OS may install a 'fast trap' handler.
- MMU virtualization: shadow vs. direct mode.

Page 39: SystemVM2.ppt

Paravirtualizing the MMU

- The guest OS allocates and manages its own page tables; hypercalls change the page table base.
- The Xen hypervisor is responsible for trapping accesses to the virtual page table, validating updates, and propagating changes.
- Xen must validate page table updates before use:
  - Updates may be queued and batch processed.
  - Validation rules are applied to each PTE: a guest may only map pages it owns.
- XenoLinux implements a balloon driver that adjusts a domain's memory usage by passing memory pages back and forth between Xen and XenoLinux.
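The batched validation step can be sketched in a few lines (toy model with assumed names, not Xen's real update interface):

```python
# PTE-validation sketch: a guest queues (virtual page -> machine page)
# updates to its own page table; the hypervisor batch-validates them,
# applying only mappings of pages the domain actually owns.

def apply_updates(page_table, queued_updates, owned_pages):
    """Batch-validate queued (vpage, mpage) updates against ownership."""
    applied, rejected = [], []
    for vpage, mpage in queued_updates:
        if mpage in owned_pages:      # rule: a guest may only map its pages
            page_table[vpage] = mpage
            applied.append((vpage, mpage))
        else:
            rejected.append((vpage, mpage))
    return applied, rejected

pt = {}
ok, bad = apply_updates(pt,
                        queued_updates=[(0, 100), (1, 999)],
                        owned_pages={100, 101})
print(ok, bad)
```

Queueing updates and validating them in a batch amortizes the cost of trapping into the hypervisor over many PTE writes.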

Page 40: SystemVM2.ppt


MMU virtualization

Page 41: SystemVM2.ppt


Writable Page Tables

Page 42: SystemVM2.ppt

I/O Architecture

- Asynchronous buffer descriptor rings using shared memory.
- Xen I/O spaces delegate to guest OSes protected access to specified hardware devices.
- The guest OS passes buffer information vertically through the system; Xen performs validation checks.
- Xen supports a lightweight event-delivery mechanism used for sending asynchronous notifications to a domain.
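A minimal sketch of such a descriptor ring (the layout and field names are assumptions, not Xen's actual ring format in io/ring.h):

```python
# Asynchronous I/O descriptor ring in "shared memory": the guest (front
# end) posts request descriptors at its producer index; the backend
# consumes them at its consumer index. Only descriptors travel through
# the ring; the data pages they point to are shared separately.

class DescriptorRing:
    def __init__(self, size):
        self.slots = [None] * size   # the shared ring itself
        self.req_prod = 0            # guest's producer index
        self.req_cons = 0            # backend's consumer index

    def post(self, descriptor):
        """Guest places a buffer descriptor into the ring."""
        if self.req_prod - self.req_cons == len(self.slots):
            raise BufferError("ring full")
        self.slots[self.req_prod % len(self.slots)] = descriptor
        self.req_prod += 1           # publish; an event would notify the peer

    def consume(self):
        """Backend picks up the next pending descriptor, if any."""
        if self.req_cons == self.req_prod:
            return None              # nothing pending
        d = self.slots[self.req_cons % len(self.slots)]
        self.req_cons += 1
        return d

ring = DescriptorRing(size=4)
ring.post({"op": "read", "page": 7})
ring.post({"op": "write", "page": 9})
print(ring.consume()["op"], ring.consume()["op"], ring.consume())
```

Because producer and consumer touch disjoint indices, both sides can run asynchronously; the event-delivery mechanism mentioned above only signals "work is pending" rather than carrying the data itself.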

Page 43: SystemVM2.ppt


Data Transfer : I/O Descriptor Rings

Page 44: SystemVM2.ppt


Device Channel Interface

Page 45: SystemVM2.ppt


Performance

Page 46: SystemVM2.ppt