System Virtual Machines - Overview, presented by Jongpil Lee
Page 1: ch8. System VM, VMWare

System Virtual Machines - Overview

Presented by Jongpil Lee

Page 2: ch8. System VM, VMWare

Contents
  Key Concepts
  Resource Virtualization – Processors
  Resource Virtualization – Memory
  Resource Virtualization – Input/Output

Page 3: ch8. System VM, VMWare

System Virtual Machines
  A system VM environment is capable of supporting multiple system images simultaneously, each running its own operating system and associated application programs
  Real resources of the host platform are shared among the guest systems, with the virtual machine monitor (VMM) in control
  The VMM manages the allocation of, and access to, the hardware resources of the host platform

[Figure: three guests (Linux, Windows, and Solaris, each with its own applications) run on separate virtual Intel IA-32 machines provided by the virtual machine monitor (VMM), which runs on the Intel IA-32 hardware.]

Page 4: ch8. System VM, VMWare

Key Concepts (1): Outward Appearance

[Figure: shared hardware (CPU, memory, disk, printer, terminal controller, and a network controller connected to the network) serves two users. Each user has a dedicated set of devices: display, keyboard, mouse, speaker, and CD drive.]

Page 5: ch8. System VM, VMWare

Key Concepts (2): State Management

[Figure: two ways of managing guest register state. In both, VMM memory holds the register values for VM1, VM2, and VM3.]

  Indirection: the VMM changes a pointer when a VM is activated
    Load the register block pointer to point to the VM’s registers in VMM memory
    Load the program counter to point to the VM program and start execution
    A guest register-to-register move then goes through the pointer:
      Load temp <- reg_pointer, index(A)
      Store reg_pointer, index(B) <- temp

  Copying: the VMM copies register values when a VM is activated
    Copy register state from VMM memory
    Load the program counter to point to the VM program and start execution
    A guest register-to-register move runs directly on the processor registers:
      Mov reg A -> reg B
    Copy register state from the processor back to system memory
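The two state-management approaches on this slide can be sketched in a few lines of Python. This is only an illustrative model, not the presenter's code: the dictionary `vmm_memory` and the function names are hypothetical, and real VMMs do this in hardware registers and machine code.

```python
# Hypothetical model: each VM's architected registers are kept in VMM memory.
vmm_memory = {
    "VM1": {"A": 1, "B": 0},
    "VM2": {"A": 2, "B": 0},
    "VM3": {"A": 3, "B": 0},
}

def run_with_indirection(vm):
    """Indirection: the processor only holds a pointer to the VM's register block."""
    reg_pointer = vmm_memory[vm]      # the VMM changes the pointer on activation
    # The guest's 'Mov reg A -> reg B' becomes a load and a store via the pointer.
    temp = reg_pointer["A"]
    reg_pointer["B"] = temp

def run_with_copying(vm):
    """Copying: register values are copied in, used directly, then copied back."""
    processor_regs = dict(vmm_memory[vm])        # copy state from VMM memory
    processor_regs["B"] = processor_regs["A"]    # 'Mov reg A -> reg B' runs directly
    vmm_memory[vm] = dict(processor_regs)        # copy state back to VMM memory

run_with_indirection("VM1")
run_with_copying("VM2")
```

Indirection makes each guest register access cost a memory access, while copying pays a one-time cost at every VM switch; which is cheaper depends on how often VMs are switched.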

Page 6: ch8. System VM, VMWare

Key Concepts (3): Resource Control
  The VMM maintains overall control of all the hardware resources
  Interval timer interrupt
    Instead of allowing the operating system in a virtual machine to field the timer interrupt, the VMM first handles the interrupt itself

[Figure: timeline of a switch from the first active VM to the next active VM]
  1. Timer interrupt occurs
  2. VMM saves the architected state of the running VM
  3. VMM determines the next VM to be activated
  4. VMM sets the timer interval and enables interrupts
  5. VMM restores the architected state for the next VM
  6. VMM sets the PC to the timer interrupt handler of the OS in the next VM
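The VM switch on a timer interrupt can be sketched as follows. This is a minimal sketch under stated assumptions: the class names, the round-robin policy, and the handler address `0x100` are all hypothetical, and a real VMM would also preserve the interrupted guest PC for the guest OS.

```python
from collections import deque

class VM:
    def __init__(self, name):
        self.name = name
        self.saved_state = {"pc": 0}     # architected state kept by the VMM
        self.timer_handler_pc = 0x100    # hypothetical guest handler address

class VMM:
    def __init__(self, vms, time_slice):
        self.ready = deque(vms)
        self.time_slice = time_slice
        self.current = self.ready.popleft()
        self.timer = time_slice
        self.interrupts_enabled = True

    def on_timer_interrupt(self, cpu_state):
        # Save the architected state of the running VM.
        self.current.saved_state = dict(cpu_state)
        self.ready.append(self.current)
        # Determine the next VM (round-robin is an assumption of this sketch).
        self.current = self.ready.popleft()
        # Set the timer interval and enable interrupts.
        self.timer = self.time_slice
        self.interrupts_enabled = True
        # Restore the architected state for the next VM.
        new_state = dict(self.current.saved_state)
        # Set the PC to the timer interrupt handler of the OS in the next VM.
        new_state["pc"] = self.current.timer_handler_pc
        return new_state

vmm = VMM([VM("VM1"), VM("VM2")], time_slice=10)
state = vmm.on_timer_interrupt({"pc": 42})
```

After the call, `VM2` is the active VM and execution resumes in its guest timer handler, so the guest OS still sees a timer interrupt even though the VMM fielded the real one first.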

Page 7: ch8. System VM, VMWare

Key Concepts (4): Native and Hosted Virtual Machines
  A native VM system
    The VMM operates in a privilege mode higher than the mode of the guest virtual machines
    The privilege level of the guest OS is emulated by the VMM
  A hosted VM system
    A virtual machine system is installed on a host platform that is already running an existing OS
    The VMM utilizes the functions already available on the host OS to control and manage the resources desired by each of the virtual machines

[Figure: four system organizations, with privileged and nonprivileged modes marked]
  Traditional uniprocessor system: Hardware, OS, Application
  Native VM system: Hardware, VMM, Guest OS, Guest Apps
  User-mode hosted VM system: Hardware, Host OS, VMM, Guest OS, Guest Apps
  Dual-mode hosted VM system: Hardware, Host OS, VMM, Guest OS, Guest Apps

Page 8: ch8. System VM, VMWare

Key Concepts (5): IBM VM/370
  The virtual machine monitor of VM/370
    The control program (CP)
  A single-user operating system
    The conversational monitor system (CMS)

Page 9: ch8. System VM, VMWare

Resource Virtualization – Processor
  The key aspect of virtualizing a processor
    The execution of the guest instructions, including both system-level and user-level instructions
  Processor virtualization methods
    Emulation
      Interpretation, binary translation (described in Chapter 2)
    Direct native execution
      Only if the ISA of the host is identical to the ISA of the guest
  Trap
    For a virtualizable ISA, a trap occurs naturally when an instruction needs to be emulated
    The trap handler jumps to an appropriate interpreter routine, interprets the single instruction, and returns control back to the original program

Page 10: ch8. System VM, VMWare

Resource Virtualization – Processor
Conditions for ISA Virtualizability (1)
  We restrict the discussion here to native system VMs
  In a native system VM, the VMM runs in system mode, and all other software runs in user mode
  The VMM keeps track of the intended mode of operation of a guest virtual machine
  But the VMM sets the actual native hardware mode to user mode whenever executing instructions from the guest virtual machine

Page 11: ch8. System VM, VMWare

Resource Virtualization – Processor
Conditions for ISA Virtualizability (2)
  The machine being virtualized is modeled as a 4-tuple
    S = < E, M, P, R >
      E : the executable storage
      M : the mode of operation
      P : the program counter
      R : the memory relocation bounds register
  A memory trap occurs if the address accessed by a program falls outside the bounds indicated by R
  A privileged instruction is defined as one that traps if the machine is in user mode and does not trap if the machine is in system mode
    Load PSW (LPSW, IBM System/370)
      Loads the processor status word (PSW) from a location in memory if the processor is in system mode. If it is not in system mode, the machine traps
    Set CPU Timer (SPT, IBM System/370)
      Replaces the CPU interval timer with the contents of a location in memory if the CPU is in system mode and traps if it is not

Page 12: ch8. System VM, VMWare

Resource Virtualization – Processor
Conditions for ISA Virtualizability (3)
  To specify instructions that interact with hardware, two categories of special instructions are defined
  Control-sensitive instructions
    Attempt to change the configuration of resources in the system
    Ex) Load PSW, Set CPU Timer
  Behavior-sensitive instructions
    Behavior or results produced depend on the configuration of resources
    Ex) Load Real Address (LRA)
      Takes a virtual address, translates it, and saves the corresponding real address in a specified general-purpose register
      The behavior of this instruction depends on the state (mapping) of the real memory resource
    Ex) Pop Stack into Flags Register (POPF)
      Pops the flags register from a stack held in memory
      In user mode, this instruction overwrites all flags except the interrupt-enable flag
      For the interrupt-enable flag, the instruction acts as a no-op when executed in user mode
  Innocuous instructions
    Instructions that are neither control-sensitive nor behavior-sensitive

Page 13: ch8. System VM, VMWare

Resource Virtualization – Processor
Conditions for ISA Virtualizability (4)

[Figure: components of a virtual machine monitor. When an instruction trap occurs, the dispatcher takes control and hands each privileged instruction to the allocator or to an interpreter routine.]
  Components of a Virtual Machine Monitor
    1. Dispatcher
    2. Allocator
    3. Interpreter routines
  Allocator: handles instructions that desire to change machine resources, e.g., load relocation bounds register
  Interpreter routines: handle instructions that do not change machine resources but access privileged resources, e.g., IN, OUT, Write TLB

Page 14: ch8. System VM, VMWare

Resource Virtualization – Processor
Conditions for ISA Virtualizability (5)
  The theorem regarding (efficient) VMM construction
    Theorem 1
      A virtual machine monitor may be constructed if the set of sensitive instructions is a subset of the set of privileged instructions
    An efficient virtual machine implementation can be constructed if instructions that could interfere with the functioning of the VMM always trap in user mode
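The condition of Theorem 1 is a plain subset test, which a short Python sketch makes concrete. The two instruction sets below are toy examples assembled from instruction names mentioned on these slides; they are not complete or authoritative classifications of any real ISA.

```python
def is_virtualizable(sensitive, privileged):
    """Theorem 1 condition: every sensitive instruction must also be privileged."""
    return set(sensitive) <= set(privileged)

def critical_instructions(sensitive, privileged):
    """Sensitive instructions that do not trap in user mode."""
    return set(sensitive) - set(privileged)

# Toy sets for illustration only.
isa_a = {"sensitive": {"LPSW", "SPT", "LRA"}, "privileged": {"LPSW", "SPT", "LRA"}}
isa_b = {"sensitive": {"POPF", "IN", "OUT"}, "privileged": {"IN", "OUT"}}

ok_a = is_virtualizable(isa_a["sensitive"], isa_a["privileged"])
ok_b = is_virtualizable(isa_b["sensitive"], isa_b["privileged"])
problem_b = critical_instructions(isa_b["sensitive"], isa_b["privileged"])
```

In the second toy set, `POPF` is sensitive but not privileged, so the subset test fails and `POPF` is reported as the instruction that needs special handling.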

Page 15: ch8. System VM, VMWare

Resource Virtualization – Processor
Conditions for ISA Virtualizability (6)
  The VMM interprets a sensitive instruction according to the prevailing status of the virtual system resources and the state of the virtual machine

[Figure: guest OS code in the VM (user mode) executes a privileged instruction (LPSW), which traps to the dispatcher in the VMM code (privileged mode). After emulation, control returns to the next guest instruction (the target of LPSW).]
  LPSW routine:
    1. Change mode to privileged
    2. Check privilege level in VM
    3. Emulate instruction
    4. Compute target
    5. Restore mode to user
    6. Jump to target
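The LPSW routine can be sketched as a dispatcher that looks up an interpreter routine and emulates the instruction against the guest's virtual state. This is a simplified model under stated assumptions: the `GuestVM` class, the PSW layout (`mode`, `pc`), and the string returned when the guest is in virtual user mode are all hypothetical.

```python
class GuestVM:
    def __init__(self):
        self.virtual_mode = "system"            # mode the guest believes it is in
        self.psw = {"mode": "system", "pc": 0}  # virtual PSW kept by the VMM
        self.memory = {}                        # guest memory: address -> stored PSW

def emulate_lpsw(vm, address):
    """Emulate Load PSW for the guest and return the address to jump to."""
    # Check the privilege level in the VM, not the real hardware mode.
    if vm.virtual_mode != "system":
        return "reflect_trap_to_guest"   # guest user code: the guest OS must handle it
    # Emulate the instruction against the virtual PSW.
    new_psw = vm.memory[address]
    vm.psw = dict(new_psw)
    vm.virtual_mode = new_psw["mode"]
    # Compute the target from the loaded PSW.
    return new_psw["pc"]

INTERPRETER_ROUTINES = {"LPSW": emulate_lpsw}

def dispatcher(vm, opcode, operand):
    # The hardware trap has already put the real machine in privileged mode.
    target = INTERPRETER_ROUTINES[opcode](vm, operand)
    # A real VMM would restore user mode here and jump to the target.
    return target

vm = GuestVM()
vm.memory[0x200] = {"mode": "user", "pc": 0x400}
first = dispatcher(vm, "LPSW", 0x200)
second = dispatcher(vm, "LPSW", 0x200)
```

The first call succeeds because the guest is in virtual system mode and switches it to virtual user mode; the second call is therefore reflected back to the guest as a privilege violation.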

Page 16: ch8. System VM, VMWare

Resource Virtualization – Processor
Conditions for ISA Virtualizability (7)
  Interpreting the SPT instruction
    The VMM examines the contents of the location to be loaded into the CPU timer
    If (t < T), t is loaded; else T is loaded
      t : the content of the location
      T : the time remaining from the allocated time for the virtual machine itself
    Meanwhile, it keeps the time difference (t - T) in an internal table so that this time can be restored when the guest VM is again activated
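The SPT rule reduces to loading the smaller of the two values and remembering the remainder. A minimal sketch follows; the function name is hypothetical, and recording a remainder of 0 when t < T is an assumption of this sketch, since the slide only describes keeping (t - T).

```python
def emulate_spt(t, T):
    """Return (value loaded into the real CPU timer, guest time deferred by the VMM).

    t: value the guest asks to load into the CPU timer.
    T: time remaining in the slice the VMM allocated to this VM.
    """
    if t < T:
        # The guest timer expires within the VM's own slice: load it as requested.
        return t, 0
    # Otherwise the VMM's slice ends first; keep (t - T) to restore later.
    return T, t - T

short_request = emulate_spt(5, 20)
long_request = emulate_spt(50, 20)
```

With T = 20, a request of 5 is loaded unchanged, while a request of 50 loads 20 and leaves 30 to be restored when the guest VM is next activated.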

Page 17: ch8. System VM, VMWare

Resource Virtualization – Processor
Recursive Virtualization
  The concept of running the virtual machine system on a copy of itself
  Two effects usually restrict the ability to create an efficient recursively virtualizable system
  Theorem 2
    A conventional third-generation computer is recursively virtualizable if (a) it is virtualizable and (b) a VMM without any timing dependences can be constructed for it

[Figure: recursive virtualization. The VMM runs on the hardware in privileged mode. In nonprivileged mode it hosts virtual machines, one of which runs a 2nd-level VMM that hosts its own virtual machines.]

Page 18: ch8. System VM, VMWare

Resource Virtualization – Processor
Handling Problem Instructions
  The POPF instruction is sensitive but not privileged
    Critical instruction (sensitive but not privileged)
    It does not generate a trap in user mode
    It violates the virtualizability condition of Theorem 1
  An additional set of steps must be taken in order to implement a system virtual machine (with possible loss of some efficiency)
    It is possible for a VMM to intercept POPF and other critical instructions if all guest software is interpreted instruction by instruction
    Techniques related to those described in Chapters 2 and 3 can be used to reduce the inefficiency

Page 19: ch8. System VM, VMWare

Resource Virtualization – Processor
Handling Problem Instructions

[Figure: a scanner and patcher in the VMM replaces each discovered critical instruction with a code patch that transfers control (e.g., by a trap) to the VMM.]

Page 20: ch8. System VM, VMWare

Resource Virtualization – Processor
Patching of Critical Instructions
  One way to discover critical instructions
    The VMM takes control at the head of each guest basic block and scans instructions in sequence until the end of the basic block is reached
    If a critical instruction is found, it is replaced with a trap to the VMM
    Another trap back to the VMM is placed at the end of the basic block
  To reduce overhead, the trap at the end of a scanned basic block can be replaced by the original branch or jump instruction
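The scan-and-patch procedure can be sketched over a toy instruction list. Everything here is illustrative: instructions are plain strings, the `CRITICAL` and `BRANCHES` sets are hypothetical, and the patches are tuples that remember the original instruction so the end-of-block trap can later be swapped back for the original branch.

```python
CRITICAL = {"POPF"}              # toy set of critical instructions
BRANCHES = {"JMP", "JZ", "RET"}  # toy set of instructions that end a basic block

def scan_and_patch(code, start):
    """Scan one basic block from `start`, patch it in place, return the block-end index."""
    i = start
    while i < len(code) and code[i] not in BRANCHES:
        if code[i] in CRITICAL:
            # Replace the critical instruction with a trap to the VMM.
            code[i] = ("TRAP_EMULATE", code[i])
        i += 1
    if i < len(code):
        # Place another trap at the end of the block so the VMM regains control.
        code[i] = ("TRAP_BLOCK_END", code[i])
    return i

def unpatch_block_end(code, end):
    """Overhead reduction: restore the original branch once the block is scanned."""
    if isinstance(code[end], tuple) and code[end][0] == "TRAP_BLOCK_END":
        code[end] = code[end][1]

program = ["MOV", "POPF", "ADD", "JMP", "MOV"]
end = scan_and_patch(program, 0)
patched = list(program)
unpatch_block_end(program, end)
```

After scanning, the `POPF` stays patched while the `JMP` is restored, so only the critical instruction still costs a trap on later executions of the block.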

Page 21: ch8. System VM, VMWare

Resource Virtualization – Processor
Caching Emulation Code
  The overhead of VMM interpretation can become a problem when the frequency of sensitive instructions requiring interpretation is high

[Figure: a patched program (Blocks 1 to 3) transfers control, e.g., by a trap, to the VMM, which uses a translation table to locate specialized emulation routines in a code cache. One code section is emulated in the code cache, and two critical instructions are combined into a single block.]

Page 22: ch8. System VM, VMWare

Resource Virtualization – Memory
Virtual Memory Support in a System Virtual Machine Environment (1)
  Each of the guest VMs has its own set of virtual memory tables
  Address translation in each of the guest VMs transforms addresses in its virtual address space to locations in real memory
    Real memory : a guest VM’s illusion of physical memory
    Physical memory : the hardware memory
  A guest’s real memory address must undergo a further mapping to determine the address in physical memory of the host hardware
  The VMM maintains a real map table mapping the real pages to physical pages

Page 23: ch8. System VM, VMWare

Resource Virtualization – Memory
Virtual Memory Support in a System Virtual Machine Environment (2)

[Figure: the virtual memory of Programs 1 and 2 on VM1 and of Program 3 on VM2 maps to the real memory of each VM, which in turn maps to the physical memory of the system. Some real pages are not mapped to physical memory.]

Page Table for Program 1 (VM1)
  Virtual page   Real page
  1000           5000
  2000           1500

Page Table for Program 2 (VM1)
  Virtual page   Real page
  1000           Not mapped
  4000           3000

Page Table for Program 3 (VM2)
  Virtual page   Real page
  1000           500
  4000           3000

Real Map Table for VM1
  VM1 real page   Physical page
  1500            500
  3000            Not mapped
  5000            1000

Real Map Table for VM2
  VM2 real page   Physical page
  500             3000
  3000            Not mapped

Page 24: ch8. System VM, VMWare

Resource Virtualization – Memory
Virtualizing Architected Page Tables (1)
  The virtual-to-physical mapping is kept by the VMM in shadow page tables, one for each of the guest VMs
  These tables are the ones actually used by hardware to translate virtual addresses and to keep the TLB up-to-date
  To make this method work, the page table pointer register is virtualized

[Figure: the page table pointer selects the shadow page table of the currently active program, here Program 1 on VM1.]

Shadow Page Table for Program 1 on VM1
  Virtual page   Physical page
  1000           1000
  2000           500
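A shadow page table is the composition of a guest page table (virtual to real) with the VMM's real map table (real to physical). The sketch below uses the Program 1 and VM1 values from the slides; the function name and the use of `None` for "Not mapped" are assumptions of this sketch.

```python
NOT_MAPPED = None

# Values taken from the slides: guest page tables on VM1 and VM1's real map table.
page_table_prog1 = {1000: 5000, 2000: 1500}          # virtual page -> real page
page_table_prog2 = {1000: NOT_MAPPED, 4000: 3000}
real_map_vm1 = {1500: 500, 3000: NOT_MAPPED, 5000: 1000}  # real page -> physical page

def build_shadow(page_table, real_map):
    """Compose virtual->real with real->physical, keeping only fully mapped pages."""
    shadow = {}
    for vpage, rpage in page_table.items():
        if rpage is NOT_MAPPED:
            continue                      # the guest itself has no mapping
        ppage = real_map.get(rpage, NOT_MAPPED)
        if ppage is NOT_MAPPED:
            continue                      # the VMM has not backed this real page
        shadow[vpage] = ppage
    return shadow

shadow_prog1 = build_shadow(page_table_prog1, real_map_vm1)
shadow_prog2 = build_shadow(page_table_prog2, real_map_vm1)
```

For Program 1 this reproduces the shadow table on the slide (1000 to 1000, 2000 to 500). For Program 2 the result is empty, because its only guest-mapped real page (3000) is not currently backed by physical memory.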

Page 25: ch8. System VM, VMWare

Resource Virtualization – Memory
Virtualizing Architected Page Tables (2)
  Page fault handling
    If the page is mapped in the virtual table of the guest OS
      The VMM has moved the accessed real page to its own swap space
      The VMM brings the real page back into physical memory
      The VMM updates the real map table and the affected shadow table(s)
    If the page is not mapped in the guest
      The VMM transfers control to the trap handler of the guest, indicating a page fault
      The guest OS then issues instructions to modify its page table
      The VMM intercepts these requests
      The VMM updates the page table and also updates the mapping in the appropriate shadow page table
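The two page-fault cases can be sketched as a single decision function. This is a rough model under stated assumptions: the function and return strings are hypothetical, swap I/O is omitted, and a free physical page is assumed to be available (a real VMM would evict one if not).

```python
def handle_page_fault(vpage, guest_page_table, real_map, shadow, free_physical_pages):
    """Decide who handles a fault on `vpage` and update the VMM tables if needed."""
    rpage = guest_page_table.get(vpage)
    if rpage is None:
        # Not mapped in the guest: hand the fault to the guest OS trap handler.
        return "reflect_to_guest"
    # Mapped in the guest, so the VMM itself must have paged the real page out.
    ppage = free_physical_pages.pop()   # bring the real page back into physical memory
    real_map[rpage] = ppage             # update the real map table
    shadow[vpage] = ppage               # update the affected shadow table
    return "resume_guest"

guest_pt = {4000: 3000}      # Program 2 on VM1: virtual 4000 -> real 3000
real_map = {3000: None}      # real page 3000 is currently not in physical memory
shadow = {}
free_pages = [2000]          # hypothetical free physical page

mapped_case = handle_page_fault(4000, guest_pt, real_map, shadow, free_pages)
unmapped_case = handle_page_fault(1000, guest_pt, real_map, shadow, free_pages)
```

The first fault is resolved entirely inside the VMM and is invisible to the guest; the second is forwarded, and the guest's later page-table updates would be intercepted by the VMM to keep the shadow table in step.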

Page 26: ch8. System VM, VMWare

Resource Virtualization – Memory
Virtualizing an Architected TLB
  To virtualize the TLB, the VMM maintains a copy of each guest’s TLB contents and also manages the real TLB
  The real TLB management
    The VMM rewrites the TLB whenever a guest VM is activated
      The VMM translates the real addresses in the virtual TLB to physical addresses in the physical TLB
      The VMM copies the VM’s virtual TLB entries into the physical TLB
      A fairly high overhead
    The VMM leverages the address space identifiers (ASIDs)

[Figure: virtual TLBs, the ASID map table, and the real TLB]

Virtual TLB of VM1 (ASID mapping: Prog. 1 – ASID 3, Prog. 2 – ASID 7)
  Virtual page   Real page   ASID
  1000           5000        3
  2000           1500        3
  4000           3000        7

Virtual TLB of VM2 (ASID mapping: Prog. 1 – ASID 3)
  Virtual page   Real page   ASID
  1000           3000        3

ASID Map Table
  Virtual ASID   Real ASID
  VM1:3          9
  VM1:7          ---
  VM2:3          4

Real TLB
  Virtual page   Physical page   ASID
  1000           1000            9
  2000           500             9
  1000           3000            4
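The ASID approach can be sketched as a translation from a guest's virtual TLB to real TLB entries. The values are taken from the VM1 tables on the slides, but pairing each virtual TLB row with its ASID is inferred from the page tables and the ASID mapping (Prog. 1 is ASID 3, Prog. 2 is ASID 7), so treat the row alignment and the function name as assumptions.

```python
# (virtual page, real page, virtual ASID) for VM1; row alignment is inferred.
virtual_tlb_vm1 = [(1000, 5000, 3), (2000, 1500, 3), (4000, 3000, 7)]
real_map_vm1 = {1500: 500, 3000: None, 5000: 1000}   # real page -> physical page
asid_map = {("VM1", 3): 9, ("VM2", 3): 4}            # VM1:7 has no real ASID assigned

def to_real_tlb(vm, virtual_tlb, real_map, asid_map):
    """Translate virtual TLB entries into (virtual page, physical page, real ASID)."""
    entries = []
    for vpage, rpage, vasid in virtual_tlb:
        real_asid = asid_map.get((vm, vasid))
        ppage = real_map.get(rpage)
        if real_asid is None or ppage is None:
            continue   # no real ASID assigned, or real page not in physical memory
        entries.append((vpage, ppage, real_asid))
    return entries

real_entries_vm1 = to_real_tlb("VM1", virtual_tlb_vm1, real_map_vm1, asid_map)
```

Because each guest address space gets its own real ASID, entries from different VMs can coexist in the real TLB, and the VMM does not have to rewrite the whole TLB on every VM switch.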

Page 27: ch8. System VM, VMWare

Resource Virtualization – Input/Output
Virtualizing Devices
  Dedicated Devices
    Some I/O devices are dedicated to a particular guest VM, or at least are switched from one guest to another on a very long time scale
    The device itself does not necessarily have to be virtualized
    Requests to and from the device could theoretically bypass the VMM and go directly to the guest operating system
  Partitioned Devices
    A very large disk, for example, can be partitioned into several smaller virtual disks that are then made available to the virtual machines as dedicated devices

Page 28: ch8. System VM, VMWare

Resource Virtualization – Input/Output
Virtualizing Devices
  Shared Devices
    Some devices, such as a network adapter, can be shared among a number of guest VMs at a fine time granularity
    Each guest may have its own virtual state related to usage of the device, e.g., a virtual network address. This state information is maintained by the VMM for each guest VM
  Nonexistent Physical Devices
    Virtual devices “attached” to a virtual machine for which there is no corresponding physical device
    For example, a network adapter that is used for communicating with other virtual machines on the same platform

Page 29: ch8. System VM, VMWare

Resource Virtualization – Input/Output
Virtualizing Devices
  Spooled Devices
    Virtualization of a spooled device can be performed by using a two-level spool table approach

Virtual Machine 1 Spool Table
  Program   Status      Location   Real loc   Size
  A         Printed     1000       11000      400
  B         Completed   2000       12000      200
  C         Running     3000       13000      200
  D         Completed   4000       14000      500

Virtual Machine 2 Spool Table
  Program   Status      Location   Real loc   Size
  P         Running     1000       21000      400
  Q         Completed   2000       22000      800

VMM Spool Table
  VM   Program   Status     Real loc   Size
  1    A         Printed    30000      400
  2    Q         Printing   31000      800
  1    B         Waiting    31800      200
  1    D         Waiting    30400      500

Page 30: ch8. System VM, VMWare

Resource Virtualization – Input/Output
Virtualizing I/O Activity

[Figure: layers of an I/O request. The application issues system calls to the operating system, the operating system makes driver calls to the I/O drivers, and the drivers perform physical memory and I/O operations on the hardware. The VMM can intercept at each interface.]

  An application program makes a device-independent I/O request
  The operating system converts the device-independent request into calls to device driver routines
  A device driver takes care of device-specific aspects of performing an I/O transaction
  The VMM can intercept a guest’s I/O action and convert it from a virtual device action to a real device action at any of the three interfaces
    The system call interface
    The device driver interface
    The operation-level interface

Page 31: ch8. System VM, VMWare

Resource Virtualization – Input/Output
Virtualizing I/O Activity
  Virtualizing at the I/O Operation Level
    The privileged nature of the I/O operations makes them easy for the VMM to intercept because they trap in user mode
  Virtualizing at the Device Driver Level
    If the VMM can intercept the call to the virtual device driver, it can convert the virtual device information to the corresponding physical device and redirect the call to a driver program for the physical device
    It requires that the VMM developer have some knowledge of the guest operating system and its internal device driver interfaces
  Virtualizing at the System Call Level
    The virtualization process could be made more efficient by intercepting the initial I/O request at the OS interface, the ABI
    The entire I/O action could be done by the VMM

Page 32: ch8. System VM, VMWare

Resource Virtualization – Input/Output
Input/Output Virtualization and Hosted Virtual Machines
  An I/O request from a guest virtual machine is converted by the native-mode portion of the VMM into a user application request made to the host
  An advantage of a hosted virtual machine
    It is not necessary to provide device drivers in the VMM; the actual device drivers do not have to be incorporated as part of the VMM
  Components that form a dual-mode hosted virtual machine system
    VMM-n (native)
      Intercepts traps due to privileged instructions or patched critical instructions encountered in a virtual machine
    VMM-u (user)
      Makes resource requests to the host OS
    VMM-d (driver)
      Provides a means for communication between the other two components