8/8/2019 Virtual Machines Dinda0803
1/49
Virtuoso: Distributed Computing Using Virtual Machines
Peter A. Dinda
Prescience Lab
Department of Computer Science
Northwestern University
http://plab.cs.northwestern.edu
People and Acknowledgements
Students
  Ashish Gupta, Ananth Sundararaj, Dong Lu, Bin Lin, Jason Skicewicz, Billy Davidson, Andrew Weinrich, Jack Lange, Alex Shoykhet
Collaborators
  In-Vigo project at University of Florida
  Renato Figueiredo, Jose Fortes
  http://invigo.acis.ufl.edu
Funder
  NSF through several awards
Outline
Motivation
Virtuoso Model
Virtual networking and remote devices
Information services
Resource measurement and prediction
Resource control
Related work
Conclusions
R. Figueiredo, P. Dinda, J. Fortes, A Case For Grid Computing on Virtual Machines, ICDCS 2003
How do we deliver arbitrary amounts of
computational power to ordinary people?
  Distributed and Parallel Computing
  Interactive Applications
[Figure: Northwestern computing infrastructure. Sun Enterprise servers (E450, E250; 6 CPUs), a development cluster (5 PowerEdge, 10 CPUs), an IBM xSeries virtual cluster (64 CPUs, 1 TB RAID), an IBM xSeries development cluster (8 CPUs), an IBM zSeries mainframe (1-way, 3.36 TB storage), and a 1.2 TB RAID array, connected through GigE and 10/100 switches to the Internet. An Interactivity Environment cluster and CAVE (~90 CPUs, 8 TB RAID). Two Distributed Optical Testbed (DOT) clusters (IBM xSeries, 14-28 CPUs, 1 TB RAID each) attached through a Nortel Optera Metro Edge optical router to the DOT private optical network, which reaches DOT clusters at Argonne, U. Chicago, IIT, NCSA, and others.]
Grid Computing
"Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources"
  I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International J. Supercomputer Applications, 15(3), 2001
Globus, Condor/G, Avaki, EU DataGrid SW,
Complexity from User's Perspective
Process or job model
  Lots of complex state: connections, special shared libraries, licenses, file descriptors
Operating system specificity
  Perhaps even version-specific
  Symbolic supercomputer example
Need to buy into some Grid API
Install and learn complex Grid software
Users already know how to
deal with this complexity at
another level
Complexity from Resource Owner's Perspective
Install and learn complex Grid software
Deal with local accounts and privileges
  Associated with global accounts or certificates
Protection
Support users with different OS, library, license, etc., needs
Virtual Machines
Language-oriented VMs
  Abstract interpreted machine, JIT compiler, large library
  Examples: UCSD p-system, Java VM, .NET VM
Application-oriented VMs
  Redirect library calls to appropriate place
  Examples: Entropia VM
Virtual servers
  Kernel makes it appear that a group of processes are running on a separate instance of the kernel
  Examples: Ensim, Virtuozzo, SODA
Virtual machine monitors (VMMs)
  Raw machine is the abstraction
  VM represented by a single image
  Examples: IBM's VM, VMWare, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386
VMWare GSX VM
Isn't It Going to Be Too Slow?

Application                       Resource             ExecTime (10^3 s)  Overhead
SpecHPC Seismic (serial, medium)  Physical             16.4               N/A
                                  VM, local            16.6               1.2%
                                  VM, Grid virtual FS  16.8               2.0%
SpecHPC Climate (serial, medium)  Physical             9.31               N/A
                                  VM, local            9.68               4.0%
                                  VM, Grid virtual FS  9.70               4.2%

Experimental setup: physical: dual Pentium III 933MHz, 512MB memory, RedHat 7.1, 30GB disk; virtual: VMware Workstation 3.0a, 128MB memory, 2GB virtual disk, RedHat 2.0
NFS-based grid virtual file system between UFL (client) and NWU (server)
Small relative virtualization overhead; compute-intensive
Isn't It Going To Be Too Slow?
[Figure: paired bar charts (y-axis 0 to 3) comparing tasks on the physical machine with tasks on the virtual machine under no load, light load, and heavy load.]
Synthetic benchmark: exponential arrivals of compute-bound tasks, background load provided by playback of traces from PSC
Relative overheads < 10%
Isn't It Going To Be Too Slow?
Virtualized NICs have very similar bandwidth, slightly higher latencies
  J. Sugerman, G. Venkitachalam, B-H Lim, Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor, USENIX 2001
Disk-intensive workloads (kernel build, web service): 30% slowdown
  S. King, G. Dunlap, P. Chen, OS support for Virtual Machines, USENIX 2003
Virtuoso
Approach: Lower level of abstraction
Raw machines, not processes
Mechanism: Virtual machine monitors
Our Focus: Middleware support to hide complexity
Ordering, instantiation, migration of machines
Virtual networking and remote devices
Connectivity to remote files, machines
Information services
Monitoring and prediction
Resource control
The Virtuoso Model
1. User orders raw machine(s)
  Specifies hardware and performance
  Basic software installation available: OS, libraries, licenses, etc.
2. Virtuoso creates raw image and returns reference
  Image contains disk, memory, configuration, etc.
3. User powers up machine
4. Virtuoso chooses provider
  Information service
5. Virtuoso migrates image to provider
  Efficient network transfer: rsync, demand paging, versioned filesystems
6. Provider instantiates machine
  Virtual networking ties machine back to user's home network
  Remote device support makes the devices of the user's desktop available on remote VM
  Remote display support gives user the console of the machine (VNC)
  Resource control to give user expected performance
7. User goes to his network admin to get address, routing for his new machine
8. User customizes machine
  Feeds in CDs, floppies, ftp, up2date, etc.
9. User uses machine
  Shutdown, hibernate, power-off, throw away
10. Virtuoso continuously monitors and adapts
  Various mechanisms, all invisible to user
    Migrating the machine
    Routing traffic between machines
    Virtual network topology
    Predictive scheduling versus reservations
  Various goals
    Price
    Interactivity
  Information service
  Resource monitoring and prediction
Why Virtual Networking?
A machine is suddenly plugged into your network. What happens?
  Does it get an IP address?
  Is it a routeable address?
  Does the firewall let its traffic through?
  To any port?
How do we make environments hostile to virtual machines as friendly as the user's LAN?
A Layer2 Virtual Network (VLAN) for the User's Virtual Machines
Why Layer2?
  Protocol agnostic
  Mobility
  Simple to understand
  Ubiquity of Ethernet on end-systems
What about scaling?
  Number of VMs limited
  Hierarchical routing possible because MAC addresses can be assigned hierarchically
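A Simple Layer2 Virtual Network
[Figure: a client on the friendly local network and a server on the hostile remote network, each with a physical NIC. The VM monitor on the server gives the remote VM a virtual NIC. Client and server are connected by SSH.]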
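A Simple Layer2 Virtual Network
[Figure: a Bridged process on the client (friendly local network) and a Bridged process alongside the VM monitor on the server (hostile remote network) are joined by an SSH tunnel or an SSL/TCP connection, bridging the remote VM's virtual NIC to the client's physical NIC.]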
An Overlay Network
Bridged processes and the connections between them form an overlay network for routing traffic among the virtual machines and the user's home network
  Links can trivially be added or removed
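To make the overlay idea concrete, here is a minimal sketch of how one node on such an overlay could decide where to send an Ethernet frame. It assumes a conventional learning-bridge policy (learn source addresses, flood unknown destinations); the slides do not say that Bridged works this way, and the class name `OverlayBridge` and the link names are hypothetical.

```python
from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class OverlayBridge:
    """Hypothetical stand-in for one bridging process on the overlay."""
    name: str
    links: set = field(default_factory=set)        # names of attached links
    mac_table: dict = field(default_factory=dict)  # learned MAC -> link name

    def add_link(self, link):
        self.links.add(link)

    def remove_link(self, link):
        self.links.discard(link)
        # Forget every address that was learned over the removed link.
        self.mac_table = {m: l for m, l in self.mac_table.items() if l != link}

    def forward(self, src_mac, dst_mac, in_link):
        """Return the sorted list of links the frame should be sent on."""
        self.mac_table[src_mac] = in_link  # learn where the sender lives
        out = self.mac_table.get(dst_mac)
        if dst_mac != BROADCAST and out is not None:
            # Known destination: send on its link unless it arrived there.
            return [] if out == in_link else [out]
        # Unknown or broadcast destination: flood on all other links.
        return sorted(self.links - {in_link})
```

Because the forwarding state is only a table of learned addresses, adding or removing a link is a local operation, which is one way to read the claim that links can trivially be added or removed.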
Bootstrapping the Virtual Network
Star topology always possible
  TCP session from client must have been possible
Better topology may be possible
  Depends on security at each site
Topology may change
  Virtual machines can migrate
Bootstrap to higher layers
  Virtual filesystems
Remote Devices
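[Figure: the client's physical CD-ROM (/dev/cdrom) is exported by nbd-server. On the server, nbd-client attaches it as /dev/nb0 through the Linux network block device driver, over an SSH tunnel or an SSL/TCP connection. The VM monitor presents /dev/nb0 to the remote VM as a virtual CD-ROM (VMWare CD image).]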
Extending a Grid Information Service
(GIS) to Support Virtual Machines
A GIS contains information about the available resources in a grid
  Hosts, routers, switches, software, etc.
URGIS project at Northwestern
  GIS based on the relational data model
  Compositional queries (joins) to find collections of resources
    Find physical machines which can instantiate a virtual machine with 1 GB of memory
    Find sets of four different virtual machines on the same network with a total memory between 512 MB and 1 GB
  Nondeterministic query extension for scalability
  http://www.cs.northwestern.edu/~urgis
The RGIS Design (Per Site)
[Figure: per-site RGIS architecture.]
RDBMS: Oracle 9i back end (Windows, Linux, Parallel Server, etc.) and Oracle 9i front end
  Transactional inserts and updates using stored procedures, queries using select statements (uses the database's access control)
  Schema, type hierarchy, indices, PL/SQL stored procedures for each object
  Use of Oracle is not a requirement of the approach
Update Manager
Query Manager and Rewriter
Interfaces for users and applications: Web Interface, SOAP Interface, Authenticated Direct Interface
Content Delivery Network Interface
  For loose consistency, site-to-site (tentative)
Updates are encrypted on the network using asymmetric cryptography; only those with appropriate keys have access
Motivation for Non-deterministic Queries
Queries for compositions of resources are easily expressed in SQL:
  Find 2 hosts with Linux that together have 3 GB of RAM
    select h1.insertid, h2.insertid
    from hosts h1, hosts h2
    where h1.os='LINUX' and h2.os='LINUX' and
          h1.mem_mb+h2.mem_mb>=3072
But such queries can be very expensive to execute
However, we typically don't need the entire result set, just some rows, and not always the same ones
And we need them in a bounded amount of time
Approach: return a random sample of the result set
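The compositional query on this slide can be tried on a toy table. The sketch below uses Python's built-in sqlite3 module rather than Oracle, and the four sample hosts are invented for illustration; only the table name, column names, and query text come from the slide.

```python
import sqlite3

# In-memory database with a tiny, made-up hosts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table hosts (insertid integer primary key, os text, mem_mb integer)"
)
conn.executemany(
    "insert into hosts values (?, ?, ?)",
    [(1, "LINUX", 2048), (2, "LINUX", 1024), (3, "WINDOWS", 4096), (4, "LINUX", 512)],
)

# The slide's query: a self-join over hosts, so the work grows with the
# square of the number of hosts.
query = """
    select h1.insertid, h2.insertid
    from hosts h1, hosts h2
    where h1.os='LINUX' and h2.os='LINUX' and
          h1.mem_mb + h2.mem_mb >= 3072
"""
pairs = sorted(conn.execute(query).fetchall())
print(pairs)  # [(1, 1), (1, 2), (2, 1)]
```

As written, the join also pairs a host with itself and returns both orderings of each pair; adding `h1.insertid < h2.insertid` to the where clause would remove those rows. A query over n hosts joins the table with itself n times, which is why the full result set becomes expensive on large grids.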
Implementing non-deterministic queries
The user's query:
  select nondeterministically
    h1.insertid, h2.insertid
  from
    hosts h1, hosts h2
  where
    h1.os='LINUX' and h2.os='LINUX' and
    h1.mem_mb+h2.mem_mb>=3072
  within
    2 seconds
is rewritten to:
  SELECT
    H1.INSERTID, H2.INSERTID
  FROM
    HOSTS H1, HOSTS H2,
    INSERTIDS TEMP_H1, INSERTIDS TEMP_H2
  WHERE
    (H1.OS='LINUX' AND H2.OS='LINUX' AND
     H1.MEM_MB+H2.MEM_MB>=3072) AND
    (H1.INSERTID=TEMP_H1.INSERTID AND
     TEMP_H1.rand > 982663452.975047 AND
     TEMP_H1.rand [...] 1877769069.94039 AND
     TEMP_H2.rand [...]
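The rewriting shown on this slide restricts each input table to rows whose stored random number falls in a randomly chosen window. A minimal sketch of that idea follows, using sqlite3. It assumes random numbers in [0, 1) and a window of width `p` per table alias; the helper names `build_demo` and `sample_pairs` are hypothetical, and the way the real system chooses window bounds to meet the `within` deadline is not reproduced here.

```python
import random
import sqlite3


def build_demo(n_hosts, seed):
    """Create a made-up hosts table plus an insertids table of random tags."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table hosts (insertid integer primary key, os text, mem_mb integer)"
    )
    conn.execute("create table insertids (insertid integer primary key, rand real)")
    for i in range(n_hosts):
        mem = rng.choice([512, 1024, 2048, 4096])
        conn.execute("insert into hosts values (?, 'LINUX', ?)", (i, mem))
        conn.execute("insert into insertids values (?, ?)", (i, rng.random()))
    return conn, rng


def sample_pairs(conn, p, rng):
    """Run the two-host query over a random fraction p of each input."""
    # One independent random window of width p per table alias (assumption).
    lo1 = rng.uniform(0.0, 1.0 - p)
    lo2 = rng.uniform(0.0, 1.0 - p)
    sql = """
        select h1.insertid, h2.insertid
        from hosts h1, hosts h2, insertids t1, insertids t2
        where h1.os='LINUX' and h2.os='LINUX'
          and h1.mem_mb + h2.mem_mb >= 3072
          and h1.insertid = t1.insertid and t1.rand >= ? and t1.rand < ?
          and h2.insertid = t2.insertid and t2.rand >= ? and t2.rand < ?
    """
    return conn.execute(sql, (lo1, lo1 + p, lo2, lo2 + p)).fetchall()


conn, rng = build_demo(200, seed=1)
full = set(sample_pairs(conn, 1.0, rng))   # p = 1 keeps every row
part = set(sample_pairs(conn, 0.2, rng))   # roughly 20% of each input
print(len(part), "of", len(full), "result rows")
```

Each call picks new windows, so repeated queries return different subsets of the same full result, and a smaller `p` shrinks the join inputs and therefore the running time.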
Deadlines
[Figure: bar chart comparing four mechanisms (Climbing, Climbing + Hard Limiting, Estimation, Estimation + Hard Limiting) against the target deadline. X-axis: Mechanism; y-axis ticks run from 0 to 2.5.]
P. Dinda, D. Lu, Nondeterministic Queries in a Relational Grid Information Service, SC 2003
D. Lu, P. Dinda, Synthesizing Realistic Computational Grids, SC 2003
D. Lu, P. Dinda, J. Skicewicz, Scoped and Approximate Queries in a Relational Grid Information Service, Grid 2003
Extending a Grid Information Service
(GIS) to Support Virtual Machines
Virtual indirection
  Each RGIS object has a unique id
  Virtualization table associates the unique id of each virtual resource with the unique ids of its constituent physical resources
  Virtual nature of a resource is hidden unless the query explicitly requests it
Futures
  An RGIS object that does not exist yet
  Futures table of unique ids
  Future nature of a resource is hidden unless the query explicitly requests it
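The virtual indirection described above can be illustrated with two small tables. This is a sketch under assumed names: the `hosts` and `virtualization` tables, their columns, and the sample rows are invented here and are not the RGIS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table hosts (id integer primary key, name text, mem_mb integer);
    -- Hypothetical virtualization table: virtual id -> constituent physical id.
    create table virtualization (virtual_id integer, physical_id integer);
    insert into hosts values (1, 'phys-a', 4096), (2, 'phys-b', 2048), (3, 'vm-1', 1024);
    insert into virtualization values (3, 1);
""")

# Ordinary query: virtual and physical hosts look the same.
all_hosts = conn.execute("select name from hosts order by id").fetchall()

# Query that explicitly asks for the virtual nature of a resource by
# joining through the virtualization table.
vm_hosting = conn.execute("""
    select v.name, p.name
    from hosts v, virtualization x, hosts p
    where v.id = x.virtual_id and p.id = x.physical_id
""").fetchall()

print(all_hosts)   # [('phys-a',), ('phys-b',), ('vm-1',)]
print(vm_hosting)  # [('vm-1', 'phys-a')]
```

A futures table could be handled the same way: a separate table of unique ids that a query joins against only when it wants to distinguish resources that do not exist yet.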
Extending a Resource Monitoring and Prediction
System to Support Virtual Machines
Measuring and predicting dynamic resource availability to support adaptation
  Virtual machine migration
  Routing on the virtual network
  Application-level adaptation
RPS System at Northwestern
  Host and network measurements for Unix and Windows
  Emphasis on prediction (wide range of linear and nonlinear models) and communication (wide range of transports)
P. Dinda, Online Prediction of the Running Time of Tasks, Journal of Cluster Computing, 2002
P. Dinda, A Prediction-based Real-time Scheduling Advisor, IPDPS 2002
J. Skicewicz, P. Dinda, J. Schopf, Multiresolution Resource Behavior Queries using Wavelets, HPDC 2001
RPS Toolkit
Extensible toolkit for implementing resource signal prediction systems [CMU-CS-99-138]
Growing: RTA, RTSA, Wavelets, GUI, etc.
Easy buy-in for users
  C++ and sockets (no threads)
  Prebuilt prediction components
  Libraries (sensors, time series, communication)
http://www.cs.northwestern.edu/~RPS
Example: Multiscale Network Prediction
Large, recent study of predictability
  Hundreds of NLANR and other traces
  Mostly WANs
Different resolutions
  Binning and low-pass via wavelets
Sweet spot
  Predictability often maximized at a particular resolution
Y. Qiao, J. Skicewicz, P. Dinda, Multiscale Predictability of Network Traffic, NWU-CS-02-13
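The binning experiment can be sketched in a few lines. The code below averages a fine-grained trace into coarser bins and scores a trivial "predict the previous value" predictor at each bin size by the ratio of its mean squared error to the signal variance (lower means more predictable). Both the predictor and the metric are stand-ins chosen for brevity, not the models or measures used in the cited study, and the synthetic trace is made up.

```python
import random
import statistics


def bin_trace(samples, bin_size):
    """Average consecutive groups of bin_size samples (coarser resolution)."""
    usable = len(samples) - len(samples) % bin_size
    return [
        sum(samples[i:i + bin_size]) / bin_size
        for i in range(0, usable, bin_size)
    ]


def last_value_error_ratio(series):
    """MSE of the 'predict the previous value' predictor over signal variance."""
    errors = [(series[i] - series[i - 1]) ** 2 for i in range(1, len(series))]
    return statistics.fmean(errors) / statistics.pvariance(series)


def sweep(samples, bin_sizes):
    """Score predictability at each resolution in bin_sizes."""
    return {b: last_value_error_ratio(bin_trace(samples, b)) for b in bin_sizes}


# Synthetic trace: a slowly drifting level plus per-sample noise.
rng = random.Random(0)
level, trace = 0.0, []
for _ in range(4096):
    level += rng.gauss(0.0, 0.05)
    trace.append(level + rng.gauss(0.0, 1.0))

for size, ratio in sweep(trace, [1, 4, 16, 64]).items():
    print(size, round(ratio, 3))
```

Sweeping the bin size on a real trace and looking for the minimum of such a ratio is one simple way to search for the sweet spot the slide refers to.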
Multiresolution Network Prediction
[Figure: prediction results versus bin size in seconds (log scale, 0.1 to 1000) for several predictors, including LAST, AR(8), AR(32), ARMA(4,4), ARIMA(4,1,4), ARIMA(4,2,4), and ARFIMA(4,-1,4).]
Extending a Resource Prediction System to Support Virtual Machines
Goal: monitor the physical machine and infer behavior inside the virtual machine
Current approach: map /proc measurements on the physical machine to slowdown on resource rate in the virtual machine
  ARX models
  Causality problem
Resource Control
Owner has an interest in controlling how much and when compute time is given to a virtual machine
Our approach: a language for expressing these constraints, and compilation to real-time schedules, proportional share, etc.
Very early stages. Trying to avoid kernel modifications.
How to Control: User Irritation Project
Measure interactive user tolerance to resource stealing
Conversely, what service must be provided to interactive users?
Irritation@Home
Related Work
Collective / Capsule Computing (Stanford)
  VMM, migration/caching, hierarchical image files
Denali (U. Washington)
  Highly scalable VMMs (1000s of VMMs per node)
CoVirt (U. Michigan)
Xenoserver (Cambridge)
SODA (Purdue)
  Virtual server, fast deployment of services
Internet Suspend/Resume (Intel Labs Pittsburgh)
Ensim
  Virtual server, widely used for web site hosting
  WFQ-based resource control released into open-source Linux kernel
Virtuozzo (SWSoft)
  Ensim competitor
Available VMMs: IBM's VM, VMWare, Virtual PC/Server, Plex/86, SIMICS, Hypervisor, DesQView/TaskView, VM/386
Current Status (At Northwestern)
Bridged components done
  Mechanism for virtual networking
  No policy yet
Very preliminary system for acquiring and instantiating VMs done
RGIS schema extensions done
Work in progress
  Remote devices (management)
  Virtual networking (policy + adaptation)
  VM monitoring using RPS
  User irritation
For More Information
Prescience Lab (Northwestern University)
http://plab.cs.northwestern.edu
ACIS (University of Florida)
http://acis.ufl.edu
R. Figueiredo, P. Dinda, J. Fortes,
A Case For Grid Computing on
Virtual Machines, ICDCS 2003
Nondeterministic query performance
[Figure: query time and number of results (log scales) versus selection probability (0.0001, 0.001, 0.002).]
Select two hosts that together have >3GB of RAM
500,000 host grid generated by GridG
Memory distribution according to Smith study of MDS contents
Dual Xeon 1 GHz, 2 GB, 240 GB RAID, RGIS2, Oracle 9i Enterprise
Average of five trials
Meaningful tradeoff between query processing time and result set size is possible
Nondeterministic query performance
[Figure: query time (log scale) and number of results (0 to 2000) versus number of hosts (0 to 16), for selection probabilities p=0.0001, p=0.00001, p=0.000005, and p=0.00000001.]
Select n hosts that together have >3GB of RAM
500,000 host grid generated by GridG
Memory distribution according to Smith study of MDS contents
Dual Xeon 1 GHz, 2 GB, 240 GB RAID, RGIS2, Oracle 9i Enterprise
Average of five trials
Can use tradeoff to control query time independent of query complexity