Advanced Computing and Information Systems laboratory Virtual Appliances for Training and Education in FutureGrid Renato Figueiredo Arjun Prakash, David Wolinsky ACIS Lab - University of Florida

Jan 01, 2016


Steven Hancock

Advanced Computing and Information Systems laboratory

Virtual Appliances for Training and Education in FutureGrid

Renato Figueiredo, Arjun Prakash, David Wolinsky
ACIS Lab - University of Florida


Education and Training

Importance of experimental work in systems research
• Needs also to be addressed in education
• Complement to fundamental theory

FutureGrid: a testbed for experimentation and collaboration
• Education and training contributions:
  • Lower barrier to entry – pre-configured environments, zero-configuration technologies
  • Community/repository of hands-on executable environments: develop once, share and reuse


Goals and Approach

A flexible, extensible platform for hands-on, lab-oriented education on FutureGrid

Focus on usability – lower entry barrier
• Plug and play, open-source
• Seamlessly work on local, cloud resources

Virtualization + social networking to create educational sandboxes
• Virtual “Grid” appliances: self-contained, pre-packaged execution environments
• Group VPNs: simple management of virtual clusters by students and educators


Outline

• Virtual appliances
• GroupVPN
• Virtual cluster configuration
• Example: MPI appliance


What is an appliance?

Physical appliances
• Webster – “an instrument or device designed for a particular use or function”


What is an appliance?

Hardware/software appliances
• TV receiver + computer + hard disk + Linux + user interface
• Computer + network interfaces + FreeBSD + user interface


What is a virtual appliance?

An appliance that packages software and configuration needed for a particular purpose into a virtual machine “image”

The virtual appliance has no hardware – just software and configuration

The image is a (big) file

It can be instantiated on hardware


Virtual appliance example

[Figure: a LAMP image (Linux + Apache + MySQL + PHP) is copied and instantiated on a virtualization layer to create a web server; repeating the process creates another web server.]


Training/education in clusters

Replace LAMP with the middleware of your choice – e.g. MPI, Hadoop, Condor

[Figure: an MPI image is copied and instantiated on a virtualization layer to create an MPI worker; repeating the process creates another MPI worker.]


What about the network?

Multiple Web servers might be completely independent from each other

MPI nodes are not
• Need to communicate and coordinate with each other
• Each worker needs an IP address, uses TCP/IP sockets

Cluster middleware stacks assume a collection of machines, typically on a LAN (Local Area Network)
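
The point about workers needing IP addresses and TCP/IP sockets can be illustrated with a minimal Python sketch. It is not taken from any MPI implementation: the function names are hypothetical, and the loopback address 127.0.0.1 stands in for the addresses that cluster nodes would have on a LAN or a virtual network.

```python
import socket
import threading


def run_worker(server_sock, received):
    """Accept one connection, record the peer's message, and acknowledge it."""
    conn, _addr = server_sock.accept()
    with conn:
        received.append(conn.recv(1024).decode("utf-8"))
        conn.sendall(b"ack")


def exchange(message):
    """Start one listening worker, then connect to it as a second worker."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption: loopback stands in for a cluster address; port 0 lets the OS pick a free port.
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    host, port = server_sock.getsockname()

    received = []
    listener = threading.Thread(target=run_worker, args=(server_sock, received))
    listener.start()

    # The second worker only needs the first worker's IP address and port.
    with socket.create_connection((host, port)) as client:
        client.sendall(message.encode("utf-8"))
        reply = client.recv(1024).decode("utf-8")

    listener.join()
    server_sock.close()
    return received[0], reply
```

Calling exchange("hello from worker 1") returns the message as seen by the listening worker together with its acknowledgement. Nothing in the code depends on whether the address is physical or virtual, which is why unmodified middleware can run once every node has a reachable IP address.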


Enter virtual networks

[Figure: two kinds of clusters compared.]

NOWs, COWs (physical machines on a switched network, set up from an installation image)
• Local-area
• Physical machines
• Self-organizing switching (e.g. Ethernet spanning tree)

“WOWs” (virtual machines, set up from a VM image)
• Wide-area
• Virtual machines (VMs)
• Self-organizing overlay: IP tunnels, P2P routing


Virtual cluster appliances

Virtual appliance + virtual network

[Figure: an image combining MPI and the virtual network is copied and instantiated as virtual machines; each instance becomes an MPI worker, and the workers are connected by the virtual network.]


Virtual network architecture

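[Figure: two hosts, each running an application on a virtual network interface (VNIC) attached to a virtual router; the routers are connected through a (wide-area) overlay network. The hosts have virtual addresses 10.10.1.1 and 10.10.1.2.]

• Isolated, private virtual address space
• Unmodified applications, e.g. Connect(10.10.1.2,80)
• Virtual routers: capture/tunnel; scalable, resilient, self-configuring routing and object store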


Virtual appliance clusters

Virtual appliances
• Encapsulate software environment in image
• Virtual disk file(s) and virtual hardware configuration

The Grid appliance
• Encapsulates cluster software environments
• Current examples: Condor, MPI, Hadoop
• Homogeneous images at each node
• Virtual LAN connecting nodes forms a cluster
• Deploy within or across domains


Grid appliance internals

Host O/S
• Linux

Grid/cloud stack
• MPI, Hadoop, Condor, …

Glue logic for zero-configuration
• Automatic DHCP address assignment
• Multicast DNS (Bonjour, Avahi) resource discovery
• Shared data store - Distributed Hash Table
• Interaction with VM/cloud


Example – FutureGrid

[Figure: an appliance image deployed on FutureGrid through Nimbus and Eucalyptus for education and training.]


Example - Archer

[Figure: joining the Archer Global Virtual Network in three steps.]

1. Download appliance
   Free pre-packaged Archer virtual appliances - run on free VMMs (VMware, VirtualBox, KVM)
2. Create/join VPN group; download config
3. Boot appliances
   Automatic connection to GroupVPN – self-configuring DHCP

Archer Global Virtual Network
• Archer seed resources: 450 cores, 5 sites
• Middleware: Condor scheduler, NFS file systems
• CMS, Wiki, YouTube: community-contributed content (applications, datasets, tutorials)


One appliance, multiple hosts

Allow same logical cluster environment to instantiate on a variety of platforms
• Local desktop, clusters; FutureGrid; Amazon EC2; Science Clouds…

Avoid dependence on host environment
• Make minimum assumptions about VM and provisioning software
  • Desktop: 1 image, VMware, VirtualBox, KVM
  • Para-virtualized VMs (e.g. Xen) and cloud stacks – need to deal with idiosyncrasies
• Minimum assumptions about networking
  • Private, NATed Ethernet virtual network interface


Virtual network: GroupVPN

Key techniques:
• IP-over-P2P (IPOP) tunneling
• GroupVPN Web 2.0/social network interface

Self-configuring
• Avoid administrative overhead of typical VPNs
• NAT and firewall traversal; DHCP virtual addresses

Scalable and robust
• P2P routing deals with node joins and leaves

Networks are isolated
• One or more private IP address spaces
• Decentralized DHCP serves addresses for each space
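
One way to picture decentralized address assignment over isolated address spaces is a shared store in which a node claims an address only if no one else holds it. The Python sketch below is an illustration under that assumption, not the GroupVPN implementation: SharedStore, its create method, and the namespace names are all hypothetical, and an in-memory dictionary stands in for the overlay's shared store.

```python
import ipaddress


class SharedStore:
    """In-memory stand-in for a shared key-value store (hypothetical)."""

    def __init__(self):
        self._data = {}

    def create(self, key, value):
        """Store value only if key is unused; return True when the claim succeeds."""
        if key in self._data:
            return False
        self._data[key] = value
        return True


def lease_address(store, namespace, subnet, node_id):
    """Claim the first free host address of subnet within one isolated namespace."""
    for host in ipaddress.ip_network(subnet).hosts():
        # Keys are prefixed with the namespace, so separate groups never collide.
        if store.create(f"{namespace}:{host}", node_id):
            return str(host)
    raise RuntimeError("address space exhausted")
```

With one store, two nodes leasing from the namespace "class-vpn" in 10.10.0.0/29 receive 10.10.0.1 and 10.10.0.2, while a node in a different namespace can receive 10.10.0.1 again because the address spaces are independent.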


GroupVPN Overview

[Figure: Alice, Bob and Carol use a social network Web interface (e.g. XMPP, group site). Through the social network API, a messaging layer/information system distributes Alice’s, Bob’s and Carol’s public keys. Their nodes (node0.ipop at 10.10.0.2, node1.ipop at 10.10.0.3, node2.ipop) connect over the IPOP overlay network.]

Bootstrapping private links through Web 2.0 interfaces and IP-over-P2P overlay tunneling


GroupVPN Web interface

Users can request to join a group, or create their own VPN group
• E.g. instructor creates a GroupVPN for class
• Determines who is allowed to connect to virtual network

Owner can authorize users to join, remove users, authorize others to admin
• Actions typical of a certificate authority happen in the back-end without user having to deal with security operations
• E.g. sign/revoke a VPN X.509 certificate
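
The group workflow described above can be modeled in a few lines of Python. This is a hedged sketch of the bookkeeping only: the class and method names are hypothetical, and plain serial numbers stand in for the X.509 certificates that the real back-end signs and revokes.

```python
class VpnGroup:
    """Toy model of a VPN group; serial numbers stand in for certificates."""

    def __init__(self, owner):
        self.admins = {owner}
        self.pending = set()
        self.certificates = {}  # user -> serial number of the issued certificate
        self.revoked = set()    # serial numbers that are no longer accepted
        self._next_serial = 1

    def _require_admin(self, user):
        if user not in self.admins:
            raise PermissionError(f"{user} may not administer this group")

    def request_join(self, user):
        """A user asks to join; nothing is issued until an admin approves."""
        self.pending.add(user)

    def authorize(self, admin, user):
        """Approve a pending request and issue (sign) a certificate."""
        self._require_admin(admin)
        if user not in self.pending:
            raise ValueError(f"{user} has not requested to join")
        self.pending.discard(user)
        serial = self._next_serial
        self._next_serial += 1
        self.certificates[user] = serial
        return serial

    def remove(self, admin, user):
        """Remove a member and revoke the certificate issued to them."""
        self._require_admin(admin)
        self.revoked.add(self.certificates.pop(user))

    def grant_admin(self, admin, user):
        """Let an existing admin delegate administration to another user."""
        self._require_admin(admin)
        self.admins.add(user)

    def may_connect(self, user):
        """Only holders of a current certificate may connect to the network."""
        return user in self.certificates
```

An instructor would create VpnGroup("instructor"), students would call request_join, and the instructor's authorize and remove calls would change who passes may_connect, without any student handling certificates directly.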


GroupVPN architecture

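[Figure: two hosts, each running grid/cloud middleware and applications on a virtual network interface (VNIC, a “tap” device) attached to a GroupVPN router (virtual router); the routers are connected through the GroupVPN overlay. The hosts have virtual addresses 10.10.1.1 and 10.10.1.2.]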


Under the hood: overlay architecture

• Bi-directional structured overlay (Brunet library)
• Self-configured NAT traversal
• Self-optimized links
  • Direct, relay
• Self-healing structure

[Figure: overlay routers connected by a multi-hop path through the overlay and by a direct path.]
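
The difference between a multi-hop path and a direct path on a structured overlay can be sketched with greedy routing on a ring. The Python code below is a simplified illustration, not the Brunet library's algorithm: the node addresses, the ring size, and both function names are assumptions made for the example.

```python
def ring_distance(a, b, size):
    """Shortest distance between two addresses on a ring of the given size."""
    d = abs(a - b) % size
    return min(d, size - d)


def greedy_route(links, source, target, size):
    """Forward hop by hop to the neighbor closest to target; return the path taken."""
    path = [source]
    current = source
    while current != target:
        best = min(links[current], key=lambda n: ring_distance(n, target, size))
        if ring_distance(best, target, size) >= ring_distance(current, target, size):
            raise RuntimeError("no neighbor is closer to the target")
        path.append(best)
        current = best
    return path
```

On a ring of size 60 with nodes at 0, 10, 20, 30, 40 and 50, each linked to its two ring neighbors, a packet from 0 to 20 travels through 10. If nodes 0 and 20 later establish a direct link, the same call returns a one-hop path, which is the effect of a self-optimized shortcut.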


Deploying virtual clusters

Same image, per-group VPNs

[Figure: an image combining Hadoop and the virtual network is copied and instantiated as virtual machines, together with GroupVPN credentials obtained from the Web site. Each instance becomes a Hadoop worker on the GroupVPN and receives a virtual IP address by DHCP, e.g. 10.10.1.1 and 10.10.1.2.]


Configuration framework

At the end of GroupVPN initialization:
• Each node of a private virtual cluster gets a DHCP address on virtual tap interface
• A barebones cluster
• Additional configuration required depending on middleware
  • Which node is the Condor negotiator? Hadoop front-end? Which nodes are in the MPI ring?

Key frameworks used:
• IP multicast discovery over GroupVPN
  • Front-end queries for all IPs listening in GroupVPN
• Distributed hash table
  • Advertise (put key,value), discover (get key)
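
The advertise/discover pattern can be sketched as follows. This is a minimal Python illustration in which an in-memory table stands in for the distributed hash table; the class name, the key format, and the time-to-live behavior are assumptions for the example, not details given in the slides.

```python
import time


class DiscoveryTable:
    """In-memory stand-in for a distributed hash table (hypothetical)."""

    def __init__(self):
        self._entries = {}  # key -> list of (value, expiry time)

    def put(self, key, value, ttl_seconds, now=None):
        """Advertise a value under key; it expires after ttl_seconds (an assumption)."""
        now = time.time() if now is None else now
        self._entries.setdefault(key, []).append((value, now + ttl_seconds))

    def get(self, key, now=None):
        """Discover all values under key that have not yet expired."""
        now = time.time() if now is None else now
        return [value for value, expires in self._entries.get(key, []) if expires > now]
```

A head node could advertise itself with put("class-vpn:condor-negotiator", "10.10.1.1", 60), and a newly booted worker would call get with the same key to learn which address to contact. Expiring entries means a node that leaves without notice eventually disappears from the results.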


Configuring and deploying groups

Generate virtual floppies
• Through GroupVPN Web interface

Deploy appliance image(s)
• FutureGrid (Nimbus/Eucalyptus), EC2
• GUI or command line tools
• Use APIs to copy virtual floppy to image

Submit jobs; terminate VMs when done


FutureGrid example - Nimbus

Example using Nimbus:

workspace.sh --deploy --mdUserdata /tmp/floppy-worker.zip.b64 --service https://f1r.idp.ufl.futuregrid.org:8443/wsrf/services/WorkspaceFactoryService --file /tmp/output.xml --metadata /tmp/grid-appliance.xml --deploy-mem 1000 --deploy-duration 100 --trash-at-shutdown Trash --exit-state Running --displayname grid-appliance --sshfile /home/renato/.ssh/id_dsa.pub

• --mdUserdata: GroupVPN floppy image
• --service: Nimbus service endpoint
• --metadata: points to image on Nimbus server
• --sshfile: SSH public key to log in to instance
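
The file name floppy-worker.zip.b64 in the command above suggests a zip archive encoded as base64 text. Assuming that reading is right, the Python sketch below shows how such a file could be produced and read back with the standard library; the function names and the file names inside the archive are hypothetical, since the slides do not describe the floppy's contents.

```python
import base64
import io
import zipfile


def encode_floppy(files):
    """Zip a mapping of file name -> bytes in memory and return it as base64 text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def decode_floppy(encoded):
    """Reverse encode_floppy: return the mapping of file name -> bytes."""
    raw = base64.b64decode(encoded)
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

Base64 keeps the binary archive safe to pass as text user data, at the cost of growing it by roughly one third.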


Summary

Hands-on experience with clusters is essential for education and training

Virtualization, clouds simplify software packaging/configuration

Grid appliance allows users to easily deploy hands-on virtual clusters

FutureGrid provides resources and cloud stacks for educators to easily deploy their own virtual clusters

Goal - towards a community-based marketplace of educational appliances for TeraGrid


Thank you!

More information:
• http://www.futuregrid.org
• http://grid-appliance.org

This document was developed with support from the National Science Foundation (NSF) under Grant No. 0910812 to Indiana University for "FutureGrid: An Experimental, High-Performance Grid Test-bed." Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.


Local appliance deployments

Two possibilities:
• Share our “bootstrap” infrastructure, but run a separate GroupVPN
  • Simplest to set up
• Deploy your own “bootstrap” infrastructure
  • More work to set up
  • Especially if across multiple LANs
  • Potential for faster connectivity


PlanetLab bootstrap

Shared virtual network bootstrap
• Runs 24/7 on 100s of machines on the public Internet
• Connect machines across multiple domains, behind NATs


PlanetLab bootstrap: approach

• Create GroupVPN and GroupAppliance on the Grid appliance Web site
• Download configuration floppy
• Point users to the interface; allow users you trust into the group
• Trusted users can download configuration floppies and boot up appliances


Private bootstrap: General approach

• Good choice for single-domain pools
• Create GroupVPN and GroupAppliance on the Grid appliance Web site
• Deploy a small IPOP/GroupVPN bootstrap P2P pool
  • Can be on a physical machine, or appliance
  • Detailed instructions at grid-appliance.org
• The remaining steps are the same as for the shared bootstrap


Connecting external resources

GroupVPN can run directly on a physical machine, if desired
• Provides a VPN network interface
• Useful for example if you already have a local Condor pool
  • Can “flock” to Archer
• Also allows you to install Archer stack directly on a physical machine if you wish


FutureGrid example - Eucalyptus

Example using Eucalyptus (or ec2-run-instances on Amazon EC2):

euca-run-instances ami-fd4aa494 -f floppy.zip --instance-type m1.large -k keypair

• -f floppy.zip: GroupVPN floppy image
• ami-fd4aa494: image ID on Eucalyptus server
• -k keypair: SSH public key to log in to instance


Where to go from here?

Tutorials on FutureGrid and Grid appliance Web sites for various middleware stacks
• Condor, MPI, Hadoop

A community resource for educational virtual appliances
• Success hinges on users effectively getting involved
• If you are happy with the system, let others know!
• Contribute with your own content – virtual appliance images, tutorials, etc.