Open Cloud Consortium: An Update (04-23-10, v9)

DESCRIPTION

This is an overview of the Open Cloud Consortium that I gave at Cloud Lab '10 on April 23, 2010.

Transcript

Open Cloud Consortium: An Update

Robert Grossman, Open Cloud Consortium

April 23, 2010

www.opencloudconsortium.org

Part 1.

Overview of the Open Cloud Consortium (OCC)

501(c)(3) not-for-profit corporation.

Supports the development of standards, interoperability frameworks, and reference implementations.

Manages testbeds: Open Cloud Testbed and Intercloud Testbed.

Manages cloud computing infrastructure to support scientific research: Open Science Data Cloud.

Develops benchmarks.

OCC Members

Companies: Aerospace, Booz Allen Hamilton, Cisco, InfoBlox, Open Data Group, Raytheon, Yahoo

Universities: CalIT2, Johns Hopkins, MIT Lincoln Lab, Northwestern Univ., University of Illinois at Chicago, University of Chicago

Government agencies: NASA

Open Source Projects: Sector Project

OCC Working Groups

1. Large Data Cloud Working Group
2. Open Cloud Testbed Working Group
3. Intercloud Testbed Working Group
4. Open Science Data Cloud Working Group

[Figure: layered cloud architecture. Layer labels: IaaS, PaaS, Apps. Component labels: Storage Services, Compute Services, Data Services, Metadata Services, Virtual Machine Manager, Virtual Network Manager, Identity Manager, Network Transport, Applications.]

Part 2. Intercloud Testbed

Cloud 1

Cloud 2

We have several cloud standards…

Infrastructure as a Service
– Virtual Data Centers (VDC)
– Virtual Networks (VN)
– Virtual Machines (VM)

Platform as a Service
– Cloud Compute Services
– Data/Table Cloud Services
– Cloud Storage Services

Open Virtualization Format (OVF)

Open Cloud Computing Interface (OCCI)

SNIA Cloud Data Management Interface (CDMI)

Large Data Cloud Interoperability Framework

Where are the Gaps?

Infrastructure as a Service
– Virtual Data Centers (VDC)
– Virtual Networks (VN)
– Virtual Machines (VM)
– Physical Resources

Platform as a Service
– Cloud Compute Services
– Data as a Service

Open Virtualization Format (OVF)

Open Cloud Computing Interface (OCCI)

SNIA Cloud Data Management Interface (CDMI)

Large Data Cloud Interoperability Framework

Naming entities in IaaS & PaaS

Bridging IaaS & DaaS

Services that span multiple VMs, ….

Bridging the Gaps…A Small Step

Infrastructure as a Service
– Virtual Data Centers (VDC)
– Virtual Networks (VN)
– Virtual Machines (VM)
– Physical Resources

Platform as a Service
– Cloud Compute Services
– Data as a Service

Open Virtualization Format (OVF)

Open Cloud Computing Interface (OCCI)

SNIA Cloud Data Management Interface (CDMI)

Large Data Cloud Interoperability Framework

Metadata service linking IaaS and DaaS

Metadata service naming and linking entities in the IaaS layers

Part 3. Large Data Cloud Working Group

Focus of Working Group

Standards for integrating and interoperating large data cloud services such as those provided by Hadoop and similar systems.

Cloud Storage Services

Cloud Compute Services (MapReduce, UDF, & other programming frameworks)

Table-based Data Services

Relational-like Data Services

[Figure: application (App) boxes in the framework diagram.]

Developing APIs for this framework.

Benchmarks for Large Data Clouds

Until recently, the only benchmark used was Terasort (sorting 10 billion 100-byte records).

Replaced by Gray Sort and Minute Sort.

Gray Sort tries to maximize TB/min sorted on 100 TB or more of data.

Hadoop holds the current Gray Sort and Minute Sort records.

Problem: sort is just one of the types of workload for analytic applications.
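The sizes quoted for these sort benchmarks can be checked with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the timing in the last call is a hypothetical number, not a benchmark result.

```python
def dataset_size_tb(num_records: int, record_bytes: int) -> float:
    """Return the dataset size in decimal terabytes (1 TB = 10**12 bytes)."""
    return num_records * record_bytes / 10**12


def sort_rate_tb_per_min(size_tb: float, minutes: float) -> float:
    """Gray Sort style metric: terabytes sorted per minute."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return size_tb / minutes


# Terasort input from the slide: 10 billion records of 100 bytes each.
print(dataset_size_tb(10_000_000_000, 100))  # 1.0 (one terabyte)

# Hypothetical run, numbers invented for illustration only:
# 100 TB sorted in 200 minutes.
print(sort_rate_tb_per_min(100.0, 200.0))  # 0.5
```

So Terasort works on 1 TB of input, while Gray Sort starts at a data size one hundred times larger.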

MalStone

MalGen – generates synthetic data with realistic distributions.

MalStone A & B – “stylized” computations that can be used as benchmarks for architectures, software and systems for large data clouds.

Open source and available at malgen.googlecode.com

Part 4. Open Cloud Testbed

Condominium Clouds

In a condominium cloud, you buy your own rack or bunch of racks.

The racks are managed and operated by the condominium association, in this case the OCC.

If your rack is 120 TB, you get the rights to c. 40 TB of storage in the cloud. The rest is a shared resource.

The Open Cloud Testbed is a condo cloud managed by the OCC.
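The condominium split can be sketched as a small calculation. This is only an illustration: the one-third owner share is inferred from the single 120 TB / 40 TB example on the slide, and the names `OWNER_FRACTION` and `condo_split` are hypothetical. The actual OCC allocation policy is not stated here.

```python
from fractions import Fraction

# Assumption: the owner's share is the same fraction as in the slide's
# example (about 40 TB out of a 120 TB rack, i.e. roughly one third).
OWNER_FRACTION = Fraction(1, 3)


def condo_split(rack_tb: int, racks: int = 1) -> tuple[float, float]:
    """Return (owner_tb, shared_tb) for a number of identical racks."""
    total = rack_tb * racks
    owner = total * OWNER_FRACTION
    return float(owner), float(total - owner)


print(condo_split(120))     # (40.0, 80.0)
print(condo_split(120, 3))  # (120.0, 240.0)
```

Under this assumption, every rack a member contributes adds twice as much capacity to the shared pool as to the member's own allocation.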

Open Cloud Testbed

Phase 2: 9 racks, 250+ nodes, 1000+ cores, 10+ Gb/s.

Phase 3 (2011) – we will stand up some 100 Gb/s links.

[Figure labels] Networks: MREN, CENIC, Dragon, C-Wave. Software: Hadoop, Sector/Sphere, Thrift, KVM VMs, Eucalyptus VMs.

Part 5. Open Science Data Cloud Working Group

Open Science Data Cloud

Astronomical data

Biological data (Bionimbus)

Networking data

Provide a long term home for selected scientific data sets and support elastic cloud-based analysis & integration of the data.

Part 6. Image Processing for Disaster Relief Using Elastic Clouds

The Challenge

When a disaster strikes, there is usually an immediate and critical need for computing power to process images.

For example, there was a delay getting current images of Haiti to non-governmental organizations (NGOs) after the earthquake on January 12, 2010.

The Idea: The OCC Elastic Cloud for Disaster Relief

Set up a permanent elastic cloud that is available to assist with disaster relief.

Establish connections to sources of images that can be enabled at times of need.

Set up a network of volunteers with accounts on the cloud and knowledge of the tools that can swarm when needed.

Use as a test of large data cloud standards and interoperability.

Image Processing on Large Data Clouds

Data parallel applications
– Parallelism is often required at file or directory level
– Data locality is important
– Parallel disk IO is also critical

Requirements
– The input data size can be 10+ TB per day
– Want to integrate with open source libraries such as OSSIM
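File-level parallelism of the kind described above can be sketched with the Python standard library. This is a minimal illustration, not the OCC pipeline: `process_image` is a hypothetical placeholder that only computes a size and checksum where a real job would call an image library, and a thread pool on one machine stands in for workers spread across a cluster.

```python
import hashlib
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def process_image(path: Path) -> tuple[str, int, str]:
    """Placeholder per-file task: return (name, size in bytes, SHA-256 digest).

    A real pipeline would call an image processing library here instead.
    """
    data = path.read_bytes()
    return path.name, len(data), hashlib.sha256(data).hexdigest()


def process_directory(directory: Path, workers: int = 4) -> list[tuple[str, int, str]]:
    """Process every file in a directory, one whole file per task."""
    files = sorted(p for p in directory.iterdir() if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with `files`.
        return list(pool.map(process_image, files))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create three small dummy "images" so the example is self-contained.
        for i in range(3):
            (Path(tmp) / f"tile_{i}.raw").write_bytes(bytes([i]) * 1024)
        for name, size, digest in process_directory(Path(tmp)):
            print(name, size, digest[:12])
```

Each task reads one complete file and shares nothing with the others, which is why the unit of parallelism here is the file rather than a block or a record.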

Distributed File Systems & Image Processing

Sector is broadly similar to the Hadoop Distributed File System

Main differences
– Hadoop directly implements a distributed block-based file system
– Sector is a layer over a native file system

Sector does not split files
– A single image will not be split; therefore, when it is being processed, the application does not need to read the data from other nodes via the network
– A directory can be kept together on a single node as well, as an option
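The effect of splitting versus not splitting files can be illustrated with a toy placement model. This is a simplified sketch under stated assumptions, not the actual placement policy of either Hadoop or Sector: the round-robin block spread, the CRC-based node choice, the function names, and the file sizes are all invented for illustration.

```python
import math
import zlib


def place_blocks(files: dict[str, int], block_size: int, nodes: int) -> dict[str, set[int]]:
    """Block-based model: split each file into blocks, spread blocks round-robin."""
    placement: dict[str, set[int]] = {}
    next_node = 0
    for name, size in files.items():
        n_blocks = max(1, math.ceil(size / block_size))
        placement[name] = set()
        for _ in range(n_blocks):
            placement[name].add(next_node)
            next_node = (next_node + 1) % nodes
    return placement


def place_whole_files(files: dict[str, int], nodes: int) -> dict[str, set[int]]:
    """Whole-file model: each file lives on exactly one node."""
    return {name: {zlib.crc32(name.encode()) % nodes} for name in files}


# Hypothetical image sizes in MB.
images = {"scene_a.tif": 300, "scene_b.tif": 64, "scene_c.tif": 130}
blocks = place_blocks(images, block_size=64, nodes=4)
whole = place_whole_files(images, nodes=4)

for name in images:
    print(name, "block nodes:", len(blocks[name]), "whole-file nodes:", len(whole[name]))
```

In the block-based model a large image ends up on several nodes, so a task that needs the whole image must fetch some of it over the network. In the whole-file model a task scheduled on the node that holds the file reads everything locally.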

Get Involved…

… Join our volunteer effort.

Part 7. Virtual Networks

How Long Does It Take…

… To Move A Cloud Application Spanning Multiple VMs Between Clouds?

… To Add A New Rack to a Cloud Service?

… To Add Another Public Cloud to A Private/Public Cloud?

We Have Several Ways of Defining Virtual Networks….

VN-Link

VLAN

VPNs

BGP

MPLS

OpenFlow

Open vSwitch

vSwitch

CloudSwitch

But No Vendor Neutral VN Standard That

Scales to 100,000+ VMs

Is supported by multiple vendors

Spans multiple physical switches

Supports VN mobility

Provides strong isolation of VNs

Is easy for VMs to join and leave VNs

Includes management interfaces

….

For More Information

info@opencloudconsortium.org

www.opencloudconsortium.org
