Page 1: Chandra Krintz

8/14/2019 Chandra Krintz

http://slidepdf.com/reader/full/chandra-krintz 1/35

AppScale: Open-source Platform-as-a-Service (PaaS) System for Energy-Aware Cloud Computing Research

Chandra Krintz
Associate Professor
Computer Science Dept and Institute for Energy Efficiency
UCSB

Nov. 18, 2009

Page 2: Chandra Krintz


Cloud Computing

• Software systems for accessing easily and transparently scalable CPU/storage/network resources via a network connection or web interface – “as-a-service”

• On a rental basis 

SLAs

Web Services

Virtualization

• Pay-per-use, flat rate – e-commerce based
• Resources are opaque

Page 3: Chandra Krintz


Cloud Computing

• Software systems for accessing easily and transparently scalable CPU/storage/network resources via a network connection or web interface – “as-a-service”
 –  Culmination of grid/cluster/utility/elastic computing
 –  Advances in processor, virtualization, systems technology

• On a rental basis, via service-level agreements (SLAs)
 –  Pay-per-use, flat rate – e-commerce based
 –  Resources are opaque

• Have experienced a rapid uptake in the commercial sector
 –  Public clouds – you run your systems/apps on others’ systems
   • Access small fraction of vast resource pools
 –  Offer availability guarantees and extreme scale

Page 4: Chandra Krintz


Layers

• Software systems for accessing easily and transparently scalable CPU/storage/network resources via a network connection or web interface – “as-a-service”

 –  Infrastructure (IaaS), e.g. Amazon Web Services (AWS)
   • Provision isolated resources under contract
   • Full-system images deployed over virtual machine monitor

 –  Platform (PaaS), e.g. Google AppEngine (GAE), Microsoft Azure
   • Enable construction of network-accessible applications
   • Complete software stack; Process-level runtime isolation
   • Specialized/scalable runtime and library support

 –  Software (SaaS), e.g. Salesforce
   • Remotely accessible and customizable applications

Page 5: Chandra Krintz


Cloudy Issues

• Many institutions and companies own IT infrastructure

• Public cloud features also useful for “on-premise” clouds
 –  Privacy of code and data
 –  Avoids vendor “lock-in”, pay-per-use, and availability reliance
 –  Potential for hybrid and customized approaches
 –  Potential for easing resource constraints
   • Storage/data management, cpu/memory, communication
 –  Potential for investigating new energy-aware dist. systems

• Public clouds are opaque – open APIs, closed implementation
 –  How can we research what’s next if we can’t see what’s now?

• Can cloud fabrics support other application domains, services, performance/availability requirements?

Page 6: Chandra Krintz


An Opening in the Clouds

• Open-source cloud computing systems from the UCSB Computer Science Department
 –  Goal: Bring popular cloud fabrics to “on-premise” clusters that are easy to use and are transparent
 –  To facilitate investigation of
   • Energy-efficient cloud computing
     –  Services, underlying device technology, support technologies
     –  Customization (availability, performance, application behavior)
   • Hybrid cloud solutions (public and on-premise)

Page 7: Chandra Krintz


An Opening in the Clouds

• Open-source cloud computing systems from the UCSB Computer Science Department
 –  Goal: Bring popular cloud fabrics to “on-premise” clusters that are easy to use and are transparent
 –  To facilitate investigation of
   • Energy-efficient cloud computing
     –  Services, underlying device technology, support technologies
     –  Customization (availability, performance, application behavior)
   • Hybrid cloud solutions (public and on-premise)
 –  By emulating key cloud layers from the commercial sector
   • Engender user community, access to real applications/users
   • Leverage extant software technologies
 –  Not replacement technology for any Public Cloud service

Page 8: Chandra Krintz


UCSB Cloud Computing

• Software systems for accessing easily and transparently scalable CPU/storage/network resources via a network connection or web interface – “as-a-service”

 –  Infrastructure (IaaS), e.g. Amazon Web Services (AWS)
   • Eucalyptus: www.eucalyptus.com, Prof. Rich Wolski

 –  Platform (PaaS), e.g. Google AppEngine (GAE), Microsoft Azure
   • Enable construction of network-accessible applications
   • Complete software stack; Process-level runtime isolation
   • Specialized/scalable runtime and library support

 –  AppScale: appscale.cs.ucsb.edu
   • Web services based implementation of Google App Engine
     –  Complete application stack for MVC-based web applications
     –  Written in Python or Java

Page 9: Chandra Krintz


From Google App Engine to AppScale

• GAE is a full application stack that facilitates construction of interactive webpages with a database backing store
 –  Users develop Python and Java apps using well-defined APIs
   • Execute, test, debug the apps locally using the Google open-source software development kit (SDK)
 –  SDK implements simple, slow, non-scalable versions of APIs
 –  Generates indices that the application requires

Page 10: Chandra Krintz


From Google App Engine to AppScale

• GAE is a full application stack that facilitates construction of interactive webpages with a database backing store
 –  Users develop Python and Java apps using well-defined APIs
   • Execute, test, debug the apps locally using the Google open-source software development kit (SDK)
 –  SDK implements simple, slow, non-scalable versions of APIs
 –  Generates indices that the application requires
   • Upload application to Google: YOUR_APPNAME.appspot.com
 –  Sandboxed execution, language/behavior restrictions
 –  Resource quotas, free or for a fee
 –  Highly scalable API implementations (proprietary)

Page 11: Chandra Krintz


From Google App Engine to AppScale

• GAE is a full application stack that facilitates construction of interactive webpages with a database backing store
 –  Users develop Python and Java apps using well-defined APIs
   • Highly scalable proprietary implementation on Google resources

Datastore -> Bigtable/MapReduce
MemCache -> in-memory datastore
Authentication -> Google Accounts
Mail -> GMail
URL-Fetch (for HTTP/S communication)
Images
Task Queues for short background jobs

Page 12: Chandra Krintz


AppScale

• An open-source platform-as-a-service (PaaS) system that emulates Google App Engine (GAE)
 –  Extension of GAE SDK with API implementations replaced

Components: AppLoadBalancer (ALB), AppServer (AS), Database Master/Peer (DB M/P), Database Slave/Peer (DB S/P)

[Figure: AppScale Cloud diagram. Labels: GAE App Developer (AppScale Admin), AppScale tools, GAE App Users, HTTPS, App Controller, ALB, AS, DB M/P, DB S/P]

Page 13: Chandra Krintz


An AppScale Image

• Implements every AppScale component
 –  Can instantiate as a particular role (ALB, AS, DB)
 –  Can change functionality and instantiate itself as another

• AppLoadBalancer (ALB)
 –  Controller of the system (Ruby-on-Rails application)
 –  Starts other instances and instantiates each with its role
 –  Monitors other components once instantiated
 –  Restarts lost components
 –  Current work: ALB availability (shadow ALB(s))
 –  GAE developers contact ALB to manage cloud
   • Via the AppScale Tools
   • Stop/start cloud, upload/start/terminate GAE apps
   • Command line and web page administrator status w/ load stats

Page 14: Chandra Krintz


An AppScale Image

• Implements every AppScale component
 –  Can instantiate as a particular role (ALB, AS, DB)
 –  Can change functionality and instantiate itself as another

• AppLoadBalancer (ALB)
 –  Controller of the system (Ruby-on-Rails application)
 –  Starts other instances and instantiates each with its role
 –  Monitors other components once instantiated
 –  Restarts lost components

• Monitors system statistics
 –  CPU, Memory, page requests, network, application behavior
 –  Grow and shrink AppScale cloud on-demand
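
The grow-and-shrink behaviour named on this slide can be illustrated with a simple threshold rule. This is only a sketch under stated assumptions: the slides do not describe AppScale's actual scaling policy, and the function name `desired_appservers`, the CPU thresholds, and the node limits are all hypothetical.

```python
import math


def desired_appservers(current, avg_cpu, low=0.30, high=0.75,
                       min_nodes=1, max_nodes=16):
    """Return a target AppServer count (hypothetical policy, not AppScale's).

    current -- number of AppServer instances running now (>= 1)
    avg_cpu -- average CPU utilisation across them, from 0.0 to 1.0
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if avg_cpu > high:
        # Grow: add enough nodes to bring average load back under `high`.
        target = math.ceil(current * avg_cpu / high)
    elif avg_cpu < low:
        # Shrink conservatively, one node at a time.
        target = current - 1
    else:
        target = current
    # Never leave the configured bounds.
    return max(min_nodes, min(max_nodes, target))
```

A controller could call such a function on each monitoring interval and start or stop instances until the running count matches the returned target.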

Page 15: Chandra Krintz


An AppScale Image

• Implements every AppScale component
 –  Can instantiate as a particular role (ALB, AS, DB)
 –  Can change functionality and instantiate itself as another

• AppServer (AS)
 –  Web front-end for the GAE applications (GAE API compliant)
 –  Users of GAE apps interact with ASs directly
   • After being routed to an AS instance by the ALB upon first access to a GAE application (for load balancing purposes)
 –  Python GAE interface
 –  Java GAE interface
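
The routing step described on this slide (the ALB picks an AS on a user's first access, after which the user talks to that AS directly) can be sketched as follows. The class, its method names, and the least-loaded selection rule are assumptions made for illustration; the slides do not say how the ALB chooses an AS.

```python
class AppLoadBalancer:
    """Toy first-access router (hypothetical, not AppScale's ALB code)."""

    def __init__(self, appservers):
        if not appservers:
            raise ValueError("at least one AppServer is required")
        # Number of users currently assigned to each AppServer.
        self._load = {name: 0 for name in appservers}
        # Remembers which AppServer each user was routed to.
        self._assigned = {}

    def route(self, user_id):
        """Return the AppServer for user_id, choosing one on first access."""
        if user_id in self._assigned:
            return self._assigned[user_id]
        # Pick the least-loaded server; break ties by name for determinism.
        server = min(self._load, key=lambda s: (self._load[s], s))
        self._load[server] += 1
        self._assigned[user_id] = server
        return server
```

For example, with servers `as1` and `as2`, the first two new users land on different servers, and a returning user always gets the server chosen on first access.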

Page 16: Chandra Krintz


An AppScale Image

• Implements every AppScale component
 –  Can instantiate as a particular role (ALB, AS, DB)
 –  Can change functionality and instantiate itself as another

• AppServer (AS)
 –  Web front-end for the GAE applications (GAE API compliant)
 –  Users of GAE apps interact with ASs directly
   • After being routed to an AS instance by the ALB upon first access to a GAE application (for load balancing purposes)
 –  Python GAE interface
 –  Java GAE interface

• Extensions (non-GAE compliant)
 –  MapReduce integration via an API
 –  GAE developers write their own mappers/reducers
   • AppScale schedules and load balances
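
To make "developers write their own mappers/reducers" concrete, here is a single-process word-count sketch. It is not the AppScale MapReduce API, which the slides do not show; `word_mapper`, `sum_reducer`, and `run_mapreduce` are hypothetical names, and the in-memory grouping stands in for the scheduling and load balancing that AppScale performs.

```python
from collections import defaultdict


def word_mapper(line):
    """Map step: emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Reduce step: combine all counts emitted for one word."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    """Run mapper over records, group by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))
```

The developer supplies only the two small functions; everything inside `run_mapreduce` is the part a platform would distribute across nodes.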

Page 17: Chandra Krintz


An AppScale Image

• Implements every AppScale component
 –  Can instantiate as a particular role (ALB, AS, DB)
 –  Can change functionality and instantiate itself as another

• AppDBMaster (DBM) and AppDBSlaves (DBS)
 –  Scalable distributed database (Datastore API) implementation
   • Type and replication specified at AppScale start by admin
 –  Bigtable-like options (schema-free, key-value store)
   • HBase (Java) and Hypertable (C++)
   • MongoDB (Python)
 –  Peer-to-peer options
   • Cassandra (Java), Voldemort (Java)
 –  In-memory, distributed
   • MemcacheDB (Python)
 –  Relational options
   • MySQL (sharded, Master/Slave)
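
One way to picture a single Datastore API sitting over interchangeable backends, with the backend type chosen once at start-up, is a small interface plus a registry. This is a sketch only: the `Datastore` interface, the `InMemoryDatastore` stand-in, and `open_datastore` are hypothetical and do not reflect AppScale's real code or any of the listed databases' client libraries.

```python
from abc import ABC, abstractmethod


class Datastore(ABC):
    """Minimal schema-free key-value interface (hypothetical)."""

    @abstractmethod
    def put(self, key, entity): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def delete(self, key): ...


class InMemoryDatastore(Datastore):
    """Stand-in backend; a real one would wrap HBase, Cassandra, MySQL, etc."""

    def __init__(self):
        self._rows = {}

    def put(self, key, entity):
        # Schema-free: each entity may carry different properties.
        self._rows[key] = dict(entity)

    def get(self, key):
        row = self._rows.get(key)
        return dict(row) if row is not None else None

    def delete(self, key):
        self._rows.pop(key, None)


# The admin picks a backend name once, when the cloud is started.
BACKENDS = {"memory": InMemoryDatastore}


def open_datastore(name):
    """Instantiate the backend registered under `name`."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown datastore backend: {name!r}") from None
```

Application code written against `Datastore` stays the same whichever backend name is selected, which is the property that lets different databases be compared under the same workload.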

Page 18: Chandra Krintz


AppScale

• Component interoperation
 –  Google protocol buffers for all app data communication
 –  Inter-language IPC via Thrift

• Deploys and executes automatically over different cloud fabrics and virtualization layers
 –  Eucalyptus
 –  Amazon’s EC2
 –  Xen
 –  KVM
 –  Soon to come…
   • VMWare/VCloud, IBM’s RC2, greater EC2 scale

Page 19: Chandra Krintz


AppScale Potential

• Current status
 –  Stable system, many interesting database options
 –  International user community, real applications
   • AppScale is used for commercial & private endeavors
   • Significant visibility and attention from blogosphere & industry
 –  Web services application domain
   • W/ extension that spawns multi-language Map-Reduce jobs

• Research directions:
 –  Measurement and characterization of energy consumption
   • Multiple components, languages, resource requirements
   • Parallelization and concurrency, multicore utilization
   • Virtualization and IaaS layers
   • Database technologies

Page 20: Chandra Krintz


AppScale Research Roadmap

• Support for Java and additional DBs

• ASs and DBs grow and shrink according to load and failure
 –  Resource monitoring & allocation (SLA support)
   • Automatic and dynamic renegotiation, improved scaling

• Performance, energy, and availability monitoring
 –  Capture full-system behavior via sampling
   • For debugging, performance/energy feedback, optimization

• Administrator/Developer control of
 –  Replication of data for fault tolerance
 –  Scaling triggers
 –  Type and amount of system monitoring
 –  Sandbox restrictions
 –  Parallelism/Concurrency
 –  Energy/Power management

• Alternative computation models, e.g. streaming

Page 21: Chandra Krintz


AppScale Research Directions

• Support for novel/emerging application domains
 –  Computationally and data intensive: streaming, HPC, hybrids

• IaaS + PaaS integration: AppScale + Eucalyptus
 –  Resource allocation, specialization/customization
   • Automatic re-negotiation of SLAs
 –  Application-level feedback to/from physical power/air systems
 –  Isolation/energy tradeoffs, e.g. exploit shared memory

[Figure: system stack. Labels: Multi-core server, Virtualization Technology (SW+HW), Linux + Eucalyptus, AppScale, Applications/libraries]

Page 22: Chandra Krintz


Lessons

• Open-source and management of a user community

• Positives:
 –  Real applications, real systems
   • If it works here, it really works
 –  Immediate feedback – helps debugging, evaluating efficacy of more “theoretical” and “research” contributions, what is,
 –  Extensive visibility – helps with recruiting, funding, impact
 –  Really fun and challenging

• Negatives
 –  Immediate feedback – can be quite demanding and time consuming to respond to users effectively
   • Not responding is equal to not putting something out as open source
   • Visibility can also be a bad thing!
 –  Engineering oriented

Page 23: Chandra Krintz


Cloud Computing at UCSB

Open-source implementations of popular cloud systems

• Platform-as-a-service (PaaS) framework
• Web services based implementation of Google AppEngine APIs
• Runs over Eucalyptus, Amazon EC2, and virtualization layers (Xen/KVM)
• Implements multiple database backends (HBase, Hypertable, Cassandra, MySQL, …)
• Real use, real users, real impact
• International user community

http://appscale.cs.ucsb.edu
Lead: Chandra Krintz

Thanks!

• RACELab students: Chris Bunch & Navraj Chohan, Nupur Garg, Matt Hubert, Puneet Lakhina, Yiming Li, …, Michal Weigel
• Visiting Scholar: Yoshihide Nomura (Fujitsu)
• Supported by: Google, National Science Foundation, IBM Research

Page 24: Chandra Krintz


Cloud Computing at UCSB

Open-source implementations of popular cloud systems

• Infrastructure-as-a-service (IaaS) framework
 –  Web services based implementation of elastic/utility/cloud computing infrastructure
   • Linux image hosting via virtualization
   • Emulates the Amazon AWS interface
   • Real use, real users, real impact
   • Distributed with Ubuntu
   • Large international user community
 –  http://www.eucalyptus.com
 –  Lead: Rich Wolski

• Platform-as-a-service (PaaS) framework
 –  Web services based implementation of Google AppEngine APIs
   • Runs over Eucalyptus, Amazon EC2, and virtualization layers (Xen/KVM)
   • Implements multiple database backends (HBase, Hypertable, Cassandra, MySQL, …)
   • Real use, real users, real impact
   • International user community
 –  http://appscale.cs.ucsb.edu
 –  Lead: Chandra Krintz

Thanks!

• RACELab students: Chris Bunch, Jovan Chohan, Navraj Chohan, Nupur Garg, Matt Hubert, Puneet Lakhina, Yiming Li, Gaurav Mehta, Nagy Mostafa, Soo Hwan Park, Michal Weigel
• Visiting Scholar: Yoshihide Nomura (Fujitsu)
• Supported by: Google, National Science Foundation, IBM Research, You!?

Page 25: Chandra Krintz


Cloudy Issues

• Many institutions and companies own IT infrastructure

• Public cloud features also useful for “on-premise” clouds
 –  Privacy of code and data
 –  Avoids vendor “lock-in”, pay-per-use, and availability reliance
 –  Potential for easing resource constraints
   • Storage/data management, cpu/memory, communication

• Public clouds are opaque – open APIs, closed implementation
 –  What is your application doing?

• Can cloud fabrics support other application domains, services, performance/availability requirements?

Page 26: Chandra Krintz


AppScale

• Component interoperation
 –  Component handshakes via socket messages
 –  Google protocol buffers for all data communication
 –  Inter-language IPC via Thrift

• Deploys and executes automatically over different cloud fabrics and virtualization layers
 –  Eucalyptus
 –  Amazon’s EC2
 –  Xen
 –  KVM
 –  Currently laying the groundwork with VMWare for an AppScale + VCloud/VSphere effort

Page 27: Chandra Krintz


AppScale

• An open-source platform-as-a-service (PaaS) system that emulates Google App Engine (GAE)
 –  Extension of GAE SDK with API implementations replaced

Components: AppLoadBalancer (ALB), AppServer (AS), Database Master/Peer (DB M/P), Database Slave/Peer (DB S/P)

[Figure: AppScale Cloud diagram. Labels: GAE App Developer (AppScale Admin), AppScale tools, GAE App Users, HTTPS, App Controller, ALB, AS, DB M/P, DB S/P, MapReduce Tasks]

Page 28: Chandra Krintz


AppScale

• Extension of GAE SDK with API implementations replaced

Datastore -> HBase, Hypertable, Cassandra, Voldemort, MySQL, MongoDB, MemCacheDB; in progress: Oracle’s TimesTen
MapReduce -> Hadoop
MemCache -> Python & Java
Authentication -> built-in, decoupled from Google Accounts

[Figure: AppScale Cloud diagram. Labels: GAE App Developer (AppScale Admin), AppScale tools, GAE App Users, HTTPS, App Controller, ALB, AS, DB M/P, DB S/P, MapReduce Tasks]

Page 29: Chandra Krintz


RACELab Research

• Cross-language shared memory (Michal Weigel)
 –  Replace high overhead communication protocols transparently for co-located runtimes

• Multi-language profiling and optimization (Nagy Mostafa)
 –  Track changes in software revisions that impact performance
 –  Cross-language specialization of interpreters

• Computationally intensive AppScale (Chris Bunch)
 –  MapReduce, HPC libraries
 –  Scheduling and load balancing

• Alternative data-intensive computational models in the cloud (Navraj Chohan)
 –  Cloud storage and persistence of data
 –  Streaming data management

Page 30: Chandra Krintz


AppScale

Page 31: Chandra Krintz


AppScale

Page 32: Chandra Krintz


UCSB Open-Source Cloud Computing

• Eucalyptus (Lead: Rich Wolski, MAYHEM Lab)
 –  Elastic Utility Computing Architecture Linking Your Programs To Useful Systems
 –  Infrastructure-as-a-service (IaaS) framework
 –  Web services based implementation of elastic/utility/cloud computing infrastructure
 –  Linux image hosting via virtualization
 –  Emulates the Amazon AWS interfaces

Page 33: Chandra Krintz


• Interface is based on Amazon’s published WSDL
 –  EC2/S3/EBS
 –  EC2 “Availability” zones correspond to individual clusters
 –  Uses either Amazon tools or Eucalyptus replacements
 –  Web services and REST interface

• Does not assume that worker nodes have publicly routable IP addresses
 –  All cloud images have access to a private network interface
 –  Multiple networking options including elastic IPs and security groups

• WS-security for authentication

• Web-based administration and accounting

Page 34: Chandra Krintz


Eucalyptus

• Availability
 –  Via popular Linux distributions: Ubuntu, Debian, CentOS, …

• Uptake
 –  19,400 downloads, > 71 countries
 –  May 2009: Key technology in the platform
 –  Fundamental technology of the Ubuntu Oct. 2009 distribution
   • 10,000,000 potential downloads

• Commercial spinoff: Eucalyptus Systems, Inc.
 –  Support open source community: www.eucalyptus.com
 –  Customized cloud services/components/scalability/SLAs
 –  Hybrid clouds - High availability (now available!)

Page 35: Chandra Krintz


AppScale

• Open-source platform-as-a-service (PaaS) that emulates Google App Engine (GAE)

• GAE is a full application stack that facilitates construction of interactive webpages with a database backing store
 –  Sandbox / restrictions
   • Pure Python or Java, no thread/subprocess spawning, system calls
   • No writes to file system, reads only to static files uploaded w/app
   • Storage using key-value, schema-free datastore (Bigtable-based)
   • HTTP/S communication only, CGI to handle page requests
   • Limit on number of datastore elements accessed per request
   • Limit on response duration, task frequency, request rate
   • Enforced quotas (BW, CPU, requests/s, files, app size, …)
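
A request-rate limit like the one listed above can be pictured as a fixed-window counter. This is an illustrative sketch only: the slides do not describe how GAE or AppScale enforce quotas, and the `RequestQuota` class, its parameters, and the fixed-window scheme are assumptions.

```python
class RequestQuota:
    """Fixed-window request counter (hypothetical quota enforcement)."""

    def __init__(self, max_requests, window_seconds):
        if max_requests < 1 or window_seconds <= 0:
            raise ValueError("max_requests >= 1 and window_seconds > 0 required")
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._window = None  # index of the window currently being counted
        self._count = 0      # requests admitted in that window

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        window = int(now // self.window_seconds)
        if window != self._window:
            # A new window has started, so the counter resets.
            self._window = window
            self._count = 0
        if self._count >= self.max_requests:
            return False
        self._count += 1
        return True
```

Passing the timestamp in explicitly keeps the sketch deterministic; a server would typically supply `time.monotonic()` and keep one counter per application.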