13-17th June 2011, Nancy, France. PhD Workshop, AIMS’11. SLACC: SLA Support System for Cloud Computing. Guilherme Sperb Machado, Burkhard Stiller, Department of Informatics IFI, Communication Systems Group CSG, University of Zürich UZH. machado | stiller@ifi.uzh.ch. Motivation and Problem, Use Cases, SLACC Process, System Architecture. © 2011 UZH, CSG@IFI
Page 1: Aims2011 SLA support system

13-17th June 2011, Nancy, France

PhD Workshop, AIMS’11

SLACC: SLA Support System for Cloud Computing

Guilherme Sperb Machado, Burkhard Stiller
Department of Informatics IFI, Communication Systems Group CSG, University of Zürich UZH
machado | stiller@ifi.uzh.ch

Motivation and Problem
Use Cases
SLACC Process
System Architecture

© 2011 UZH, CSG@IFI

Page 5: Aims2011 SLA support system

Motivation

“Companies struggling with Cloud performance” [1]
– Survey from Compuware, by Vanson Bourne:

57% of European businesses were stopping further investments in Cloud Computing until providers give more specific guarantees

72% of businesses: the cloud platform was hampering their ability to maintain set levels of service, also affecting the company’s revenue

For revenue-generating websites: performance would mean revenue

Reference [1]: http://www.computerworlduk.com/news/cloud-computing/3239390/companies-struggling-with-cloud-performance/

Page 7: Aims2011 SLA support system

Problem

Cloud Provider / Service / SLA Parameters

Amazon
– S3: Availability (99.9%) with the following definitions: Error Rate, Monthly Uptime Percentage, Service Credit.
– EC2: Availability (99.95%) with the following definitions: Service Year: 365 days of the year, Annual Percentage Uptime, Region Unavailable/Unavailability, Unavailable: no external connectivity during a five minute period, Eligible Credit Period, Service Credit.
– SimpleDB: Subject to the Amazon Web Services Customer Agreement, since no specific SLA is defined. Such agreement does not guarantee availability.

SalesForce
– CRM: The company’s Web site does not contain information regarding SLAs for this specific service.

Google
– Google Apps (inc. Gmail business, Google Docs, etc.): Availability (99.9%) with the following definitions: Downtime, Downtime Period: 10 consecutive minutes downtime, Google Apps Covered Services, Monthly Uptime Percentage, Scheduled Downtime, Service, Service Credit.

Rackspace Cloud
– Cloud Server: Availability regarding the following: Internal Network: 100%, Data Center Infrastructure: 100%. Performance related to service degradation: Server Migration in case of performance problems: migration is notified 24 hours in advance, and is completed in 3 hours (maximum). Recovery Time: In case of failure, guarantee the restoration/recovery in 1 hour after the problem is identified.
– Cloud Sites: Availability, Unplanned Maintenance: 0%, Service Credit.
– Cloud Files: Availability: 99.9%, Service Credit.

Page 10: Aims2011 SLA support system

Problem

Cloud Providers do not offer/guarantee
– SLA specification tailored to Cloud Users’ interests
• Mostly, “Service Availability”

Cloud Providers offering performance parameters
– The solution is not obvious
• Huge size of Providers’ IT Infrastructure
• High complexity with multiple inter-dependencies of resources (physical or virtual)
• Diversity of performance parameters

Page 11: Aims2011 SLA support system

Solution Approach

SLACC: SLA Supporting System for Cloud Computing
– Estimate SLA parameters (KPIs and SLOs) in a formalized methodology based on
• Historical data (and the lack of data, as well)
• IT infrastructure information (dependency between components)
– Focusing on performance parameters

The benefits:
– Enhance the level of SLA specificity
– Decision support in SLA negotiation processes (CPs)
– Better knowledge of IT infrastructures’ capabilities

Page 15: Aims2011 SLA support system

Use Cases (1)

[Figure: use case diagram. Legend: Use Case triggering SLACC; Part of SLACC Solution. CU: Cloud User/Customer; CP: Cloud Provider]

Page 16: Aims2011 SLA support system

Use Cases (2)

SLACC handles typical Cloud Computing estimation cases at different levels (IaaS, PaaS, SaaS)
– Response time of an operation (e.g., query data, insert new customers) from a CRM application (Customer Relationship Management)
– Deployment time of a specific Virtual Machine template provided by the Cloud provider
– Backup completion time of several VM instances
– Minimal bandwidth between VM instances (in different geographical localities)
– Minimal CPU processing capacity for a given VM

Page 18: Aims2011 SLA support system

Use Cases (2)

Response time of an operation (e.g., query data, insert new customers) from a CRM application (Customer Relationship Management)
– (example) Cloud Customer requires the information retrieval in less than 1 second, having 50,000 clients in the database
– Composed of measurements:
• time of distributing HTTP requests (load balancing distribution)
• time that the application (CRM) can process the request
• time of establishing a database connection
• time to perform the SELECT on the “users table” (learned from populated databases)
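The composition above can be sketched in Python as a plain sum of the four component times, compared against the Cloud Customer's 1-second requirement. This is only an illustration, not the SLACC implementation: the additive model, the names `ComponentTimes` and `meets_requirement`, and seconds as the unit are assumptions. The sample values are the first row of the measurement table shown on the SLACC Process slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentTimes:
    """One measured request, split into the four components listed on the slide.

    The unit is assumed to be seconds (the slides do not state it).
    """
    load_balancing: float  # time of distributing the HTTP request
    application: float     # time the CRM application needs to process the request
    db_connection: float   # time of establishing a database connection
    select_users: float    # time to perform the SELECT on the users table

    def total(self) -> float:
        # Assumption: the components run one after another, so they add up.
        return (self.load_balancing + self.application
                + self.db_connection + self.select_users)


def meets_requirement(sample: ComponentTimes, limit: float = 1.0) -> bool:
    """True if the summed response time stays below the customer's limit."""
    return sample.total() < limit


# First row of the measurement table from the SLACC Process slide.
row1 = ComponentTimes(0.012, 0.123, 0.050, 1.150)
print(f"total = {row1.total():.3f}, meets requirement: {meets_requirement(row1)}")
```

Under these assumptions the SELECT on the users table alone already exceeds the 1-second limit for this sample, which is the kind of finding an estimate is meant to surface before an SLA is offered.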

Page 20: Aims2011 SLA support system

SLACC Process

[Figure: SLACC Decision Support System with the stages Input Designer, Estimate, Analysis; actor: Cloud Operator]

R | Time that the Load balancing distributes the R | Time that Application process R | Time to Est. DB Conn | Time to perform SELECT on Users table
1 | 0.012 | 0.123 | 0.050 | 1.150
2 | 0.056 | 0.100 | 0.073 | 1.012
3 | 0.023 | 0.223 | 0.098 | 1.344
4 | 0.028 | 0.145 | 0.012 | 0.983
5 | 0.043 | 0.245 | 0.033 | 0.974
… | … | … | … | …

Page 22: Aims2011 SLA support system

SLACC Process

[Figure: SLACC Decision Support System with the stages Input Designer, Estimate, Analysis; actor: Cloud Operator]

Correlation:
- Correlate some variables to check how significant they are for the estimate
- E.g., the time of load balancing of HTTP Requests has an influence of X%

Regression:
- Come up with a function that also gives point and interval values

Hypothesis testing

on-going work
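The correlation and regression steps above can be illustrated with a small, dependency-free Python sketch. It is not the authors' method. It assumes that the quantity to estimate is the sum of the four measured components, uses the Pearson coefficient as the correlation measure, and fits a one-variable least-squares line with a textbook prediction interval. The five data rows come from the measurement table shown earlier in the deck; all function and variable names are hypothetical.

```python
import math

# Sample rows from the measurement table shown earlier in the deck:
# (load balancing, application, DB connection, SELECT on users table)
ROWS = [
    (0.012, 0.123, 0.050, 1.150),
    (0.056, 0.100, 0.073, 1.012),
    (0.023, 0.223, 0.098, 1.344),
    (0.028, 0.145, 0.012, 0.983),
    (0.043, 0.245, 0.033, 0.974),
]
NAMES = ["load_balancing", "application", "db_connection", "select_users"]

# Assumption: the value to estimate is the sum of the four components.
totals = [sum(row) for row in ROWS]


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def predict_with_interval(xs, ys, x0, t_crit):
    """Point estimate and prediction interval at x0.

    t_crit is the Student-t quantile for n - 2 degrees of freedom,
    supplied by the caller (3.182 for a 95% interval when n = 5).
    """
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    mx = mean(xs)
    sxx = sum((x - mx) ** 2 for x in xs)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    point = intercept + slope * x0
    half = t_crit * s * math.sqrt(1 + 1 / n + (x0 - mx) ** 2 / sxx)
    return point - half, point, point + half


columns = list(zip(*ROWS))
correlations = {name: pearson(col, totals) for name, col in zip(NAMES, columns)}
select_times = list(columns[3])
low, point, high = predict_with_interval(select_times, totals, x0=1.0, t_crit=3.182)

for name, r in correlations.items():
    print(f"{name}: r = {r:+.2f}")
print(f"estimate at SELECT time 1.0: {point:.3f} (interval {low:.3f} to {high:.3f})")
```

With only five samples the interval is wide, so the numbers are useful as a shape of the calculation rather than as a result. A real estimate would need far more historical data and most likely several predictors.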

Page 23: Aims2011 SLA support system

SLACC Process

[Figure: SLACC Decision Support System with the stages Input Designer, Estimate, Analysis; actor: Cloud Operator]

How much a variable, e.g., “Time that the Load Balancing distributes the R”, influences the regression

How much of “tolerance” can be added to the regression (in order to place an SLA parameter “offer”)

Analyze patterns in the measured/observed data that can indicate any possible cause
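The “tolerance” question above can be illustrated with a short Python sketch that relaxes an estimate by a relative margin and checks whether the resulting offer still satisfies the customer's requirement. The slides do not define how tolerance is applied, so the multiplicative margin, the function names `sla_offer` and `acceptable`, and the 0.80 estimate are all hypothetical.

```python
def sla_offer(estimate: float, tolerance: float) -> float:
    """Relax an estimate of a "lower is better" parameter by a relative margin.

    estimate  -- estimated value, e.g. a response time
    tolerance -- relative safety margin, e.g. 0.10 for 10%
    """
    if estimate < 0 or tolerance < 0:
        raise ValueError("estimate and tolerance must be non-negative")
    return estimate * (1.0 + tolerance)


def acceptable(offer: float, requirement: float) -> bool:
    """True if the offer still satisfies the Cloud Customer's requirement."""
    return offer <= requirement


estimate = 0.80      # hypothetical estimated response time
requirement = 1.0    # the customer's limit from the CRM example

for tolerance in (0.05, 0.10, 0.20, 0.30):
    offer = sla_offer(estimate, tolerance)
    print(f"tolerance {tolerance:.0%}: offer {offer:.2f}, "
          f"acceptable: {acceptable(offer, requirement)}")
```

A larger tolerance lowers the provider's risk of violating the SLA but makes the offer less attractive, which is the trade-off the analysis step is meant to support.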

Page 24: Aims2011 SLA support system

System Architecture

[Figure: SLACC system architecture diagram. CP: Cloud Provider]

Page 25: Aims2011 SLA support system

Summary

Estimate SLA parameters in order to evaluate what Cloud Providers will be able to offer/accept as SLOs or KPIs
– Analyzing historical data, current information about IT infrastructure, and considering possible changes

SLACC, Decision Support System
– It aims to be part of the system without interfering in the current Cloud IT architecture
– Work with typical Cloud Computing performance parameters
– Service-Oriented