General and Effective Monetary Optimizations for Workflows in IaaS Clouds

Post on 24-Feb-2016


1

General and Effective Monetary Optimizations for Workflows in IaaS Clouds

presented by

Amelie Chi Zhou (amelie.czhou@gmail.com)

Xtra Computing Group, http://pdcc.ntu.edu.sg/xtra
Nanyang Technological University, Singapore

2

Workflows for Scientific Applications

• Workflows are structured
  – Tasks have very different I/O and computational behavior.
• Real-world workflows
  – Montage, Ligo, Epigenomics, water-simulation
• Workflow ensembles [Malawski et al., SC’12]
  – Composition of workflows with similar structures and different parameters and priorities

[Figure: structures of the Montage, Ligo, and Epigenomics workflows]

3

Running Workflows on IaaS Clouds

• Define IaaS clouds
  – Provide fundamental computing resources for users to provision
  – Examples: Amazon EC2, Rackspace, OpenStack, Google Compute Engine, …
• Example projects
  – Montage, Broadband, Epigenomics on Amazon EC2 [Juve et al., eScience’09]
  – Astronomy applications on Nimbus, Eucalyptus, and EC2 [Vöckler et al., ScienceCloud’11]
  – …

4

Workflows in IaaS Clouds

• Features of IaaS clouds
  – Pay as you go (e.g., hourly pricing scheme)
  – Rich and evolving cloud offerings
• Research problems
  – Monetary cost optimizations
  – Performance optimizations
  – Elasticity
  – Fault tolerance
  – …

Are the current solutions ideal/sufficient?

5

Monetary Cost Opportunities

• Instance types
  – Amazon EC2 provides 29 types of instances
• Instance reuse
  – Hourly charging scheme
• Pricing schemes
  – On-demand, spot and reserved pricing

vs.

• Tasks can have very different I/O and computational behavior.
• Workflows have different deadline and monetary constraints.
• Users may have various workflow application scenarios.

6

Current Solutions are Far From Ideal

• Problems of current approaches
  – Auto-scaling [Mao et al., SC’11] resource management
    • More effective optimizations: 29% less cost
  – Assume static cloud performance and pricing
    • Cloud dynamics + spot instances: 73% less cost
  – Heuristic-based cost and performance optimizations are specific.
    • They are likely to be suboptimal in evolving and diversified workflow applications.

7

Our Research Efforts

• Effectiveness
  – Dyna: Minimize the monetary cost of workflows, addressing both the price and performance dynamics in clouds
• Generality
  – ToF: Define transformation operations to model common cost and performance optimizations
  – Deco: Design a declarative language called WLog to specify various workflow optimization problems

The focus of this presentation.

8

Overall Design

• We design general workflow optimization frameworks to fully explore the optimization opportunities that lie in workflows

[Figure: layered architecture with a problem specification layer, an optimization layer, and an execution layer. Labeled components: WLog programs (Deco) and the transformation-based optimizer (ToF).]

9

Outline

• Related Work
• Generalized Optimization Frameworks
  – General transformations for cost and performance optimizations
  – A declarative language for workflow optimization problems
• Conclusions

10

Related Work

• Performance and monetary cost optimization heuristics
  – Auto-scaling [Mao et al., SC’11]
• Fixed sequence of workflow optimizations
  – Workflow scheduling with performance and cost constraints [Kllapi et al., SIGMOD’11]
    • Consider only one on-demand instance type

The heuristics are designed for specific optimization problems, and the optimization opportunities are not fully explored.

11

Related Work (cont’d)

• Generalized optimization frameworks: overhead is a problem
  – Generalized bin-ball abstraction for resource allocation [Rai et al., SoCC’12]
    • GPU acceleration
    • Not always convenient to model a problem with the bin-ball model
  – Declarative language to model a wide range of COPs [Liu et al., VLDB’12]
    • Distributed systems
    • Unaware of the special features and optimization opportunities in workflows

There is no general optimization framework for workflows.

13

ToF: A Transformation-based Optimization Framework

• Outline
  – Main contributions of this work
  – System overview
  – Design details
  – Evaluation results

14

Main Contributions

• This study has two major contributions
  – We define a series of common transformations for the performance and cost optimizations of workflows.
  – We design a light-weight optimizer to guide the transformation process.

15

Workflow Transformation

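• Definitions
  – Instance assignment graph
    • Each node represents instance configuration for a task.
    • Same structure as the workflow DAG
  – Transformation operation
    • Structural change in the instance assignment graph

[Figure: an initial instance assignment graph over tasks 0 to 3 and the graphs produced from it by transformations, in which tasks are grouped into nodes such as "1,2", "2,3", "1,3", and "1,2,3"]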
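The definitions above can be illustrated with a small Python sketch. It models an instance assignment graph as a task-to-instance-type mapping plus the DAG edges, and treats a transformation as a function that returns a structurally changed graph. The class `AssignmentGraph`, its `merge` method, the edge set, and the `"m1.small"` label are hypothetical choices made for illustration; the slides do not show ToF's actual data structures, and the same-type precondition on merging is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AssignmentGraph:
    config: dict   # task id -> instance type (placeholder names)
    edges: set     # (parent, child) pairs, same shape as the workflow DAG
    groups: list = field(default_factory=list)  # tasks that share one instance

    def __post_init__(self):
        # Initially every task has its own node (its own instance).
        if not self.groups:
            self.groups = [frozenset([t]) for t in sorted(self.config)]

    def group_of(self, task):
        return next(g for g in self.groups if task in g)

    def merge(self, a, b):
        """Return a new graph in which tasks a and b share one node."""
        if self.config[a] != self.config[b]:
            raise ValueError("assumed precondition: same instance type")
        ga, gb = self.group_of(a), self.group_of(b)
        if ga == gb:
            return self
        groups = [g for g in self.groups if g not in (ga, gb)] + [ga | gb]
        return AssignmentGraph(dict(self.config), set(self.edges), groups)


# Hypothetical four-task workflow; the original graph is left unchanged.
g = AssignmentGraph({t: "m1.small" for t in range(4)}, {(0, 1), (0, 2), (0, 3)})
g12 = g.merge(1, 2)
g123 = g12.merge(2, 3)
print(sorted(sorted(grp) for grp in g123.groups))  # [[0], [1, 2, 3]]
```

Returning a new graph rather than mutating in place keeps every intermediate plan available, which is convenient when an optimizer needs to compare candidates.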

16

System Overview

• Design ideas
  – Two types of transformations
    • Main schemes: reduce cost
    • Auxiliary schemes: help main schemes to reduce cost
  – Use cost model to guide the transformation optimization
  – Periodical batch optimization
    • Maximize instance sharing and reuse
    • Reduce optimizer overhead

[Figure: optimization process in one plan period. Main schemes and auxiliary schemes are applied under the guidance of the cost model; a termination check either continues the process (No) or produces the output (Yes).]

17

Design Details

• Transformation operations
  – Main schemes: Merge, Demote
  – Auxiliary schemes: Move, Promote, Split, Co-scheduling
  – Transformations can combine with each other

18

Using Transformations

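• Example of using Move and Merge operations

[Figure: a plan before and after applying Move and Merge, annotated with the charging hours of each plan. Annotations: "Only transform shape" and "Reduces cost".]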
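A minimal Python sketch of this idea follows, combined with the cost-model-guided loop from the System Overview. It rests on several assumptions that the slides do not state: tasks are independent busy intervals in minutes, each instance is billed per started hour, Merge places two non-overlapping intervals on one instance, and Move only delays an interval. The function names, the 80-minute deadline, and all numbers are invented for illustration.

```python
import math
from itertools import combinations


def hours(plan):
    """Cost model: bill every instance per started hour of its busy interval."""
    return sum(math.ceil((end - start) / 60) for start, end in plan)


def merge(plan):
    """Main scheme (assumed semantics): run two non-overlapping intervals on one instance."""
    for a, b in combinations(sorted(plan), 2):
        if a[1] <= b[0]:
            rest = list(plan)
            rest.remove(a)
            rest.remove(b)
            yield tuple(sorted(rest + [(a[0], b[1])]))


def move(plan, deadline=80, step=10):
    """Auxiliary scheme (assumed semantics): delay one interval; cost is unchanged."""
    for i, (s, e) in enumerate(plan):
        for shift in range(step, deadline, step):
            if e + shift <= deadline:
                yield tuple(sorted(plan[:i] + ((s + shift, e + shift),) + plan[i + 1:]))


def optimize(plan, main=(merge,), aux=(move,), max_rounds=50):
    best = plan
    for _ in range(max_rounds):
        # Try the main schemes first; keep only strict cost reductions.
        better = [c for m in main for c in m(best) if hours(c) < hours(best)]
        if not better:
            # Otherwise reshape with an auxiliary scheme, then retry the main schemes.
            better = [c for a in aux for mid in a(best)
                      for m in main for c in m(mid) if hours(c) < hours(best)]
        if not better:
            break  # termination: no transformation lowers the cost
        best = min(better, key=hours)
    return best


start = ((0, 20), (50, 80))   # two instances, one charged hour each
result = optimize(start)
print(hours(start), hours(result))  # 2 1
```

In this toy case Merge alone does not help, because the merged interval (0, 80) still spans two charged hours. Delaying the first task closes the idle gap, after which Merge brings the bill down to one hour.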

19

Experimental Setup

• Workload
  – Montage, Ligo and Mixed
  – Workflow submission rate follows Poisson distribution
• Comparisons
  – ToF
  – Baseline: only implement the initial instance configuration
  – Auto-scaling [Mao et al., SC’11]
  – Greedy: randomly select the transformation during optimization
• All results are normalized to Baseline
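For readers who want to reproduce a similar workload, the sketch below generates Poisson workflow submissions and groups them by plan period, as the periodic batch optimization would see them. The arrival rate, horizon, period length, and function names are hypothetical; the slides do not report the values used in the experiments.

```python
import random


def poisson_arrivals(rate_per_hour, horizon_hours, seed=0):
    """Arrival times (hours) of a Poisson process: exponential gaps with mean 1/rate."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate_per_hour)
        if t >= horizon_hours:
            return times
        times.append(t)


def batch_by_period(times, period_hours):
    """Group submissions by the index of the plan period in which they arrive."""
    batches = {}
    for t in times:
        batches.setdefault(int(t // period_hours), []).append(t)
    return batches


arrivals = poisson_arrivals(rate_per_hour=6.0, horizon_hours=10.0)
batches = batch_by_period(arrivals, period_hours=0.25)
```

Each entry of `batches` is the set of workflows that one optimization round would handle together.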

20

Evaluation Results on Cost Optimizations

Optimization results under the pricing scheme of Amazon EC2.

ToF obtains the lowest monetary cost on all workflows.
• Over Auto-scaling by 29%
• Over Baseline by 27%
• Over Greedy by 17%

[Figure: cost optimization results; annotated reductions: 29% and 17%, 21% and 16%, 28% and 15%]

21

Evaluation Results on Performance Optimizations

Performance optimization results.

ToF obtains the lowest average execution time on all workflows.
• Over Auto-scaling by 21%
• Over Baseline by 21%
• Over Greedy by 18%

[Figure: performance optimization results; annotated reductions: 21%, 18%, 21%, 8%, 16%, 12%]

23

Deco: A Declarative Optimization Framework

• Outline
  – Main contributions of this work
  – System overview
  – A declarative language for workflows
  – GPU-accelerated search engine
  – Evaluation results

24

Main Contributions

• This work has three main contributions
  – A declarative language for resource provisioning of scientific workflows in IaaS clouds
  – A generalized optimization framework to serve a wide range of optimization problems
  – Fast GPU-based implementation for low optimization overhead

25

Motivating Ideas

• Why declarative language?
  – Declarative languages like HTML, SQL, Prolog
  – Concise and clear
  – Focus on what to do rather than how to do it
• Why GPU acceleration?
  – Generic search has large runtime overhead
  – Monte Carlo method is used for probabilistic approximation [Raedt et al. 2007], which is suitable for GPU acceleration
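As a rough illustration of why Monte Carlo approximation parallelizes well, the Python sketch below estimates the probability that a workflow finishes within a deadline by sampling task execution times. Every sample is independent of the others, which is the property that makes the method a natural fit for one-thread-per-sample GPU execution. The DAG, the uniform runtime distribution, and all names are assumptions made for this sketch, not details of Deco.

```python
import random

# Hypothetical diamond-shaped workflow: 0 -> {1, 2} -> 3, mean runtimes in hours.
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3)]
ORDER = [0, 1, 2, 3]                        # a topological order of the tasks
MEAN = {0: 1.0, 1: 2.0, 2: 3.0, 3: 1.0}


def makespan(exetime):
    """Length of the longest path through the DAG for one sampled set of runtimes."""
    finish = {}
    for t in ORDER:
        start = max((finish[p] for p, c in EDGES if c == t), default=0.0)
        finish[t] = start + exetime[t]
    return max(finish.values())


def prob_meets_deadline(deadline, spread=0.2, samples=2000, seed=1):
    """Monte Carlo estimate of P(makespan <= deadline)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):   # each iteration is independent of all others
        exetime = {t: rng.uniform(m * (1 - spread), m * (1 + spread))
                   for t, m in MEAN.items()}
        hits += makespan(exetime) <= deadline
    return hits / samples


p = prob_meets_deadline(deadline=5.0)
```

A probabilistic deadline such as "95% of runs finish within the deadline" can then be checked by comparing the estimate against 0.95.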

26

System Overview

• Overview of the Deco system
  – WLog, a declarative language for workflows
  – GPU-accelerated search engine

27

WLog – A Declarative Language for Workflows

• WLog is designed based on Prolog
• A WLog program describing a workflow scheduling problem

goal minimize Ct in totalcost(Ct).
cons deadline(95%, 10h).
var configs(Tid, Vid) forall task(Tid) and Vm(Vid).

r1 import(amazonec2).
r2 import(montage).
r3 path(X,Y,Y,C) :- edge(X,Y), exetime(X,Vid,T), C is T.
r4 path(X,Y,Z,C) :- edge(X,Z), Z\==Y, path(Z,Y,Z2,C1), exetime(X,Vid,T), C is T+C1.
r5 maxtime(Path,T) :- setof([Z,C],path(root,tail,Z,C),Set), max(Set,[Path,T]).
r6 cost(Tid,Vid,C) :- price(Vid,Up), exetime(Tid,Vid,T), C is ceil(T/60.0)*Up.
r7 totalcost(Ct) :- findall(C,cost(Tid,Vid,C),Bag), sum(Bag,Ct).

Problem-specific keywords:
• goal: Optimization goal defined by the user.
• cons: Problem constraint defined by the user.
• var: Problem variable to be optimized.

Built-in predicates:
• deadline(P, D): A probabilistic deadline requirement that the P-th percentile of the workflow execution time is no larger than D.
• import(cloud): Import the cloud-related facts from the cloud metadata.
• import(daxfile): Import the workflow-related facts generated from a DAX file.
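To make the rules easier to follow for readers unfamiliar with Prolog, here is a hedged Python paraphrase of r3 to r7 with a brute-force search over `configs`. It is a sketch under stated assumptions, not how Deco evaluates WLog: the facts (`EDGES`, `EXETIME`, `PRICE`) are invented, `root` and `tail` are treated as zero-time virtual nodes, execution times are taken to be in minutes, and the probabilistic `deadline(95%, 10h)` is simplified to a deterministic bound.

```python
import math
from itertools import product

# Invented facts standing in for import(amazonec2) and import(montage).
EDGES = [("root", "a"), ("root", "b"), ("a", "tail"), ("b", "tail")]
EXETIME = {("a", "small"): 120.0, ("a", "large"): 50.0,   # minutes
           ("b", "small"): 90.0, ("b", "large"): 40.0}
PRICE = {"small": 0.1, "large": 0.4}                      # price per charged hour
TASKS = ["a", "b"]


def cost(task, vm):
    # r6: C is ceil(T/60.0) * Up
    return math.ceil(EXETIME[(task, vm)] / 60.0) * PRICE[vm]


def totalcost(configs):
    # r7: sum the cost of every task under its chosen instance type
    return sum(cost(task, vm) for task, vm in configs.items())


def maxtime(configs, node="root"):
    # r3 to r5: length of the longest root-to-tail path (virtual nodes add no time)
    own = EXETIME.get((node, configs.get(node)), 0.0)
    return own + max((maxtime(configs, c) for p, c in EDGES if p == node), default=0.0)


def solve(deadline_minutes):
    """goal/cons/var: pick configs minimizing totalcost subject to the deadline."""
    best = None
    for vms in product(PRICE, repeat=len(TASKS)):
        configs = dict(zip(TASKS, vms))
        if maxtime(configs) > deadline_minutes:
            continue
        if best is None or totalcost(configs) < totalcost(best):
            best = configs
    return best


print(solve(100))  # {'a': 'large', 'b': 'small'}
```

The Python version needs an explicit enumeration loop, whereas the WLog program only states the goal, the constraint, and the variable and leaves the search to the engine.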

28

GPU Accelerations

• Explore vs. exploit
  – Exploitation prioritizes partial results.
  – Exploration traverses the search tree level by level, which gives the GPU an opportunity to parallelize the search.
• Memory optimizations
  – Minimize the usage of global memory
  – Reduce accesses to shared memory
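The two search styles can be contrasted with a small CPU-side Python sketch. It is an interpretation rather than Deco's engine: the search tree is assumed to fix one task's instance type per level, `explore_levels` expands a whole level at once (each expansion is independent, so a GPU version could assign one thread per node), and `exploit_first` is a plain best-first search that always extends the cheapest partial configuration. All names and prices are hypothetical.

```python
import heapq


def explore_levels(tasks, vm_types):
    """Exploration: expand the search tree one full level at a time."""
    frontier = [()]
    for _ in tasks:
        # Every (partial, vm) pair is independent of the others, so this
        # comprehension is the part that could run in parallel.
        frontier = [partial + (vm,) for partial in frontier for vm in vm_types]
        yield frontier


def exploit_first(tasks, vm_types, partial_cost):
    """Exploitation: always extend the cheapest partial result found so far."""
    heap = [(0.0, ())]
    while heap:
        _, partial = heapq.heappop(heap)
        if len(partial) == len(tasks):
            return partial
        for vm in vm_types:
            nxt = partial + (vm,)
            heapq.heappush(heap, (partial_cost(nxt), nxt))


TASKS = ["t0", "t1", "t2"]
VMS = ["small", "large"]
PRICE = {"small": 1.0, "large": 4.0}

levels = list(explore_levels(TASKS, VMS))
cheapest = exploit_first(TASKS, VMS, lambda cfg: sum(PRICE[v] for v in cfg))
```

The level-wise version does more total work but exposes wide, regular batches, while the best-first version is inherently sequential.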

29

Evaluation Settings

• Three use cases
  – Workflow scheduling problem
  – Workflow ensemble [Malawski et al., SC’12]
    • Goal: execute more workflows with high priorities within given budget and deadline
  – Follow-the-cost: multiple workflows, multiple datacenters
• Comparison for workflow ensemble problem
  – Algorithms: Deco vs. SPSS [Malawski et al., SC’12]
  – Ensemble types: constant, Uniform (un)sorted, Pareto (un)sorted
  – Generate 5 budgets in [MinBudget, MaxBudget]
• All results are normalized to that of SPSS

30

Evaluation Results

• Under all ensemble types and budget constraints
  – Deco obtains a better score than SPSS

[Figure: scores obtained by SPSS and Deco for the different ensemble types under budgets 1 to 5 and a fixed deadline. Workflow type is Ligo.]

31

Evaluation Results (cont’d)

• Programmability of WLog in Deco (lines of code)
  – Without Deco, users (re-)implement the workflow application in C++.
  – With Deco, users implement in WLog.

Use Case              C++ Implementation   WLog
Workflow Scheduling   1950                 10
Workflow Ensemble     1960                 13
Follow-the-Cost       2230                 15

Deco allows much lower coding complexity than manual implementation.

32

GPU Accelerations

• Performance speedup of GPU implementation over CPU implementation on a single core for the three applications

[Figure: "Performance Speedup of GPUs", a bar chart over Montage, Epigenomics, and Ligo with a vertical axis from 0 to 500; bar labels: 437x, 93x, 31x]

34

Conclusions

• IaaS clouds have become an attractive platform for hosting workflows.

• Despite recent efforts in monetary cost optimizations of workflows in the cloud, there is still much room for further improvement.

• Due to the complex cloud offerings and problem specifications, we develop general optimization frameworks.

– ToF achieves up to 29% improvement over the state-of-the-art algorithm.

– Deco achieves up to 77% improvement over the state-of-the-art algorithm.

35

Future Work

• Energy-efficient Cloud
  – Reduce the investment cost of cloud provider to potentially reduce instance price with energy-efficient hardware/software
• Optimization opportunities in Multi-Cloud
  – Utilize different cloud offerings, e.g., instance types, to further reduce cost

36

References

• Maciej Malawski, Gideon Juve, Ewa Deelman, and Jarek Nabrzyski. 2012. Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. SC '12. 11 pages.
• G. Juve, E. Deelman, K. Vahi, G. Mehta, B. Berriman, B. P. Berman, and P. Maechling. 2009. Scientific workflow applications on Amazon EC2. E-Science Workshops, 9-11 Dec. 2009. 59-66.
• Jens-Sönke Vöckler, Gideon Juve, Ewa Deelman, Mats Rynge, and Bruce Berriman. 2011. Experiences using cloud computing for a scientific workflow application. ScienceCloud '11. 15-24.
• Ming Mao and Marty Humphrey. 2011. Auto-scaling to minimize cost and meet application deadlines in cloud workflows. SC '11. Article 49.
• Herald Kllapi, Eva Sitaridi, Manolis M. Tsangaris, and Yannis Ioannidis. 2011. Schedule optimization for data processing flows on the cloud. SIGMOD '11. 289-300.
• Anshul Rai, Ranjita Bhagwan, and Saikat Guha. 2012. Generalized resource allocation for the cloud. SoCC '12. Article 15, 12 pages.
• Changbin Liu, Lu Ren, Boon Thau Loo, Yun Mao, and Prithwish Basu. 2012. Cologne: a declarative distributed constraint optimization platform. Proc. VLDB Endow. 5, 8. 752-763.
• L. De Raedt, A. Kimmig, and H. Toivonen. 2007. ProbLog: A probabilistic Prolog and its application in link discovery. IJCAI 2007. 2462-2467.
• Amelie Chi Zhou and Bingsheng He. Transformation-based Monetary Cost Optimizations for Workflows in the Cloud. Accepted by TCC, Dec 2013.
• Amelie Chi Zhou and Bingsheng He. A declarative optimization framework for workflows in IaaS clouds. Submitted to SC 2014.
• Amelie Chi Zhou, Bingsheng He, and Cheng Liu. Monetary Cost Optimizations for Hosting Workflow-as-a-Service in IaaS Clouds. Submitted to ToC, 2014.

37

Thank you!

Amelie Chi Zhou (amelie.czhou@gmail.com)

Advisor: Bingsheng He (bshe@ntu.edu.sg)

Xtra Computing Group, http://pdcc.ntu.edu.sg/xtra
Nanyang Technological University, Singapore
