
An Integrated Framework for Parameter-based Optimization of Scientific Workflows


Data analysis processes in scientific applications can be expressed as coarse-grain workflows of complex data processing operations with data flow dependencies between them. Performance optimization of these workflows can be viewed as a search for a set of optimal values in a multi-dimensional parameter space. While some performance parameters, such as the grouping of workflow components and their mapping to machines, do not affect the accuracy of the output, others may dictate trading the output quality of individual components (and of the whole workflow) for performance. This paper describes an integrated framework that supports performance optimizations along multiple dimensions of the parameter space. Using two real-world applications in the spatial data analysis domain, we present an experimental evaluation of the proposed framework.
Transcript
Page 1: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Vijay S Kumar, P. Sadayappan
G. Mehta, K. Vahi, V. Ratnakar, Jihie Kim, Ewa Deelman, Yolanda Gil
Tahsin Kurc, Joel Saltz
Mary Hall

HPDC 2009, 15 June 2009

Page 2: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Motivations

• Performance of data analysis applications is influenced by parameters
  – optimization = a search for optimal values in a multi-dimensional parameter space
• A systematic approach to:
  – enable the tuning of performance parameters (i.e., select optimal parameter values given an application execution context)
  – support optimizations arising from performance-quality trade-offs

Page 3: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Contributions of this paper

• No auto-tuning yet (work in progress)
• Core framework that can:
  – support workflow execution (with application-level QoS) in distributed heterogeneous environments
  – enable manual tuning of multiple parameters simultaneously
  – allow application developers and users to express applications semantically
  – leverage semantic descriptions to achieve performance optimizations
    • customized data-driven scheduling within Condor

Page 4: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Application characteristics

• Workflows: Directed Acyclic Graphs with well-defined data flow dependencies
  – mix of sequential, pleasingly parallelizable, and complex parallel components
  – flexible execution in distributed environments
• Multidimensional data analysis
  – data partitioned into chunks for analysis
  – dataset elements bear spatial relationships, constraints
  – data has an inherent notion of quality → applications can trade accuracy of analysis output for performance
• End-user queries supplemented with application-level QoS requirements

Page 5: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Application scenario 1: No quality trade-offs

• Minimize makespan while preserving the highest output quality
• Scale execution to handle terabyte-sized image data

Page 6: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Application scenario 2: Trade quality for performance

• Support queries with application-level QoS requirements
  – "Minimize time to classify image regions with 60% accuracy"
  – "Maximize classification accuracy of overall image within 30 minutes"

Page 7: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Performance optimization decisions

View each decision as a parameter that can be tuned (a toy sketch of the resulting parameter space follows this list):

• What algorithm to use for this component?
• What data-chunking strategy to adopt?
• Where to map each workflow component?
• Which components to merge into meta-components?
• What is the quality of input data to this component?
• What is the processing order of the chunks?
• Which components need to perform at lower accuracy levels?
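To make this concrete, the following minimal Python sketch models a toy version of such a multi-dimensional parameter space and searches it exhaustively. All parameter names and the cost model are illustrative assumptions, not the paper's actual framework.

    # Hypothetical sketch: the tunable decisions above, modeled as a small
    # discrete parameter space that an optimizer (or a person) can search.
    from itertools import product

    PARAMETER_SPACE = {
        "algorithm_variant": ["fast_approx", "exact"],          # assumed names
        "chunk_size": [(512, 512), (1024, 1024), (2048, 2048)],
        "placement": ["cluster_A", "cluster_B"],
        "resolution_level": [1, 2, 4],       # quality-trading parameter
        "chunk_order": ["fifo", "priority"], # quality-trading parameter
    }

    def estimated_cost(config):
        """Toy cost model standing in for measured makespan."""
        w, h = config["chunk_size"]
        cost = (w * h) / 1e6
        if config["algorithm_variant"] == "exact":
            cost *= 3.0                      # exact variant is slower
        cost /= config["resolution_level"]   # coarser data -> faster
        if config["chunk_order"] == "priority":
            cost *= 0.9                      # better ordering helps a bit
        return cost

    keys = list(PARAMETER_SPACE)
    best = min((dict(zip(keys, vals)) for vals in product(*PARAMETER_SPACE.values())),
               key=estimated_cost)
    print(best)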

Page 8: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Conventional Approach

Application workflow: workflow design + datasets

• Workflow Description (semantic representation)
  – component discovery
  – workflow composition
  – workflow validation
• Workflow Execution
  – clusters, the Grid, or SOA
  – task-based / services-based
  – batch mode / interactive

Page 9: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Proposed approach: extensions

Application workflow: workflow design + datasets + metadata
Analysis requests, queries with QoS: "Maximize accuracy within t time units"

• Description module (semantic representation)
  – search for components
  – workflow composition
  – workflow validation
  – performance parameters
• Trade-off module
  – map high-level queries to low-level execution strategies
  – select appropriate values for performance parameters
• Execution module (hierarchical execution)
  – map workflow components onto Grid sites
  – fine-grain dataflow execution of components on clusters

Page 10: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

An instance of our proposed framework

• Description module: WINGS (Workflow INstance Generation and Selection)
• Execution module: Pegasus WMS, DataCutter, Condor, DAGMan
• Trade-off module: interacts with the description and execution modules through the performance parameters

Page 11: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Description Module: WINGS (Workflow Instance Generation and Selection)

• Layered workflow refinement: a conceptual workflow sketch is refined in three stages before reaching the Execution module (a toy sketch of this refinement follows)
  (1) Workflow Template: abstract description; dataset-independent; resource-independent
  (2) Compact workflow instance: contains mappings to actual datasets; resource-independent
  (3) Expanded workflow instance: handed to the Execution module
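The layered refinement can be pictured with a few data classes. This is an illustrative toy under stated assumptions, not the WINGS API; the expansion rule (one task per component-chunk pair) is an assumption.

    # Illustrative sketch of layered workflow refinement (NOT the WINGS API).
    from dataclasses import dataclass, field

    @dataclass
    class WorkflowTemplate:
        """(1) Abstract description: dataset- and resource-independent."""
        components: list

    @dataclass
    class CompactInstance:
        """(2) Template bound to actual datasets, still resource-independent."""
        template: WorkflowTemplate
        input_datasets: list

    @dataclass
    class ExpandedInstance:
        """(3) Fully enumerated tasks, ready for the execution module."""
        tasks: list = field(default_factory=list)

    def expand(instance: CompactInstance, n_chunks: int) -> ExpandedInstance:
        # Toy expansion rule: one task per (component, chunk) pair.
        expanded = ExpandedInstance()
        for comp in instance.template.components:
            for chunk_id in range(n_chunks):
                expanded.tasks.append((comp, chunk_id))
        return expanded

    template = WorkflowTemplate(["partition", "analyze", "stitch"])
    compact = CompactInstance(template, ["Image1"])
    print(len(expand(compact, n_chunks=8).tasks))  # 24 tasks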

Page 12: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Extensions to WINGS data ontology

[Ontology diagram: the "core" data ontology (File, Collection, CollOfCollections, with properties such as hasCreationMetadata, hasFormatMetadata, hasDescription, hasContentTemplate, hasFileType, hasN_items, hasFiles) is extended with application-specific classes for multidimensional data analysis: ChunkFile (hasNXtiles, hasNYtiles, hasChunksizeX, hasChunksizeY, hasChunkIDX, hasChunkIDY, hasChunkIDZ, hasOverlap), SliceFile and StackFile (hasStartZ, hasEndZ, hasSliceIDZ, hasNXChunks, hasNYChunks), and derived types such as Chunk, ProjectedChunk, NormalizedChunk, StitchedChunk, Stack, Slice, ProjectedSlice, NormalizedSlice.]

• Relations between entities, constraints on metadata
• Automatic description, naming of intermediate data products (a sketch of these extensions as data classes follows)
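As a rough illustration, the extended ontology classes can be read as typed records. The property names come from the slide; their grouping and types here are assumptions, not the actual ontology encoding.

    # Sketch of the extended data ontology as Python data classes.
    # Property names are from the slide; types/grouping are assumed.
    from dataclasses import dataclass

    @dataclass
    class File:                      # "core" ontology class
        hasCreationMetadata: str
        hasFormatMetadata: str
        hasDescription: str

    @dataclass
    class ChunkFile(File):           # application-specific extension
        hasChunkIDX: int
        hasChunkIDY: int
        hasChunkIDZ: int
        hasChunksizeX: int
        hasChunksizeY: int
        hasOverlap: int

    @dataclass
    class SliceFile(File):
        hasSliceIDZ: int
        hasNXChunks: int
        hasNYChunks: int

    chunk = ChunkFile("2009-06-15", "TIFF", "normalized chunk",
                      hasChunkIDX=3, hasChunkIDY=7, hasChunkIDZ=0,
                      hasChunksizeX=1024, hasChunksizeY=1024, hasOverlap=16)
    print(chunk.hasChunkIDX, chunk.hasChunkIDY)

Metadata of this kind on each intermediate product is what enables the automatic naming that the customized scheduler later exploits (page 17).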

Page 13: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Execution Module

Pegasus WMS (http://pegasus.isi.edu)
• Coarse-grain mapping of workflow tasks onto Grid sites
• Submits sub-workflows to DAG schedulers at each site
• Automatic data transfer between sites (via GridFTP)

DataCutter (http://datacutter.osu.edu)
• Fine-grain mapping of components onto clusters
• Filter-stream model, asynchronous delivery (illustrated in the sketch below)
• Each filter executes as a thread (could be C++/Java/Python)
• Pipelined dataflow execution: combined task- and data-parallelism
• MPI-based version (http://bmi.osu.edu/~rutt/dcmpi)

Condor (www.cs.wisc.edu/condor) can now execute DataCutter jobs within its "parallel universe"
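The filter-stream idea can be illustrated with plain Python threads and queues: each filter runs as a thread and asynchronously delivers results downstream, so chunks flow through the pipeline. This is a minimal sketch of the model, not DataCutter's actual API.

    # Minimal filter-stream illustration (NOT DataCutter's API): each
    # filter is a thread that consumes from an input queue and delivers
    # results asynchronously to the next stage.
    import threading, queue

    STOP = object()  # sentinel marking end-of-stream

    def filter_thread(work, inq, outq):
        while True:
            item = inq.get()
            if item is STOP:
                if outq is not None:
                    outq.put(STOP)   # propagate end-of-stream downstream
                break
            result = work(item)
            if outq is not None:
                outq.put(result)

    q1, q2 = queue.Queue(), queue.Queue()
    stages = [
        threading.Thread(target=filter_thread,
                         args=(lambda c: c * 2, q1, q2)),                   # "normalize"
        threading.Thread(target=filter_thread,
                         args=(lambda c: print("chunk ->", c), q2, None)),  # "classify"
    ]
    for t in stages:
        t.start()
    for chunk in range(4):   # stream data chunks through the pipeline
        q1.put(chunk)
    q1.put(STOP)
    for t in stages:
        t.join()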

Page 14: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Quality-preserving parameters

[Figure: Image1 is partitioned into chunks C1, C2, C3, ..., Cn, which are processed by component instances A1, A2, A3, ..., An.]

• Data chunking strategy [W, H] (a chunking sketch follows this list)
• Algorithmic variant of a component
• Component placement
• Grouping components into meta-components
• Task-parallelism and data streaming within a meta-component
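A [W, H] chunking strategy fixes the chunk width and height used to partition each image. Below is a hedged sketch; the optional overlap mirrors the ontology's hasOverlap property, and the layout details are assumptions.

    # Sketch of a [W, H] data-chunking strategy: partition a 2-D image
    # into chunks of size chunk_w x chunk_h, optionally with overlap.
    def chunk_bounds(width, height, chunk_w, chunk_h, overlap=0):
        """Yield (x0, y0, x1, y1) pixel bounds for each chunk."""
        for y in range(0, height, chunk_h):
            for x in range(0, width, chunk_w):
                x0 = max(0, x - overlap)
                y0 = max(0, y - overlap)
                x1 = min(width, x + chunk_w + overlap)
                y1 = min(height, y + chunk_h + overlap)
                yield (x0, y0, x1, y1)

    # A 4096 x 4096 image cut into 1024 x 1024 chunks -> 16 chunks.
    print(sum(1 for _ in chunk_bounds(4096, 4096, 1024, 1024)))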

Page 15: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Quality-trading Parameters

• Data approximation
  – e.g., spatial resolution of a chunk
  – higher resolutions → greater execution times, but do not imply higher accuracy of output
• Processing order of chunks
  – the order in which data chunks are operated upon by a component collection
  – can process "favorable" chunks ahead of other chunks

Page 16: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Processing order

[Figure: component collection A processes chunks A1, A2, A3, ..., An in a chosen processing order.]

• Tasks within a component collection are treated as a batch
  – Condor executes them in FIFO order
• Implemented a priority-queue based heuristic for reordering task execution within a component collection (sketched below)
  – "favorable" chunks are processed ahead of other chunks
  – different QoS requirements change the insertion scheme
• Can the execution of the bag-of-tasks be reordered dynamically?
  – condor_prio alone is not suitable
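The following sketch shows the priority-queue idea: tasks are inserted with a key derived from the current QoS requirement, so "favorable" chunks pop first instead of following FIFO order. The two priority functions are illustrative assumptions, not the paper's exact insertion schemes.

    # Priority-queue reordering sketch; insertion keys are assumptions.
    import heapq

    def key_maximize_confidence(task):
        # QoS type 1: prefer chunks expected to add the most confidence.
        return -task["expected_confidence"]

    def key_maximize_throughput(task):
        # QoS type 2: prefer cheap chunks so more finish by the deadline.
        return task["expected_cost"]

    def reorder(tasks, priority_key):
        heap = [(priority_key(t), i, t) for i, t in enumerate(tasks)]
        heapq.heapify(heap)
        while heap:
            _, _, task = heapq.heappop(heap)
            yield task

    tasks = [
        {"chunk": "C1", "expected_confidence": 0.2, "expected_cost": 5},
        {"chunk": "C2", "expected_confidence": 0.9, "expected_cost": 9},
        {"chunk": "C3", "expected_confidence": 0.6, "expected_cost": 2},
    ]
    print([t["chunk"] for t in reorder(tasks, key_maximize_confidence)])  # C2, C3, C1
    print([t["chunk"] for t in reorder(tasks, key_maximize_throughput)])  # C3, C1, C2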

Page 17: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Customized scheduling in Condor

[Figure: a priority queue (PQ) reorders the processing of chunks A1, A2, A3, ..., An.]

• Customized job scheduling within Condor to support performance-quality trade-offs for application-level quality-of-service (QoS)
  – implements the priority-queue scheme (overrides the FIFO scheme)
  – executes within Condor's "scheduler" universe
• Associates tasks with the spatial coordinates of the respective chunks being processed
  – uses the automated naming of data products (metadata propagation) brought about by semantic descriptions (see the parsing sketch below)
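Associating a task with its chunk's spatial coordinates can be done by parsing the automatically generated data-product name. The naming pattern below is a made-up example; the real names come from the metadata propagation of the semantic descriptions.

    # Parse spatial chunk coordinates from a data-product name.
    # The pattern "<dataset>_x<IDX>_y<IDY>_z<IDZ>" is hypothetical.
    import re

    CHUNK_NAME = re.compile(r"^(?P<dataset>\w+)_x(?P<x>\d+)_y(?P<y>\d+)_z(?P<z>\d+)$")

    def spatial_coords(product_name):
        """Return (x, y, z) chunk IDs parsed from a product name."""
        m = CHUNK_NAME.match(product_name)
        if m is None:
            raise ValueError("unrecognized product name: " + product_name)
        return int(m["x"]), int(m["y"]), int(m["z"])

    print(spatial_coords("Image1_x3_y7_z0"))  # (3, 7, 0)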

Page 18: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Experimental setup: Test bed

• RII-MEMORY
  – 64-node Linux cluster
  – dual-processor 2.4 GHz Opteron nodes
  – 8 GB RAM, 437 GB local RAID0 volume
  – Gigabit Ethernet
• RII-COMPUTE
  – 32-node Linux cluster
  – 3.6 GHz Intel Xeon processors
  – 2 GB RAM, 10 GB local disk
  – Gigabit Ethernet and InfiniBand
• Wide-area 10 Gbps connection

Page 19: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Performance Evaluation

• Focus on performance-quality trade-offs
• Neuroblastoma Classification workflow:
  – "Maximize overall confidence of classification within t time units"
  – "Maximize number of data chunks processed within t time units"
• How to tune quality-trading parameters to achieve high performance?
  – data resolution
  – processing order of chunks

Page 20: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Parameters: resolution, processing order

• 32 nodes, 21 GB image, confidence threshold = 0.25
• QoS requirement type 1: "Maximize overall classification confidence within t time units"

[Figure: custom scheduling in Condor trades quality for performance better than default scheduling for QoS requirement type 1.]

Page 21: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Parameters: resolution, processing order

• 32 nodes, 21 GB image, confidence threshold = 0.25
• QoS requirement type 2: "Maximize data chunks processed within t time units"

[Figure: custom scheduling in Condor can improve throughput for QoS requirement type 2.]

Page 22: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Conclusions

• Performance optimization for workflows: a search for values in a multidimensional parameter space
• An instance of our proposed framework allows users to manually express values for many performance parameters (simultaneously): quality-preserving and quality-trading
• Semantic representations of domain data and performance parameters can be leveraged
  – data chunking strategy and data approximation can help restructure the workflow for a given resource configuration
  – customized job scheduling within Condor can scalably support application-level QoS

Page 23: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Current and Future work

• Use semantic representations to map high-level queries onto low-level execution strategies
• Techniques to efficiently navigate the parameter space
  – assume high data cardinality and uniformity of application context over time
  – use information from sample runs to build statistical models (see the sketch below)
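One way to build such statistical models is a simple least-squares fit of execution time against a quality-trading parameter, using a handful of sample runs. The sample data and model form below are invented for illustration.

    # Fit a toy model of execution time vs. resolution level from
    # hypothetical sample runs, then predict an unsampled setting.
    import numpy as np

    samples = np.array([(1, 480.0), (2, 260.0), (4, 150.0), (8, 95.0)])

    # Model: time = a * (1 / resolution) + b
    inv_r = 1.0 / samples[:, 0]
    a, b = np.polyfit(inv_r, samples[:, 1], deg=1)

    def predicted_time(resolution_level):
        """Predict execution time for a resolution level not sampled."""
        return a / resolution_level + b

    print(round(predicted_time(3), 1))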

Page 24: An Integrated Framework for Parameter-based Optimization of Scientific Workflows

Thanks!

Vijay S Kumar ([email protected]), P. Sadayappan
Tahsin Kurc, Joel Saltz (www.cci.emory.edu)
Ewa Deelman, Yolanda Gil (www.isi.edu)