A Workflow Engine with Multi-Level Parallelism Supports

Post on 15-Jan-2016

Qifeng Huang and Yan Huang

School of Computer Science, Cardiff University

2005.9

Agenda

• Background

• SWFL Workflow Architecture

• SWFL Description Language

• SWFL Workflow Engine

• Multi-level Parallelisms in SWFL

Background: Service and Service Composition

• A service encapsulates various resources and makes them available over the network via standard interfaces and protocols

• Web/grid services are emerging as important paradigms for distributed computing

• Service composition/workflow: complex applications can be created by composing simple services

Background: GSiB

• Current efforts such as BPEL mainly focus on business process

• Demand for scientific workflows is increasing as parallel computing applications, especially grid computing applications, expand

• GSiB aims to provide a general workflow framework for both business and scientific areas, especially the latter

• The convergence of grid services and web services makes this feasible

GSiB Workflow Architecture

[Figure: GSiB architecture with three components: VSCE, Service Workflow Language (SWFL), and SWFL Workflow Engine]

• SWFL: an XML-based, graph-oriented service workflow description language

• Engine: Distributed enactment environment with multi-level parallelism support

• VSCE: Visual Service Composition Environment

SWFL: Basic Elements

Types*

FlowModel(name, isParallel, …)

Message* (name, part* …)

Variables* (name, type)

Activity* Definition of all involved activities (normal/native services, assign, if, switch, for, while, do while and catchEnd activities)

FlowModel* (name, isParallel …)

ControlLink* (Source/Port, Target/Port)

DataLink* (Source/Part, Target/Part)
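
As a rough illustration of how the elements listed above fit together, the following Python sketch models them as plain data classes. The class and field names are hypothetical and only mirror the names on the slide; they are not taken from the SWFL schema or the GSiB implementation, and the nesting (a flow model owning the other elements) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    name: str
    parts: List[str] = field(default_factory=list)


@dataclass
class Variable:
    name: str
    type: str


@dataclass
class Activity:
    name: str
    kind: str  # e.g. "normal", "native", "assign", "if", "switch", "for", "while"


@dataclass
class ControlLink:
    source: str
    target: str
    source_port: Optional[str] = None
    target_port: Optional[str] = None


@dataclass
class DataLink:
    source: str
    target: str
    source_part: Optional[str] = None
    target_part: Optional[str] = None


@dataclass
class FlowModel:
    # Assumed container: a flow model holds the other elements and may nest sub-flows.
    name: str
    is_parallel: bool = False
    messages: List[Message] = field(default_factory=list)
    variables: List[Variable] = field(default_factory=list)
    activities: List[Activity] = field(default_factory=list)
    sub_flows: List["FlowModel"] = field(default_factory=list)
    control_links: List[ControlLink] = field(default_factory=list)
    data_links: List[DataLink] = field(default_factory=list)


# A tiny flow with one "if" activity and one normal service activity.
flow = FlowModel(
    name="sample",
    activities=[Activity("ifControl", "if"), Activity("ActivityA", "normal")],
    control_links=[ControlLink("ifControl", "task2", source_port="IF")],
    data_links=[DataLink("ActivityA", "ifControl")],
)
```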

SWFL: Graph-Oriented

• In GSiB, a workflow application can be described either as a validated XML (SWFL) document or as a directed graph

• A node (activity in SWFL) can be a standard service operation, a compound structure, or an on-machine program

• An edge (data/control link in SWFL) describes the data and control dependencies among involved activities

SWFL: An Example

[Figure: example workflow graph with nodes Data Source, Activity A, IF(a/b), Activity B, Activity C and Data Sink; one branch of the IF node is labelled a>b]

……
<swfl:flow name="sample" requireParallel="false">
  <wsdl:input message="flowInput"/>
  <wsdl:output message="flowOutput"/>
  <swfl:activity>
    <swfl:if name="ifControl">…</swfl:if>
  </swfl:activity>
  <swfl:activity>
    <swfl:normal name="ActivityA">
      <swfl:performedBy>… </swfl:performedBy>
    </swfl:normal>
  </swfl:activity>
  ……
  <swfl:controlLink>
    <swfl:source name="ifControl" port="IF"/>
    <swfl:target name="task2"/>
  </swfl:controlLink>
  ……
  <swfl:dataLink target="ifControl">
    <swfl:source name="ActivityA">
      <swfl:map>…</swfl:map>
    </swfl:source>
  </swfl:dataLink>
  ……
</swfl:flow>
……

SWFL vs. BPEL

• Both can be used to build workflows which involve peer-to-peer interactions between web services

• BPEL is mainly for business processes while SWFL is mainly for scientific areas

• BPEL uses a script-oriented approach, while SWFL follows a graph-oriented approach

SWFL: Why Graph-Oriented?

• Easy to use, especially with the user-friendly VSCE: similar to a flow chart or UML model

• Flexible and dynamic service scheduling and execution

– Completely decided by the engine

– Making full use of dynamic runtime features, different strategies can be used for a flow

– Straightforward support for multi-level parallelism

VSCE: Make Complicated Things Easy

[Figure: VSCE screenshot showing the Workflow Drawing Pane]

VSCE: What is more…

• User-friendly integrated visual tool for building, executing and controlling workflows

– End users do not have to know much about workflow

• Design (draw) a flow with fun: Drag-and-drop

• Configure and initiate the execution

• Retrieve results and track runtime status

A Grid Architecture Based on workflow Engines (1)

[Figure: a single Workflow Engine with several Job Processors and Services; the workflow input is labelled SWFL and BPEL, with the BPEL fragment below shown as an example]

<invoke name="registerAuctionResults"
        partnerLink="auctionRegistrationService"
        portType="as:auctionRegistrationPT"
        operation="process"
        inputVariable="auctionData">
  <correlations>
    <correlation set="auctionIdentification"/>
  </correlations>
</invoke>
<receive name="receiveAuctionRegistrationInformation"
         partnerLink="auctionRegistrationService"
         portType="as:auctionRegistrationAnswerPT"
         operation="answer"
         variable="auctionAnswerData">
  <correlations>
    <correlation set="auctionIdentification"/>
  </correlations>
</receive>

A Grid Architecture Based on workflow Engines (2)

[Figure: several Workflow Engines, each with its own group of Job Processors, together with Services; the workflow input is again labelled SWFL and BPEL, with the same BPEL invoke/receive fragment as on the previous slide]

GSiB Workflow Processing

[Figure: processing pipeline. Labels: SWFL/MPFL Document, XML2Graph, Graph2XML, Graph2Java, Java Programs, Enactment Environment, Execution Result; three markers numbered 1, 2 and 3]

GSiB Instance: Graph Objects

• XML2Graph and Graph2Java tools

• Graph Objects

– Two kinds: data graphs and control graphs

– Straightforward format for VSCE

– Schedule strategy is decided during runtime
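
To make the idea of an XML-to-graph conversion concrete, here is a minimal Python sketch that reads control links and data links from an SWFL-like document and builds two adjacency maps, one per graph kind. It is not the XML2Graph tool itself: the namespace URI, the function name `xml_to_graphs` and the trimmed sample document are assumptions made for illustration, following the element names in the example slide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SWFL_NS = "urn:example:swfl"  # hypothetical namespace URI, not from the source

SAMPLE = f"""
<swfl:flow xmlns:swfl="{SWFL_NS}" name="sample" requireParallel="false">
  <swfl:controlLink>
    <swfl:source name="ifControl" port="IF"/>
    <swfl:target name="task2"/>
  </swfl:controlLink>
  <swfl:dataLink target="ifControl">
    <swfl:source name="ActivityA"/>
  </swfl:dataLink>
</swfl:flow>
"""


def xml_to_graphs(xml_text):
    """Return (control_graph, data_graph) as dicts: source name -> list of targets."""
    ns = {"swfl": SWFL_NS}
    root = ET.fromstring(xml_text.strip())
    control = defaultdict(list)
    data = defaultdict(list)
    # Control links carry explicit source and target child elements.
    for link in root.findall("swfl:controlLink", ns):
        src = link.find("swfl:source", ns).get("name")
        dst = link.find("swfl:target", ns).get("name")
        control[src].append(dst)
    # Data links name the target as an attribute and list one or more sources.
    for link in root.findall("swfl:dataLink", ns):
        dst = link.get("target")
        for src in link.findall("swfl:source", ns):
            data[src.get("name")].append(dst)
    return dict(control), dict(data)


control_graph, data_graph = xml_to_graphs(SAMPLE)
```

Keeping the two graphs separate matches the slide's distinction between data graphs and control graphs, and leaves the scheduling decision to whatever consumes them at runtime.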

Engine: Architecture

[Figure: engine architecture. Labels: Gateway, Job Processor, Storage, Scheduler, UDDI, VSCE, Engine]

Engine: Components

• Gateway: a web service that provides the entry point for submitting jobs and retrieving results and runtime status; three job formats are accepted

• Job Processor: computing resources composed of a pool of worker threads

• Scheduler: provides a dynamic service execution strategy at runtime

• Storage: provides space as well as API for objects, results and status information

Engine: Multi-Level Parallelisms

• Service-level

• Flow-Level

• Message-Passing

• Parallelism in BPEL: explicitly described in the script

Service-Level Parallelism

• An activity is ready when all its input data are ready and all activities it has control dependencies on are complete

• Several activities may be ready at the same time; these can be executed in parallel

• Greedy algorithm: execute an activity as soon as it is ready; may waste storage and computing resources; not always optimal

• Question: how to schedule services?
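
The greedy strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SWFL scheduler: the function name `run_greedy`, the use of a thread pool as the job processor, and the example dependency graph are all hypothetical, and data and control dependencies are merged into a single predecessor set per activity.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_greedy(activities, deps, max_workers=4):
    """Run each activity as soon as all of its predecessors are complete.

    activities: dict mapping activity name -> zero-argument callable
    deps: dict mapping activity name -> set of predecessor names
    Returns the order in which activities completed.
    """
    done, order = set(), []
    pending = {}  # future -> activity name
    remaining = set(activities)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or pending:
            # An activity is ready when every predecessor has finished.
            ready = [a for a in sorted(remaining) if deps.get(a, set()) <= done]
            for name in ready:
                remaining.discard(name)
                pending[pool.submit(activities[name])] = name
            if not pending:
                raise ValueError("dependency cycle or unknown predecessor")
            finished, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for fut in finished:
                fut.result()  # re-raise any exception from the activity
                name = pending.pop(fut)
                done.add(name)
                order.append(name)
    return order


# Hypothetical six-activity graph: A feeds B and C, B feeds D, C feeds E, D and E feed F.
deps = {"B": {"A"}, "C": {"A"}, "D": {"B"}, "E": {"C"}, "F": {"D", "E"}}
activities = {name: (lambda: None) for name in "ABCDEF"}
order = run_greedy(activities, deps)
```

Here B and C become ready together once A completes and are submitted in the same round, which is the service-level parallelism the slide refers to. The sketch makes no attempt to limit storage or compute use, which is the drawback the slide attributes to the greedy approach.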

Flow-Level Parallelism: An Example

[Figure: a workflow graph with activities A, B, C, D, E and F is partitioned into sub-flows that run as Process 1 and Process 2]

Flow-Level Parallelism (2)

• Decentralized orchestration of services: divide a workflow into several sub-flows, to be run by several job processors in parallel

• Two kinds: independent connected graphs; partitions of a connected graph

• Benefits of parallelism: quick response; high throughput; scalability

• Additional complexities: flow partition; coordination of distributed execution
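
For the first of the two kinds, independent connected graphs, the sub-flows can be found with an ordinary connected-components search. The Python sketch below shows that case only; the function name `independent_subflows` and the example graph are hypothetical, and partitioning a single connected graph would need a partitioning strategy that the slides do not describe.

```python
def independent_subflows(nodes, edges):
    """Group activities into sub-flows that share no data or control links.

    nodes: iterable of activity names
    edges: iterable of (source, target) links; direction is ignored here
    """
    nodes = list(nodes)
    adjacent = {n: set() for n in nodes}
    for src, dst in edges:
        adjacent[src].add(dst)
        adjacent[dst].add(src)

    seen, subflows = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search collects everything reachable from this activity.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in sorted(adjacent[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        subflows.append(sorted(component))
    return subflows


# Hypothetical workflow: A-B-C form one chain, D-E another, F stands alone.
parts = independent_subflows("ABCDEF", [("A", "B"), ("B", "C"), ("D", "E")])
```

Each returned group could then be handed to a different job processor, since no link crosses between groups and no coordination is needed.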

Message-Passing Parallelism: Background and MPFL

• Parallelism in SWFL is suitable for applications with forms of parallelism that can be displayed in a workflow graph

• Most scientific applications exhibit more sophisticated forms of parallelism, such as message passing, which is common in such applications

• MPFL: extends the SWFL flow model to support applications with message-passing

Message-Passing Parallelism: An Example

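[Figure: MPFL model of a workflow application. Flow Model 1 is used for process 0 and Flow Model 2 for processes with rank larger than 0. Process 0 is shown with activities A, B, C and D; Process 1, Process 2, Process 3 up to Process N-1 are each shown with activities B and C]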

MPFL Model

• Multi-layer heterogeneous communication domains are supported

• An instance is usually run on a cluster: parallelism comparable to that of a standard MPI program can be achieved

• Engine: an incremental extension of the SWFL engine; still a work in progress
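
The rank-based choice between flow models can be illustrated with a small Python simulation. This is a sketch under heavy assumptions and not MPFL or its engine: threads and queues stand in for cluster processes and message passing, the function name `run_mpfl_sketch` is hypothetical, and the roles given to activities A (distribute), B and C (compute) and D (collect) are invented for the example.

```python
import queue
import threading


def run_mpfl_sketch(n_procs, data):
    """Simulate n_procs ranks: rank 0 runs flow model 1, the others flow model 2."""
    inboxes = [queue.Queue() for _ in range(n_procs)]  # one inbox per rank
    results = queue.Queue()                            # partial results sent to rank 0

    def compute(chunk):
        # Stand-in for activities B and C: a sum of squares over the chunk.
        return sum(x * x for x in chunk)

    def flow_model_1():
        # Assumed activity A: split the data and send one chunk to every other rank.
        chunks = [data[r::n_procs] for r in range(n_procs)]
        for rank in range(1, n_procs):
            inboxes[rank].put(chunks[rank])
        total = compute(chunks[0])
        # Assumed activity D: receive and combine the partial results.
        for _ in range(n_procs - 1):
            total += results.get()
        return total

    def flow_model_2(rank):
        chunk = inboxes[rank].get()  # receive from rank 0
        results.put(compute(chunk))  # send the partial result back

    workers = [
        threading.Thread(target=flow_model_2, args=(rank,))
        for rank in range(1, n_procs)
    ]
    for worker in workers:
        worker.start()
    total = flow_model_1()
    for worker in workers:
        worker.join()
    return total


total = run_mpfl_sketch(4, list(range(10)))
```

On a real cluster the queues would be replaced by the message-passing layer, but the structure is the same: every rank runs the same program and selects its flow model from its rank.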

[Figure: message-passing parallelism. Labels: Workflow Engine, Job Processors, Services, Cluster, MPFL]

Conclusions

• The workflow framework in GSiB is grid-oriented and suitable for both business and scientific applications composed of web/grid services

• Graph-based SWFL provides considerable flexibility for both end users and engine implementation

• VSCE provides a visual tool to build and execute workflow applications

• The SWFL engine provides an automatic and self-organizing enactment environment for the processing of workflow applications

• Better performance is achieved with the support of multi-level parallelism in the SWFL engine
