
Dynamic DAGMan with ClassAds

Jan 20, 2016

Transcript
Page 1: Dynamic DAGMan with ClassAds

Condor Team Member
Computer Sciences Department
University of Wisconsin-Madison

[email protected]
http://www.cs.wisc.edu/condor

Dynamic DAGMan with ClassAds

Himani Apte

Page 2: Dynamic DAGMan with ClassAds

www.cs.wisc.edu/condor

Outline

› DAGMan workflow management

› Motivation for dynamic DAGMan

› ClassAds

› Putting together: DAGMan + ClassAds

› Looking ahead

Page 3: Dynamic DAGMan with ClassAds

DAGMan

› Directed Acyclic Graph Manager

› Meta-scheduler for Condor

› DAG: set of jobs with dependencies

› Manages submission of DAG jobs

› Enforces execution order

› DAGMan itself is a Condor job!

Page 4: Dynamic DAGMan with ClassAds

Example DAG

Job A A.condor
Job B B.condor
Job C C.condor
Job D D.condor
Parent A Child B C
Parent B C Child D
Script PRE A input.sh
Script POST D output.sh

[Figure: the example DAG. A is the parent of B and C; B and C are the parents of D.]
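
The dependency rules in the example DAG can be illustrated with a small Python sketch. It is not DAGMan's implementation: `parse_dag` and `execution_waves` are hypothetical helpers that read only the Job and Parent ... Child lines (Script lines are ignored) and group nodes into waves whose parents have all finished.

```python
from collections import defaultdict

EXAMPLE_DAG = """\
Job A A.condor
Job B B.condor
Job C C.condor
Job D D.condor
Parent A Child B C
Parent B C Child D
Script PRE A input.sh
Script POST D output.sh
"""

def parse_dag(text):
    """Return (jobs, parents) from Job and Parent/Child lines; other lines are skipped."""
    jobs = {}
    parents = defaultdict(set)
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        keyword = words[0].upper()
        if keyword == "JOB":
            jobs[words[1]] = words[2]          # node name -> submit file
        elif keyword == "PARENT":
            split = [w.upper() for w in words].index("CHILD")
            for child in words[split + 1:]:
                parents[child].update(words[1:split])
    return jobs, parents

def execution_waves(jobs, parents):
    """Group nodes into waves; a node is ready once all of its parents are done."""
    done, waves = set(), []
    while len(done) < len(jobs):
        ready = sorted(j for j in jobs if j not in done and parents[j] <= done)
        if not ready:
            raise ValueError("cycle detected: not a DAG")
        waves.append(ready)
        done.update(ready)
    return waves

print(execution_waves(*parse_dag(EXAMPLE_DAG)))  # [['A'], ['B', 'C'], ['D']]
```

B and C land in the same wave because neither depends on the other, which is the parallelism the diamond shape expresses.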

Page 5: Dynamic DAGMan with ClassAds

Simplified state diagram of a DAG node

[Figure: state diagram with the node states Waiting, Pre-running, Submitted, Post-running, Done, and Failed.]
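
As a rough sketch, the node states can be modeled in Python as an enum with a transition table. The state names come from the slide; the allowed transitions in `ALLOWED` are an assumption inferred from the state names and the PRE/POST scripts, since the arrows of the original diagram are not preserved in this transcript.

```python
from enum import Enum

class NodeState(Enum):
    WAITING = "Waiting"
    PRE_RUNNING = "Pre-running"
    SUBMITTED = "Submitted"
    POST_RUNNING = "Post-running"
    DONE = "Done"
    FAILED = "Failed"

S = NodeState

# Assumed transitions (not taken from the slide): a node without a PRE script
# skips Pre-running, a node without a POST script skips Post-running, and any
# active phase may end in Failed.
ALLOWED = {
    S.WAITING: {S.PRE_RUNNING, S.SUBMITTED},
    S.PRE_RUNNING: {S.SUBMITTED, S.FAILED},
    S.SUBMITTED: {S.POST_RUNNING, S.DONE, S.FAILED},
    S.POST_RUNNING: {S.DONE, S.FAILED},
    S.DONE: set(),
    S.FAILED: set(),
}

def advance(current, target):
    """Return the new state, or raise if the transition is not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

state = S.WAITING
for target in (S.PRE_RUNNING, S.SUBMITTED, S.POST_RUNNING, S.DONE):
    state = advance(state, target)
print(state.value)  # Done
```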

Page 6: Dynamic DAGMan with ClassAds

DAGMan: important properties

› Monitors job state using Condor logs

› Simple and clean recovery model
  • Rescue DAG: saves state at failure
  • Restart: reconstruct internal state

› Scripts allow “lazy” planning

› Throttling parameters

Page 7: Dynamic DAGMan with ClassAds

Outline

› DAGMan workflow management

› Motivation for dynamic DAGMan

› ClassAds

› Putting together: DAGMan + ClassAds

› Looking ahead

Page 8: Dynamic DAGMan with ClassAds

Motivation for dynamic DAGMan

› DAG: complete execution order

› Flexibility to make run-time decisions
  • Which subset of DAG nodes should execute?
  • When should node X execute?

› Conditional DAGs
  • Associate a condition with DAG edges
  • Simplest condition: successful completion of parent nodes

Page 9: Dynamic DAGMan with ClassAds

Conditional DAG: examples

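[Figure, Example 1: node A branches on the condition A.x == true; Yes leads to B, No leads to C.]

[Figure, Example 2: node C has parents P1 and P2 and runs under the condition P1.x OR P2.x.]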

Page 10: Dynamic DAGMan with ClassAds

Motivation for dynamic DAGMan

› Scripts can be leveraged for lazy planning
  • For simple conditions
    • E.g. exit value of job
  • Modify DAG structure
    • E.g. convert branch-not-taken to no-op/empty

› We want a generic solution

› Supported by “Dynamic DAGMan”

Page 11: Dynamic DAGMan with ClassAds

Outline

› DAGMan workflow management

› Motivation for dynamic DAGMan

› ClassAds

› Putting together: DAGMan + ClassAds

› Looking ahead

Page 12: Dynamic DAGMan with ClassAds

ClassAds

› Classified advertisements

› Used extensively in Condor
  • Define jobs, machines, resources
  • Define conditions, triggers, requirements
  • Maintain internal state

Page 13: Dynamic DAGMan with ClassAds

ClassAds

› List of attribute-value pairs
  • Simple value types: integer, strings
  • Complex types: list, expressions, ClassAds

› Matchmaking framework
  • Tests match between two classAds
  • Using “Requirements” expression

› Great fit for Dynamic DAGMan
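
The matchmaking idea can be sketched in plain Python, without the real ClassAd library. Each ad is a dict of attribute-value pairs whose Requirements entry is a function of the ad itself (`my`) and the candidate (`other`); two ads match when both Requirements hold. The attribute names `Type`, `Memory` and `ImageSize` and the helper names are illustrative assumptions, and the ClassAd language's handling of undefined attributes is deliberately left out.

```python
def requirements_hold(my, other):
    """Evaluate my Requirements against other; an ad without Requirements accepts anything."""
    requirements = my.get("Requirements")
    return requirements is None or bool(requirements(my, other))

def match(ad_a, ad_b):
    """Symmetric match: each ad's Requirements must accept the other ad."""
    return requirements_hold(ad_a, ad_b) and requirements_hold(ad_b, ad_a)

# Illustrative ads (attribute names are assumptions, not a Condor schema).
job = {
    "Type": "Job",
    "ImageSize": 512,
    "Requirements": lambda my, other: other.get("Memory", 0) >= my["ImageSize"],
}
big_machine = {
    "Type": "Machine",
    "Memory": 1024,
    "Requirements": lambda my, other: other.get("Type") == "Job",
}
small_machine = dict(big_machine, Memory=256)

print(match(job, big_machine))    # True
print(match(job, small_machine))  # False
```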

Page 14: Dynamic DAGMan with ClassAds

Outline

› DAGMan workflow management

› Motivation for dynamic DAGMan

› ClassAds

› Putting together: DAGMan + ClassAds

› Looking ahead

Page 15: Dynamic DAGMan with ClassAds

Putting together: DAGMan + ClassAds

› Dynamic DAGMan research project
  • Work-in-progress
  • Not yet available in Condor

› DAG nodes have associated classAds

› Basic node attributes
  • Job identifier, name, type
  • Status (Waiting, Submitted, Done, etc.)

Page 16: Dynamic DAGMan with ClassAds

Dynamic DAGMan: attributes

› Execution characteristics of job
  • Exit value
  • Wall-clock time
  • CPU utilization (local and remote)
  • Network statistics (bytes sent / received)
  • Information about files transferred (for vanilla universe)

› Attributes maintained by Condor for a job

Page 17: Dynamic DAGMan with ClassAds

Dynamic DAGMan: conditions

› Requirements expression
  • Defines trigger condition for the node
  • Arbitrarily complex expression
  • Defined on the attributes of parent nodes

› Use matchmaking to determine if a node can be submitted

Page 18: Dynamic DAGMan with ClassAds

Dynamic DAG: example

[Figure: node A branches on the condition x == true; Yes leads to B, No leads to C.]

Job A A.condor
Job B B.condor
Job C C.condor

Parent A Child B \
    COND [ ( other.job == A && other.x == true ) ]

Parent A Child C \
    COND [ ( other.job == A && other.x == false ) ]

Page 19: Dynamic DAGMan with ClassAds

Dynamic DAGMan: example

Job P1 P1.condor
Job P2 P2.condor
Job C C.condor

Parent P1 P2 Child C \
    COND [ (other.job == P1 && other.x == true) ||
           (other.job == P2 && other.x == true) ]

[Figure: node C has parents P1 and P2 and runs under the condition P1.x OR P2.x.]
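
The COND expressions in the two examples can be mimicked in Python to show how a child's trigger might be evaluated. This is a sketch under an assumption the slides do not spell out: the condition is tested against the ad of each finished parent (bound to `other`), and the child becomes eligible as soon as one parent ad satisfies it. The helper `can_submit` and the dict-based ads are hypothetical stand-ins for real ClassAds.

```python
def can_submit(cond, finished_parent_ads):
    """True once the condition holds for at least one finished parent's ad."""
    return any(cond(other) for other in finished_parent_ads)

# Example 1: A branches to B when x is true and to C when x is false.
cond_b = lambda other: other.get("job") == "A" and other.get("x") is True
cond_c = lambda other: other.get("job") == "A" and other.get("x") is False

# Example 2: C runs if P1.x OR P2.x.
cond_p = lambda other: (
    (other.get("job") == "P1" and other.get("x") is True)
    or (other.get("job") == "P2" and other.get("x") is True)
)

a_ad = {"job": "A", "x": True}
print(can_submit(cond_b, [a_ad]))  # True
print(can_submit(cond_c, [a_ad]))  # False

p_ads = [{"job": "P1", "x": False}, {"job": "P2", "x": True}]
print(can_submit(cond_p, p_ads))   # True
```

With `x` true on A, only B is submitted and C is never triggered, which is the branch-not-taken behavior that scripts would otherwise have to emulate.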

Page 20: Dynamic DAGMan with ClassAds

Dynamic DAGMan

› Recovery model is still the same
  • Rescue DAG: saves node state at failure
  • ClassAd attribute-values can be regenerated from Condor logs

› Flexibility to make run-time decisions
  • Which subset of nodes in the DAG should be executed?
  • When should node X be executed?
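
The recovery bullet above says node attributes can be regenerated from logs. A minimal Python sketch of that replay idea follows; the `(node, attribute, value)` event tuples are a hypothetical stand-in and do not reflect the actual Condor log format, and `rebuild_node_ads` is an illustrative helper name.

```python
def rebuild_node_ads(events):
    """Replay events in order; later values overwrite earlier ones for the same attribute."""
    ads = {}
    for node, attribute, value in events:
        ads.setdefault(node, {"job": node})[attribute] = value
    return ads

# Hypothetical event stream recovered after a restart.
events = [
    ("A", "status", "Submitted"),
    ("A", "exit_value", 0),
    ("A", "status", "Done"),
    ("B", "status", "Submitted"),
]

print(rebuild_node_ads(events))
```

Because the result depends only on the ordered events, replaying the same log after a crash yields the same node ads, which is what keeps the recovery model unchanged.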

Page 21: Dynamic DAGMan with ClassAds

Outline

› DAGMan workflow management

› Motivation for dynamic DAGMan

› ClassAds

› Putting together: DAGMan + ClassAds

› Looking ahead

Page 22: Dynamic DAGMan with ClassAds

Looking ahead

› DAG with only implicit edges
  • Parent-child relations embedded in classAds
  • Nodes specify
    • Trigger condition
    • Preference for child nodes to run
  • On-the-fly dependency formation based on previous node execution

› DAGMan collaborates with Quill
  • Getting attributes from persistent storage

Page 23: Dynamic DAGMan with ClassAds

Looking ahead

› Allow job to modify/add its attributes
  • Determine what happens after job exits

› Global state control
  • Throttling expression/parameters

› Global DAG-classAd
  • Statistics on running, successful and failed jobs
  • E.g. if (#failed jobs > N) run cleanup node
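
The proposed global DAG-classAd and its cleanup rule can be sketched in Python as follows. This describes a future direction, so everything here is hypothetical: the attribute names `running`, `successful` and `failed`, the helpers `dag_ad` and `should_run_cleanup`, and the mapping of node states onto those counters.

```python
from collections import Counter

def dag_ad(node_states):
    """Summarize per-node states into a DAG-wide ad of counters."""
    counts = Counter(node_states.values())
    return {
        "running": counts["Submitted"],
        "successful": counts["Done"],
        "failed": counts["Failed"],
    }

def should_run_cleanup(ad, max_failed):
    """The slide's rule: if (#failed jobs > N) run cleanup node."""
    return ad["failed"] > max_failed

states = {"A": "Done", "B": "Failed", "C": "Failed", "D": "Submitted"}
ad = dag_ad(states)
print(ad)                         # {'running': 1, 'successful': 1, 'failed': 2}
print(should_run_cleanup(ad, 1))  # True
print(should_run_cleanup(ad, 2))  # False
```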

Page 24: Dynamic DAGMan with ClassAds

Thank you

We are interested in hearing your suggestions!