Building a Real Workflow Thursday morning, 9:00 am Lauren Michael <[email protected]> Research Computing Facilitator University of Wisconsin - Madison

Dec 28, 2015


Building a Real Workflow
Thursday morning, 9:00 am

Lauren Michael <[email protected]>

Research Computing Facilitator

University of Wisconsin - Madison


2013 OSG User School

• Non-computing “workflows” are all around you … especially in science
  - grading exams
  - instrument setup
  - experimental procedures

• When planned/documented, workflows help with:
  - organizing and managing processes
  - saving time with automation
  - objectivity, reliability, and reproducibility

(THE TENETS OF GOOD SCIENCE!)


Workflows are like Computing Algorithms

• Steps
• Connections
• (Metadata)


‘Engineering’ a Good Workflow

1. Draw out the general workflow

2. Define details (test ‘pieces’ with HTCondor jobs)
   - divide or consolidate ‘pieces’
   - off-load file transfers and consider file transfer times
   - identify steps to be automated or checked

3. Build it piece-by-piece; test and optimize

4. Scale-up: data and computing resources

5. What more can you automate or error-check?

(And remember to document)


From schematics…


… to the real world


Start with This

1. file preparation (minutes)
2. processing (days)
3. transform results (minutes)


Parallelize with HTC Splitting

1. file prep and split
2. process ‘0’ (filter output) . . . process ‘99’ (filter output)
3. combine, transform results
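The "file prep and split" step can be sketched as follows. This is only an illustration: the record list and the function name `split_lines` are hypothetical, and the chunk count of 100 simply mirrors the process ‘0’ to process ‘99’ jobs in the diagram.

```python
def split_lines(lines, n_chunks):
    """Split a list of lines into n_chunks contiguous, nearly equal pieces."""
    base, extra = divmod(len(lines), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional line each.
        size = base + (1 if i < extra else 0)
        chunks.append(lines[start:start + size])
        start += size
    return chunks


# Hypothetical input: 1050 records to spread over 100 process jobs.
records = [f"record {i}" for i in range(1050)]
chunks = split_lines(records, 100)
print(len(chunks), len(chunks[0]), len(chunks[-1]))
```

Each chunk would then be written to its own input file, one per process job.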


End Up with This

1. file prep and split (POST-RETRY): 1 GB RAM, 2 GB Disk, 1.5 hours
2. process ‘0’ (filter output) . . . process ‘99’ (filter output): 100 MB RAM, 500 MB Disk, 40 min (each)
3. combine, transform results (POST-RETRY): 300 MB RAM, 1 GB Disk, 15 min

Other annotations in the diagram: (special transfer), (PRE), (POST-RETRY), (POST-RETRY)
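A workflow of this shape can be written as a DAGMan input file. The sketch below generates one from Python. The submit-file names (`prep.sub`, `process.sub`, `combine.sub`), the POST script name `check_output.py`, the output-file names, and the retry count are all assumptions made for illustration; PRE scripts are omitted for brevity.

```python
def build_dag(n_process, retries=3):
    """Return DAGMan input-file text: prep -> n_process jobs -> combine."""
    lines = [
        "JOB prep prep.sub",
        "SCRIPT POST prep check_output.py prep.out",
        f"RETRY prep {retries}",
    ]
    names = []
    for i in range(n_process):
        name = f"proc{i}"
        names.append(name)
        lines.append(f"JOB {name} process.sub")
        # Pass the chunk number to the submit file as the macro $(chunk).
        lines.append(f'VARS {name} chunk="{i}"')
        lines.append(f"SCRIPT POST {name} check_output.py {name}.out")
        lines.append(f"RETRY {name} {retries}")
    all_procs = " ".join(names)
    lines += [
        "JOB combine combine.sub",
        "SCRIPT POST combine check_output.py combine.out",
        f"RETRY combine {retries}",
        f"PARENT prep CHILD {all_procs}",
        f"PARENT {all_procs} CHILD combine",
    ]
    return "\n".join(lines) + "\n"


# Two process jobs keep the printed example short; the slide uses 100.
print(build_dag(2))
```

Generating the file from a script keeps the 100 near-identical node definitions consistent and makes it easy to change the chunk count later.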


Key HTC Principles


1. Increase Throughput

2. Be Kind to Your Submit Node

3. Bring it With You

4. ‘Scriptify’ As Much As Possible

5. “Testing, testing, 1, 2, 3 …”


Always focus on Throughput

What is High Throughput?
  - many ‘smaller’ jobs
  - persistent job pressure
  - automation
  - optimizing total workflow times

What is not?
  - job runtimes less than 5 min
  - micro-optimizations


‘Engineering’ a Good Workflow

1. Draw out the general workflow

2. Define details (test ‘pieces’ with HTCondor jobs)
   - divide or consolidate ‘pieces’
   - off-load file transfers and consider file transfer times
   - identify steps to be automated or checked

3. Build it piece-by-piece; test and optimize

4. Scale-up: data and computing resources

5. What more can you automate or error-check?

(And remember to document)


Resources Jobs Need

• CPU
  - #CPUs and time
• RAM
• Disk
  - Working (execute side)
  - Total (submit side)
  - Compute bandwidth (file transfer)
• Network bandwidth
  - Usually for file transfer only


First run jobs locally: To measure usage

• Did it run correctly? Are you sure?
• Run once remotely (on execute machine, not submit machine)!
• Once working, run a couple of times
• If big variance in resource needs, should you take the…
  - Average?
  - Median?
  - Worst case?
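To make the average / median / worst-case question concrete, the sketch below summarizes a handful of measurements. The numbers are invented for illustration (five runs, one of which used far more memory than the rest), and `summarize` is a hypothetical helper.

```python
import statistics


def summarize(samples):
    """Return the mean, median, and worst case of a list of measurements."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "worst": max(samples),
    }


# Hypothetical peak-memory readings (MB) from five test runs of one job.
memory_mb = [30, 32, 31, 95, 33]
summary = summarize(memory_mb)
print(summary)
```

With one outlier, the three candidates differ widely: a request near the median would leave the outlier run short of memory, while a worst-case request over-asks for the other four. Which trade-off is acceptable depends on how costly a failed and retried job is compared with waiting longer for a match.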


User Log shows all

005 (2576205.000.000) 06/07 14:12:55 Job terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
    5  -  Run Bytes Sent By Job
    104857640  -  Run Bytes Received By Job
    5  -  Total Bytes Sent By Job
    104857640  -  Total Bytes Received By Job
    Partitionable Resources :    Usage  Request
       Cpus                 :                 1
       Disk (KB)            :   122358   125000
       Memory (MB)          :       30      100
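The usage and request columns in a log like this one can be pulled out with a short script, which helps when comparing many test jobs. The sketch below is a minimal example that handles only the Disk and Memory rows shown above; the regular expression and the function name `usage_vs_request` are assumptions, not part of HTCondor.

```python
import re

# The resource table from the log excerpt, pasted in as sample input.
LOG_EXCERPT = """\
Partitionable Resources : Usage Request
Cpus : 1
Disk (KB) : 122358 125000
Memory (MB) : 30 100
"""

# Match rows that carry both a usage value and a request value.
ROW = re.compile(
    r"^\s*(Disk \(KB\)|Memory \(MB\))\s*:\s*(\d+)\s+(\d+)", re.MULTILINE
)


def usage_vs_request(log_text):
    """Return {resource: (usage, request)} for the Disk and Memory rows."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)))
        for m in ROW.finditer(log_text)
    }


for name, (used, requested) in usage_vs_request(LOG_EXCERPT).items():
    print(f"{name}: used {used} of {requested} ({used / requested:.0%})")
```

Here the job used almost all of its requested disk but under a third of its requested memory, which suggests the memory request could be lowered.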


Golden Rules for DAGs

• Beware of the shish kebab! (self-checkpointing, next lecture)
• Use PRE and POST scripts generously
• RETRY is your friend
• DAGs of DAGs are good
  - SPLICE
  - SUBDAG EXTERNAL
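POST scripts and RETRY work together: DAGMan treats a non-zero exit code from a POST script as a failed node, and RETRY then reruns that node. A minimal sketch of such a check follows, assuming the job is expected to leave behind one non-empty output file; the script name `check_output.py` and its single argument are hypothetical.

```python
import os
import sys
import tempfile


def output_ok(path, min_bytes=1):
    """True if the file exists and holds at least min_bytes bytes."""
    return os.path.isfile(path) and os.path.getsize(path) >= min_bytes


def main(argv):
    """Return the exit code a POST script would hand back to DAGMan."""
    if len(argv) != 2:
        print("usage: check_output.py OUTPUT_FILE", file=sys.stderr)
        return 2
    return 0 if output_ok(argv[1]) else 1


# Demonstration with temporary files. A real POST script would instead
# end with: sys.exit(main(sys.argv))
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "result.out")
    with open(good, "w") as f:
        f.write("42\n")
    missing = os.path.join(tmp, "missing.out")
    print(main(["check_output.py", good]), main(["check_output.py", missing]))
```

Because the check only inspects file metadata, it stays cheap enough to run on the submit node.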


Wrapper Scripts are Essential

• Before execution (bring it with you!)
  - transfer/prepare files and directories
  - set up/configure environment and other dependencies, including run-time libraries (Matlab, R, Python, etc.)

• Execution
  - prepare complex command-line arguments
  - batch together many ‘small’ tasks

• After execution
  - filter, divide, consolidate, and/or compress files
  - check for errors
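The three phases can be sketched as a small wrapper. This is an illustration under stated assumptions: the "real program" is replaced by a Python one-liner that writes `result.txt`, and the function name `run_wrapped`, the file names, and the choice of gzip are all hypothetical.

```python
import gzip
import os
import shutil
import subprocess
import sys
import tempfile


def run_wrapped(command, output_name, workdir):
    """Run command in workdir, then check and compress its output file."""
    # Before execution: prepare the working directory.
    os.makedirs(workdir, exist_ok=True)

    # Execution: run the real program with its arguments.
    result = subprocess.run(command, cwd=workdir)
    if result.returncode != 0:
        return result.returncode

    # After execution: check for errors, then compress the output.
    out_path = os.path.join(workdir, output_name)
    if not os.path.isfile(out_path) or os.path.getsize(out_path) == 0:
        return 1
    with open(out_path, "rb") as src, gzip.open(out_path + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(out_path)
    return 0


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the real program: writes a small result file.
    cmd = [sys.executable, "-c", "open('result.txt', 'w').write('done\\n')"]
    status = run_wrapped(cmd, "result.txt", tmp)
    print(status, sorted(os.listdir(tmp)))
```

Returning a non-zero status when the output is missing lets the scheduler (or a POST script) see the failure instead of silently accepting an empty result.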


Extra DAG tips

• PRE and POST scripts run on the submit node
  - avoid combining/splitting large files
  - avoid compiling

• Remember: DAGs don’t do loops well
  - Solution: move more tasks into a ‘job’


Automate All The Things

• Well, not really, but kind of …
• Really: What is the minimal number of manual steps necessary?
  - even 1 might be too many; zero is perfect!
• Consider what you get out of automation
  - time savings (including less ‘babysitting’ time)
  - reliability and reproducibility


Is It Worth the Time?

http://xkcd.com/1205/


Batching (Merging) is easy

• Scripting
  - Avoids transfer of intermediate files
  - Debugging can be a bit tricky without scripted error reporting
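A minimal sketch of batching with scripted error reporting follows: many small tasks run inside one job, and a failure in one task is recorded rather than hidden or allowed to stop the rest. The task function `reciprocal` and the helper `run_batch` are placeholders invented for this example.

```python
def run_batch(tasks, func):
    """Run func on every task; collect results and per-task error messages."""
    results, errors = {}, {}
    for task in tasks:
        try:
            results[task] = func(task)
        except Exception as exc:
            # Record which task failed and why, then keep going.
            errors[task] = f"{type(exc).__name__}: {exc}"
    return results, errors


def reciprocal(x):
    """Placeholder for one 'small' task; fails when x is 0."""
    return 1 / x


results, errors = run_batch([1, 2, 0, 4], reciprocal)
print(results)
print(errors)
```

Writing the `errors` dictionary to a file that is transferred back with the job output gives a per-task record to debug from.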


Breaking up is hard to do…

• Ideally into parallel (separate) jobs
  - reduced job requirements = more matches
  - not always possible

• Often need checkpoints
  - standard universe can help
  - user-defined checkpointing
  - checkpoint images can be hard to manage


Automation: Parameter Sweeps

Command arguments can become complicated and messy.

• Wrapper scripts could:
  - Hardcode “extra” arguments
  - Compute arguments
  - Look up arguments from a table
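The "look up arguments from a table" option might look like the sketch below. The table contents, column names, and flag names are invented for illustration; the job's number could come from HTCondor's `$(Process)` macro passed as the wrapper's only argument.

```python
import csv
import io

# Hypothetical parameter table; in practice a file shipped with the job.
TABLE = """\
id,temperature,pressure
0,280,1.0
1,300,1.0
2,300,2.5
"""


def arguments_for(process_id, table_text=TABLE):
    """Build the command-line arguments for one job from its table row."""
    for row in csv.DictReader(io.StringIO(table_text)):
        if int(row["id"]) == process_id:
            return [
                "--temperature", row["temperature"],
                "--pressure", row["pressure"],
            ]
    raise KeyError(f"no row for process {process_id}")


print(arguments_for(2))
```

Keeping the sweep in a table means the submit file passes only a single integer per job, and the table itself documents which parameters each job ran with.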


‘Engineering’ a Good Workflow

1. Draw out the general workflow

2. Define details (test ‘pieces’ with HTCondor jobs)
   - divide or consolidate ‘pieces’
   - off-load file transfers and consider file transfer times
   - identify steps to be automated or checked

3. Build it piece-by-piece; test and optimize

4. Scale-up: data and computing resources

5. What more can you automate or error-check?

(And remember to document)


Start with This

1. file preparation (minutes)
2. processing (days)
3. transform results (minutes)


Exercise 1

1. file prep and split
2. process ‘0’ (filter output) . . . process ‘99’ (filter output)
3. combine, transform results


Exercise 2

1. file prep and split (POST-RETRY): 1 GB RAM, 2 GB Disk, 1.5 hours
2. process ‘0’ (filter output) . . . process ‘99’ (filter output): 100 MB RAM, 500 MB Disk, 40 min (each)
3. combine, transform results (POST-RETRY): 300 MB RAM, 1 GB Disk, 15 min

Other annotations in the diagram: (special transfer), (PRE), (POST-RETRY), (POST-RETRY)


Questions?

• Feel free to contact me: [email protected]

• Now: “Joe’s Workflow” Exercise 6.1
  - 9:30-10am, in groups

• Later:
  - 10-10:30am: From Workflow to Production
  - 10:30-10:45am: Break
  - 10:45am-12:15: Exercises 6.2, 6.3
