HUBbub 2013: Developing hub tools that submit HPC jobs Rob Campbell Purdue University Thursday, September 5, 2013

Jan 18, 2018


Noah Parsons

SubmitR  Move files, run job on remote system, view results Hub
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1:

HUBbub 2013: Developing hub tools that submit HPC jobs

Rob Campbell
Purdue University

Thursday, September 5, 2013

Page 2:

Example “SubmitR” tool running on the DiaGrid hub

• DiaGrid: distributed research computing network

• SubmitR: hub tool for running R scripts on DiaGrid

Page 3:

SubmitR: Move files, run job on remote system, view results

Hub

Page 4:

Building a job: Files, options/arguments, job parameters

Job Types

• One process

• Multiple processes, communicating

• Independent processes (parameter sweep)

Page 5:

The “submit” command: Runs user command on a remote system

1. Connect to remote system

2. Transfer input files and program

3. Create script for user’s command

4. Talk to batch or workflow system

5. Output periodic status updates

6. Transfer files back to hub

Page 6:

For SubmitR, submit uses:

• PBS job scheduling on Purdue’s Hansen cluster (single or parallel jobs)

• Pegasus workflow management with HTCondor (parameter sweeps)

submit options:

• VENUES - remote systems

• MANAGERS - commands that can be run on remote systems

Page 7:

Building the submit command:

submit -n 2 -w 60 -v hansen -M -i inp.dat R-2.15.1 CMD BATCH -q "--args inp.dat" myscript.R

• Job parameters (-n 2 -w 60 -v hansen -M -i inp.dat): Job should use 2 processors and 60 minutes of walltime, run on the Hansen cluster, and collect metrics. File “inp.dat” should be included (transported to remote system).

• Manager (R-2.15.1): Use manager “R-2.15.1”. Causes the “R” interpreter to run on the remote system.

• Arguments (CMD BATCH -q "--args inp.dat" myscript.R): Options for the R interpreter. Note: submit detects that “myscript.R” is used and transports it to the remote system.
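A hub tool typically assembles this command from values the user picks in the GUI. The following is a minimal sketch of that assembly in Python. The function name and its parameters are hypothetical, and passing one -i flag per input file is an assumption. Only the flags shown on the slide (-n, -w, -v, -M, -i) are used.

```python
import shlex


def build_submit_command(manager, manager_args, nprocs=1, walltime_min=60,
                         venue=None, input_files=(), collect_metrics=True):
    """Return the submit command as an argument list (hypothetical helper)."""
    cmd = ["submit", "-n", str(nprocs), "-w", str(walltime_min)]
    if venue:
        cmd += ["-v", venue]          # remote system to run on
    if collect_metrics:
        cmd.append("-M")              # ask submit to report job metrics
    for path in input_files:
        cmd += ["-i", path]           # assumption: one -i per input file
    cmd.append(manager)               # e.g. "R-2.15.1"
    cmd.extend(manager_args)          # options passed through to the manager
    return cmd


cmd = build_submit_command(
    "R-2.15.1",
    ["CMD", "BATCH", "-q", "--args inp.dat", "myscript.R"],
    nprocs=2, walltime_min=60, venue="hansen", input_files=["inp.dat"],
)
# Print a shell-safe rendering for logging or debugging.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list avoids shell quoting problems when it is later handed to a process-launching call; the quoted string is only for display.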

Page 8:

Executing the submit command, getting status updates:
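One way a Python tool could execute the command and relay status updates is to read the child process's output line by line as it arrives. This is a sketch, not SubmitR's actual code: the function name and callback are hypothetical, and because the slides do not say which stream submit writes status to, stderr is merged into stdout. A stand-in command replaces submit so the example runs anywhere.

```python
import subprocess
import sys


def run_with_status(cmd, on_status=print):
    """Run cmd, pass each output line to on_status, return the exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge streams; see assumption above
        text=True,
        bufsize=1,                  # line-buffered reads on our side
    )
    for line in proc.stdout:
        on_status(line.rstrip("\n"))
    return proc.wait()


# Stand-in for a real submit invocation (hypothetical demo command).
demo_cmd = [sys.executable, "-c", "print('Job Submitted'); print('Simulation Done')"]
exit_code = run_with_status(demo_cmd)
```

In a GUI tool the callback would update a status widget instead of printing, and the call would run off the main thread so the interface stays responsive.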

Page 9:

Tips for using submit:

Test submit from the hub’s command line (workspace):

$> submit -n 1 -w 5 -v hansen -M R-2.15.1 CMD BATCH -q "--args 1 2" testargs.R
=SUBMIT-METRICS=> job=1214144
(5073894) Job Submitted at hansen-a Mon Sep 2 17:38:54 2013
(5073894) Simulation Queued at hansen-a Mon Sep 2 17:39:04 2013
(5073894) Simulation Complete at hansen-a Mon Sep 2 17:39:20 2013
(5073894) Simulation Done at hansen-a Mon Sep 2 17:39:30 2013
=SUBMIT-METRICS=> job=1214144 venue=1:sshPBS:5073894:[email protected] status=0 cpu=3.290000 real=3.000000 wait=14.000000
(end of output)
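The =SUBMIT-METRICS=> lines in this output are space-separated key=value pairs, so a tool can pull out the job id and exit status with a few lines of Python. The sketch below assumes only the layout visible in the sample output; the function name is hypothetical, and the venue field is left out of the sample line because its address is redacted in this transcript.

```python
METRICS_PREFIX = "=SUBMIT-METRICS=>"


def parse_submit_metrics(line):
    """Return a dict of key/value strings, or None for non-metrics lines."""
    if not line.startswith(METRICS_PREFIX):
        return None
    fields = {}
    for token in line[len(METRICS_PREFIX):].split():
        key, sep, value = token.partition("=")
        if sep:                      # skip tokens that are not key=value
            fields[key] = value
    return fields


sample = "=SUBMIT-METRICS=> job=1214144 status=0 cpu=3.290000 real=3.000000 wait=14.000000"
metrics = parse_submit_metrics(sample)
job_ok = metrics is not None and metrics.get("status") == "0"
```

Values are kept as strings so the caller decides which ones to convert, for example float(metrics["cpu"]).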

Use submit’s email notification feature to alert user when job finishes:

$> submit mail2self -s 'Hey' -t 'Your job is done.'

Page 10:

Additional submit feature: Automatic breakout of parameter combinations (for sweeps)

"submit … -p @@p1=1-3;@@p2=7,9 …"

User wants six runs (three values of p1 times two values of p2).

Parameters:
• 1 7
• 1 9
• 2 7
• 2 9
• 3 7
• 3 9
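The breakout is a Cartesian product of the parameter value lists. The sketch below reproduces the six combinations in Python for illustration only: it is not submit's parser, the function names are hypothetical, and it handles just the two forms shown on the slide (an inclusive integer range such as 1-3 and a comma-separated list such as 7,9).

```python
import itertools


def parse_values(spec):
    """Expand '1-3' to [1, 2, 3] and '7,9' to [7, 9] (non-negative ints only)."""
    values = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            values.extend(range(int(low), int(high) + 1))
        else:
            values.append(int(part))
    return values


def expand_sweep(spec):
    """Return (parameter names, list of value combinations) for a sweep spec."""
    names, value_lists = [], []
    for assignment in spec.split(";"):
        name, _, values = assignment.partition("=")
        names.append(name.lstrip("@"))
        value_lists.append(parse_values(values))
    return names, list(itertools.product(*value_lists))


names, combos = expand_sweep("@@p1=1-3;@@p2=7,9")
for combo in combos:
    print(dict(zip(names, combo)))
```

A tool could use the same expansion to show the user how many runs a sweep will launch before submitting it.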

Page 11:

Directories:

“Run” directory:
• A tool-specific directory under hub’s session directory.
• Current working directory for executing submit.
• Isolates job-related files.
• Example: “~/data/sessions/6716/submitr”

Parameter sweep output:
• Job directory created under run directory.
• Pegasus puts each run’s (sub-job’s) output in separate directory under job directory.
• Pegasus bookkeeping files in job directory.
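Creating the run directory and using it as the working directory for submit might look like the following sketch. The helper name is hypothetical, and how a tool discovers its session directory is not covered in these slides, so the path is simply passed in.

```python
import os


def make_run_dir(session_dir, tool_name):
    """Create (if needed) and return a tool-specific run directory."""
    run_dir = os.path.join(os.path.expanduser(session_dir), tool_name)
    os.makedirs(run_dir, exist_ok=True)
    return run_dir


# Example with the path from the slide (not executed here):
#   run_dir = make_run_dir("~/data/sessions/6716", "submitr")
#   subprocess.Popen(cmd, cwd=run_dir)   # submit then runs inside run_dir
```

Passing cwd to the process launcher, rather than calling os.chdir, leaves the tool's own working directory unchanged.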

Page 12:

Exiting the tool, canceling the job:

Page 13:

Moving files:

Concept: File “import / export”

• Bringing files into and out of tool.

• Two flavors:

1. Browse - moving files between directories on hub (“os.rename(pathname, newpath)”)

2. Upload / download - moving files between workstation and hub

• Hub commands: importfile and exportfile.

• Execute importfile from separate thread to handle user-canceled uploads
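Running importfile from a separate thread lets the GUI stay responsive and gives the tool a handle for stopping an upload the user abandons. The class below is a sketch of that pattern, not SubmitR's code: the class name is hypothetical, the command is passed in as a list, and stopping the upload by terminating the child process is an assumption about how cancellation could be handled.

```python
import subprocess
import threading


class CancellableCommand:
    """Run a command in a background thread; allow the caller to cancel it."""

    def __init__(self, cmd):
        self.cmd = cmd
        self.returncode = None
        self.cancelled = False
        self._proc = None
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        with self._lock:
            if self.cancelled:        # cancelled before the process started
                return
            self._proc = subprocess.Popen(self.cmd)
        self.returncode = self._proc.wait()

    def cancel(self):
        with self._lock:
            self.cancelled = True
            if self._proc is not None and self._proc.poll() is None:
                self._proc.terminate()

    def join(self, timeout=None):
        self._thread.join(timeout)


# Hypothetical usage inside a tool (importfile arguments omitted):
#   upload = CancellableCommand(["importfile"])
#   upload.start()
#   ... user clicks "Cancel" ...
#   upload.cancel()
```

The lock covers the window between starting the thread and launching the process, so a cancel that arrives early prevents the launch instead of being lost.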

Page 14:

Resource / Link

Rob Campbell mailto:[email protected]

Research Computing at Purdue http://www.rcac.purdue.edu

DiaGrid Hub http://diagrid.org

SubmitR https://diagrid.org/tools/submitr

Tool Developers Guide http://hubzero.org/documentation/1.1.0/tooldevs

The submit command http://hubzero.org/documentation/1.1.0/tooldevs/grid.submitcmd

Pegasus http://pegasus.isi.edu/

HTCondor http://research.cs.wisc.edu/htcondor/
