Small Tutorial on HTCondor

Claudia Solís-Lemus¹

March 23, 2015

¹ Based on slides by Steve Hunter and Lauren Michael

HTCondor: in its simplest form

CSL HTCondor March 23, 2015 2 / 33

When to use HTCondor?

Many independent jobs to run in parallel: high-throughput computing (HTC):

- your computing work won’t run at all on your computer(s) (lacks sufficient RAM, disk, ...)
- your computing work will take too long on your own computer(s)
- simply would like to off-load certain processes in favor of running others on your computer(s)

Ideal running time for a single job: between 5 minutes and 2 hours, but CHTC pool allows up to 72 h (even longer jobs possible)

Keep in mind: high-performance computing (HPC) also available in CHTC: http://chtc.cs.wisc.edu/HPCuseguide.shtml

Real-life example in HTCondor: phylogenetics

Run 1800 quartets, each with 4000 genes
Each gene: 2 minutes
Each quartet: 5.5 days
All quartets: 30 years
With HTCondor: 9 days
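The serial running times on this slide follow from simple arithmetic. A minimal Python sketch, assuming the genes of a quartet are analyzed one after another on a single core (the variable names are illustrative and not from the slides):

```python
# Back-of-the-envelope check of the serial running times quoted on the slide.
MINUTES_PER_GENE = 2
GENES_PER_QUARTET = 4000
N_QUARTETS = 1800

MINUTES_PER_DAY = 60 * 24
DAYS_PER_YEAR = 365

# One quartet: all of its genes, run one after another.
quartet_days = MINUTES_PER_GENE * GENES_PER_QUARTET / MINUTES_PER_DAY

# All quartets, still run one after another.
all_quartets_years = quartet_days * N_QUARTETS / DAYS_PER_YEAR

print(f"One quartet: {quartet_days:.1f} days")
print(f"All quartets, serially: {all_quartets_years:.1f} years")
```

This gives roughly 5.6 days per quartet and about 27 years in total, which the slide rounds to 5.5 days and 30 years. The 9 days with HTCondor is the reported wall-clock time and is not something this arithmetic derives.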

HTCondor: in its simplest form

[Diagram: a Job and a Computer, paired by HTCondor acting as the matchmaker]

How does HTCondor know what you want/need?

Submit file

Simple submit file
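
The submit file shown on this slide is not reproduced in the transcript. As a rough illustration only, a minimal submit file typically names an executable, gives files for standard output, standard error, and the job log, and ends with a queue statement. The Python sketch below simply assembles such a file as text; the file names (myscript.sh, job.out, and so on) are hypothetical placeholders, not the ones used in the talk.

```python
# Assemble a minimal HTCondor submit description as text.
# All file names below are hypothetical placeholders.
def simple_submit_file(executable: str, basename: str = "job") -> str:
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
        f"output     = {basename}.out",   # stdout of the job
        f"error      = {basename}.err",   # stderr of the job
        f"log        = {basename}.log",   # HTCondor's event log for the job
        "queue",                          # submit one job
    ]
    return "\n".join(lines) + "\n"

submit_text = simple_submit_file("myscript.sh")
print(submit_text)
```

Writing submit_text to a file gives something that could be passed to condor_submit.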

Not so simple submit file

[Slide annotations on the submit file: units in kB and MB]
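
The kB and MB labels on this slide most likely mark the units of the resource requests: in an HTCondor submit file, a bare number for request_disk is read as KiB and a bare number for request_memory as MiB. The sketch below assembles a submit description with such requests and several queued jobs. The file names and default numbers are hypothetical, and the actual submit file from the slide is not available in the transcript.

```python
# Sketch of a submit description with resource requests and several jobs.
# File names and default numbers are hypothetical; tune them from a small test run.
def submit_with_requests(executable, n_jobs, memory_mb=1024, disk_kb=1_000_000):
    lines = [
        "universe       = vanilla",
        f"executable     = {executable}",
        "arguments      = $(Process)",          # 0, 1, ..., n_jobs - 1
        "output         = job_$(Process).out",
        "error          = job_$(Process).err",
        "log            = job.log",
        f"request_memory = {memory_mb}",        # bare number: MiB
        f"request_disk   = {disk_kb}",          # bare number: KiB
        f"queue {n_jobs}",                      # submit n_jobs copies
    ]
    return "\n".join(lines) + "\n"

text = submit_with_requests("myscript.sh", n_jobs=3)
print(text)
```

Each queued job receives its own $(Process) number, so one submit file can drive many independent runs.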

Another submit file

Submit to HTCondor

Check the queue in HTCondor

Submit to HTCondor: commands

condor_submit submitfile
condor_q username
condor_rm jobID or username
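
If these commands are driven from a script, they can be wrapped in a few lines of Python. This is only a sketch: the helper names condor_command and run_if_available, the submit file name, the user name, and the job ID are all made up for illustration, and the commands are only executed if HTCondor is actually installed.

```python
# Build (and optionally run) the three commands from the slide.
import shutil
import subprocess

def condor_command(action, target):
    """Return the argv list for a submit, queue, or remove action."""
    programs = {"submit": "condor_submit", "queue": "condor_q", "remove": "condor_rm"}
    return [programs[action], target]

def run_if_available(argv):
    """Run the command only if the program is on PATH; otherwise return None."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, capture_output=True, text=True)

submit_cmd = condor_command("submit", "job.sub")   # hypothetical submit file name
queue_cmd = condor_command("queue", "myuser")      # hypothetical user name
remove_cmd = condor_command("remove", "1234.0")    # hypothetical job ID
print(submit_cmd, queue_cmd, remove_cmd)
```

Calling run_if_available(queue_cmd) on a submit node would then list that user's jobs.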

Submit to HTCondor

[Diagram: a submit node hands the job to HTCondor, which sends it to an execute node, e.g. desk22 (where the job runs)]

CHTC pools

simon → Mike Camilleri [email protected]

CHTC → chtc.cs.wisc.edu, click "Get Started"

Checklist

✓ Input files
✓ My code
✓ Submit file
✓ condor_submit submitfile
✓ Monitor/remove jobs

Is it that simple?

Code in:

C++
Java
Matlab
R

chtc.cs.wisc.edu

Steps to run R code in HTCondor

1. Create R script
2. Take your R version and libraries with you
3. Download the ChtcRun package and follow the instructions
4. Run mkdag and submit jobs

Step 1: R script

Create your R script

Debug and test thoroughly locally

Organize your input files

Keep in mind what output files you are expecting

Warning: there is no folder structure on the execute node

Step 2: R version, libraries

R version (list of options on the CHTC website)
List of needed R packages (order is important): copy the source .tar.gz files
You need to be on a CHTC (or simon) submit node

Step 3: ChtcRun package

Organize your input files as:

ChtcRun/
  Rin/
    0/       infile.txt, <specific files>
    1/       infile.txt, <specific files>
    job2/    infile.txt, <specific files>
    shared/  RLIBS.tar.gz, yourScript.r, <shared files>
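
To make the layout concrete, the sketch below creates the same directory skeleton inside a scratch directory. It assumes Rin sits directly under ChtcRun, uses the job directory names from the slide (0, 1, job2), and only creates empty placeholder files; the helper name make_layout is made up for this example.

```python
# Create the ChtcRun/Rin layout from the slide inside a scratch directory.
import tempfile
from pathlib import Path

def make_layout(root, job_names, shared_files):
    data_dir = Path(root) / "ChtcRun" / "Rin"
    for name in job_names:
        job_dir = data_dir / name
        job_dir.mkdir(parents=True, exist_ok=True)
        (job_dir / "infile.txt").touch()          # one input file per job (empty here)
    shared_dir = data_dir / "shared"
    shared_dir.mkdir(parents=True, exist_ok=True)
    for name in shared_files:
        (shared_dir / name).touch()               # files every job needs (empty here)
    return data_dir

scratch = tempfile.mkdtemp()
rin = make_layout(scratch, ["0", "1", "job2"], ["RLIBS.tar.gz", "yourScript.r"])
print(sorted(p.relative_to(rin).as_posix() for p in rin.rglob("*")))
```

Each job directory holds only what is specific to that job, while anything common to all jobs goes once into shared/.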

Modify process.template with:

request_memory, request_disk (make small test first)

wantFlocking? wantGlidein?

Step 4: Run mkdag

Run ./mkdag with options

See ./mkdag --help for examples

./mkdag --data=Rin --outputdir=Rout --cmdtorun=soartest.R --pattern=meanx --type=R --version=R-2.10.1

Message: All done!
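
When the same mkdag call has to be repeated for several data sets, the command line can be assembled programmatically. The sketch below uses only the options that appear on the slide; the helper name mkdag_command is made up, and ./mkdag --help remains the authoritative list of options.

```python
# Assemble the mkdag command line from keyword options.
import shlex

def mkdag_command(**options):
    # Each keyword becomes one --key=value argument, in the order given.
    argv = ["./mkdag"] + [f"--{key}={value}" for key, value in options.items()]
    return shlex.join(argv)

cmd = mkdag_command(data="Rin", outputdir="Rout", cmdtorun="soartest.R",
                    pattern="meanx", type="R", version="R-2.10.1")
print(cmd)
```

The printed string matches the command shown on the slide and can be pasted into a shell from inside the ChtcRun directory.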

Step 4: Submit your jobs

mkdag will instruct you:

cd Rout
condor_submit_dag mydag.dag

Don’t forget to monitor your jobs: condor_q and check log files for memory/disk requirements

Important things to remember!

- Test your code locally
- There is no folder structure on the execute node
- Move into HTCondor gradually: 3 jobs, 100 jobs, 1000 jobs...
- Adjust memory/disk requirements
- Avoid transferring large files through the submit node (> 10 GB per batch, ~10 MB per job, 1000 jobs)
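
The transfer guideline is easy to check before submitting a batch. A small sketch, assuming decimal units (1000 MB = 1 GB) and treating the 10 GB figure from the slide as the threshold; the function names are illustrative:

```python
# Estimate how much data a batch moves through the submit node.
BATCH_LIMIT_GB = 10  # threshold quoted on the slide

def batch_transfer_gb(per_job_mb, n_jobs):
    return per_job_mb * n_jobs / 1000  # decimal units: 1000 MB = 1 GB

def exceeds_limit(per_job_mb, n_jobs, limit_gb=BATCH_LIMIT_GB):
    return batch_transfer_gb(per_job_mb, n_jobs) > limit_gb

# Scale up gradually, as recommended: 3 jobs, then 100, then 1000.
for n_jobs in (3, 100, 1000):
    print(n_jobs, batch_transfer_gb(10, n_jobs), exceeds_limit(10, n_jobs))
```

With 10 MB per job, 1000 jobs sit right at the 10 GB mark, so larger inputs or more jobs per batch would cross it.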

Workflow - HTCondor DAGman

http://research.cs.wisc.edu/htcondor/dagman/dagman.html
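
A DAGMan input file lists jobs and the order in which they must run, using JOB lines (a job name and its submit file) and PARENT ... CHILD lines (dependencies). A minimal sketch that writes such a file as text, assuming three hypothetical submit files (prepare.sub, analyze.sub, summarize.sub) that are not part of the talk:

```python
# Generate the text of a small DAGMan input file.
def dag_text(jobs, edges):
    # jobs: mapping of job name -> submit file; edges: (parent, child) pairs.
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs.items()]
    lines += [f"PARENT {parent} CHILD {child}" for parent, child in edges]
    return "\n".join(lines) + "\n"

dag = dag_text(
    {"prepare": "prepare.sub", "analyze": "analyze.sub", "summarize": "summarize.sub"},
    [("prepare", "analyze"), ("analyze", "summarize")],
)
print(dag)
```

Saved to a file, this text could be submitted with condor_submit_dag, and each job would start only after its parent has finished.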

Help resources

chtc.cs.wisc.edu

[email protected]

research.cs.wisc.edu/htcondor/tutorials/

CHTC office hours: Wednesday 9:30-11:30am
