Small Tutorial on HTCondor

Claudia Solís-Lemus¹

March 23, 2015

¹ Based on slides by Steve Hunter and Lauren Michael

HTCondor: in its simplest form

CSL HTCondor March 23, 2015 2 / 33

When to use HTCondor?

Many independent jobs to run in parallel: high-throughput computing (HTC):

- your computing work won’t run at all on your computer(s) (lacks sufficient RAM, disk, ...)
- your computing work will take too long on your own computer(s)
- you simply would like to off-load certain processes in favor of running others on your computer(s)

Ideal running time for a single job: between 5 minutes and 2 hours, but the CHTC pool allows up to 72 h (even longer jobs are possible)

Keep in mind: high-performance computing (HPC) is also available in CHTC: http://chtc.cs.wisc.edu/HPCuseguide.shtml

Real-life example in HTCondor: phylogenetics

- Run 1800 quartets, each with 4000 genes
- Each gene: 2 minutes
- Each quartet: 5.5 days
- All quartets: 30 years
- With HTCondor: 9 days
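The figures on this slide follow from simple arithmetic. A minimal Python sketch of the check, using only the numbers given above (the variable names are ours, and a 365-day year is assumed):

```python
# Back-of-the-envelope check of the phylogenetics example.
# All inputs come from the slide; the 365-day year is an assumption.
N_QUARTETS = 1800
GENES_PER_QUARTET = 4000
MINUTES_PER_GENE = 2
HTCONDOR_DAYS = 9  # wall-clock time reported with HTCondor

MINUTES_PER_DAY = 60 * 24

days_per_quartet = GENES_PER_QUARTET * MINUTES_PER_GENE / MINUTES_PER_DAY
serial_days = N_QUARTETS * days_per_quartet
serial_years = serial_days / 365
speedup = serial_days / HTCONDOR_DAYS

print(f"Each quartet: {days_per_quartet:.1f} days")
print(f"All quartets, one after another: {serial_years:.1f} years")
print(f"Implied speed-up with HTCondor: about {speedup:.0f}x")
```

Run one after another, the quartets would take about 10,000 days (roughly 27 years, which the slide rounds to 30), so finishing in 9 days corresponds to a speed-up of roughly a thousand.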

HTCondor: in its simplest form

[Diagram: Job and Computer, paired by HTCondor acting as a matchmaker]
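As a toy illustration of the matchmaking idea, the sketch below pairs each job with the first idle machine that offers enough memory and disk. It is not HTCondor's actual matching algorithm: the `Job` and `Machine` classes, their fields, the machine name `desk21` and the first-fit rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Machine:
    name: str
    memory_mb: int
    disk_kb: int
    busy: bool = False


@dataclass
class Job:
    name: str
    request_memory_mb: int
    request_disk_kb: int


def match(job: Job, machines: List[Machine]) -> Optional[Machine]:
    """Return the first idle machine that satisfies the job's requests."""
    for machine in machines:
        if (not machine.busy
                and machine.memory_mb >= job.request_memory_mb
                and machine.disk_kb >= job.request_disk_kb):
            machine.busy = True  # one job per machine in this toy model
            return machine
    return None  # no suitable machine: the job waits in the queue


def matchmake(jobs: List[Job], machines: List[Machine]) -> Dict[str, Optional[str]]:
    """Map each job name to the name of its matched machine (or None)."""
    result = {}
    for job in jobs:
        machine = match(job, machines)
        result[job.name] = machine.name if machine else None
    return result


machines = [Machine("desk21", 1024, 1_000_000), Machine("desk22", 4096, 5_000_000)]
jobs = [Job("small", 512, 100_000), Job("big", 2048, 2_000_000), Job("extra", 512, 100_000)]
print(matchmake(jobs, machines))
```

Here the third job finds no idle machine and stays unmatched, which mirrors a job sitting idle in the queue until a computer frees up.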

How does HTCondor know what you want/need?

Submit file

Simple submit file
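The screenshot from this slide is not reproduced in the transcript. As a stand-in, here is a minimal Python sketch that builds the text of a small submit file. The commands (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`) are standard HTCondor submit commands; the file names (`myscript.sh`, `input.txt`, `job_...`) are placeholders and are not taken from the talk.

```python
def render_simple_submit(executable: str, arguments: str = "", n_jobs: int = 1) -> str:
    """Build the text of a minimal HTCondor submit file."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        # $(Cluster) and $(Process) are filled in by HTCondor for each job
        "output = job_$(Cluster)_$(Process).out",
        "error = job_$(Cluster)_$(Process).err",
        "log = job_$(Cluster).log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


submit_text = render_simple_submit("myscript.sh", arguments="input.txt", n_jobs=3)
print(submit_text)
```

Saving the printed text to a file gives something that can be passed to condor_submit on a submit node.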

Not so simple submit file

[Screenshot of a submit file, with unit annotations: KB, MB]
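The unit annotations on this slide matter because, to our understanding, when no unit suffix is given HTCondor reads `request_memory` in megabytes and `request_disk` in kilobytes. A small sketch of the conversion, assuming you think about your needs in GB; the helper name and the 1024-based conversion are choices made for this illustration.

```python
def resource_lines(memory_gb: float, disk_gb: float) -> list:
    """Return submit-file lines with memory in MB and disk in KB."""
    memory_mb = int(memory_gb * 1024)      # request_memory: MB by default
    disk_kb = int(disk_gb * 1024 * 1024)   # request_disk: KB by default
    return [
        f"request_memory = {memory_mb}",
        f"request_disk = {disk_kb}",
    ]


for line in resource_lines(memory_gb=2, disk_gb=1):
    print(line)
```

Mixing up the two units is an easy way to request a thousand times too much or too little, so it is worth checking the values against a small test job.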

Another submit file

Submit to HTCondor

Check the queue in HTCondor

Submit to HTCondor: commands

condor_submit submitfile
condor_q username
condor_rm jobID or username
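For scripted use, these three commands can be assembled as argument lists and handed to `subprocess.run` on a submit node. The sketch below only builds and prints the command lines, so it can be tried anywhere; `myjob.sub`, `username` and the job ID `1234.0` are placeholders.

```python
import shlex


def submit_cmd(submit_file: str) -> list:
    """Command that submits the jobs described in a submit file."""
    return ["condor_submit", submit_file]


def queue_cmd(username: str) -> list:
    """Command that lists a user's jobs in the queue."""
    return ["condor_q", username]


def remove_cmd(job_id_or_username: str) -> list:
    """Command that removes one job (by ID) or all of a user's jobs."""
    return ["condor_rm", job_id_or_username]


for cmd in (submit_cmd("myjob.sub"), queue_cmd("username"), remove_cmd("1234.0")):
    # On a submit node one could call: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```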

Submit to HTCondor

[Diagram: Submit node, HTCondor, Execute node desk22 (where job runs)]

CHTC pools

simon → contact Mike Camilleri, mikec@stat.wisc.edu

CHTC → go to chtc.cs.wisc.edu and click "Get Started"

Checklist

✓ Input files
✓ My code
✓ Submit file
✓ condor_submit submitfile
✓ Monitor/remove jobs

Is it that simple?

Code in:

- C++
- Java
- Matlab
- R

chtc.cs.wisc.edu

Steps to run R code in HTCondor

1. Create R script
2. Take your R version and libraries with you
3. Download the ChtcRun package and follow the instructions
4. Run mkdag and submit jobs

Step 1: R script

Create your R script

Debug and test thoroughly locally

Organize your input files

Keep in mind what output files you are expecting

Warning: there is no folder structure on the execute node
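Because a job's files end up in a single flat directory on the execute node, two inputs of the same job that share a file name in different local folders would clash, and the R script should refer to its files by name only. A minimal sketch of a pre-flight check; the paths are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_name_clashes(paths):
    """Group input paths by file name and return the names used more than once."""
    by_name = defaultdict(list)
    for p in paths:
        by_name[PurePosixPath(p).name].append(p)
    return {name: ps for name, ps in by_name.items() if len(ps) > 1}


inputs = ["data/run1/infile.txt", "data/run2/infile.txt", "scripts/analysis.R"]
clashes = find_name_clashes(inputs)
for name, ps in clashes.items():
    print(f"{name} appears in: {', '.join(ps)}")
```

Renaming the clashing files locally (for example `run1_infile.txt`) before submitting avoids one silently overwriting the other.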

Step 2: R version, libraries

- R version (list of options on the CHTC website)
- List of needed R packages (order important): copy source .tar.gz files
- You need to be on a CHTC (or simon) submit node
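The remark that order is important presumably reflects package dependencies: a package has to be installed after the packages it depends on. A sketch of how a valid order can be derived with Python's `graphlib` (Python 3.9 or later); the package names and the dependency map are illustrative only, so check each package's own DESCRIPTION file.

```python
from graphlib import TopologicalSorter

# Maps each package to the packages it depends on (illustrative values).
depends_on = {
    "pkgC": {"pkgA", "pkgB"},
    "pkgB": {"pkgA"},
    "pkgA": set(),
}

# static_order() yields dependencies before the packages that need them.
install_order = list(TopologicalSorter(depends_on).static_order())
print(install_order)
```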

Step 3: ChtcRun package

Organize your input files as:

ChtcRun/
  Rin/
    0/       infile.txt, <specific files>
    1/       infile.txt, <specific files>
    job2/    infile.txt, <specific files>
    shared/  RLIBS.tar.gz, yourScript.r, <shared files>
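The layout above can be scaffolded with a few lines of Python. This is a sketch under assumptions: it creates empty placeholder files in a temporary directory, whereas in practice you would copy your real inputs, R library tarball and script into place; the function name `make_layout` is ours.

```python
import tempfile
from pathlib import Path


def make_layout(root: Path, job_names, shared_files):
    """Create Rin/<job>/infile.txt for each job, plus Rin/shared/."""
    rin = root / "ChtcRun" / "Rin"
    for job in job_names:
        job_dir = rin / job
        job_dir.mkdir(parents=True, exist_ok=True)
        (job_dir / "infile.txt").touch()   # job-specific input (placeholder)
    shared = rin / "shared"
    shared.mkdir(parents=True, exist_ok=True)
    for name in shared_files:
        (shared / name).touch()            # files every job needs (placeholder)
    return rin


with tempfile.TemporaryDirectory() as tmp:
    rin = make_layout(Path(tmp), ["0", "1", "job2"], ["RLIBS.tar.gz", "yourScript.r"])
    for path in sorted(rin.rglob("*")):
        print(path.relative_to(tmp))
```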

Modify process.template with:

request_memory, request_disk (run a small test first)

wantFlocking? wantGlidein?

Step 4: Run mkdag

Run ./mkdag with options

See ./mkdag --help for examples

./mkdag --data=Rin --outputdir=Rout --cmdtorun=soartest.R --pattern=meanx --type=R --version=R-2.10.1

Message: All done!

Step 4: Submit your jobs

mkdag will instruct you:

cd Rout
condor_submit_dag mydag.dag

Don’t forget to monitor your jobs: condor_q and check log files for memory/disk requirements

Important things to remember!

- Test your code locally
- There is no folder structure on the execute node
- Scale up in HTCondor gradually: 3 jobs, 100 jobs, 1000 jobs...
- Adjust memory/disk requirements
- Avoid transferring large files through the submit node (> 10 GB per batch, ∼ 10 MB per job, 1000 jobs)
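The thresholds in the last item are linked: about 10 MB per job across 1000 jobs already adds up to about 10 GB per batch. A quick check in Python, assuming decimal units (1 GB = 1000 MB); the function name is ours.

```python
def batch_transfer_gb(mb_per_job: float, n_jobs: int) -> float:
    """Total data moved through the submit node for one batch, in GB."""
    return mb_per_job * n_jobs / 1000  # decimal units: 1 GB = 1000 MB


for n_jobs in (3, 100, 1000):
    total = batch_transfer_gb(10, n_jobs)
    print(f"{n_jobs:>5} jobs x 10 MB = {total:g} GB")
```

Running the same calculation with your own per-job file sizes before each scale-up step shows when a batch would cross the limit.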

Workflow - HTCondor DAGMan

http://research.cs.wisc.edu/htcondor/dagman/dagman.html
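DAGMan reads a plain-text DAG file in which `JOB` lines name each job's submit file and `PARENT ... CHILD ...` lines give the order in which jobs may run. A minimal sketch that builds such a file for a three-step workflow; the job names and the `.sub` file names are hypothetical.

```python
def render_dag(jobs: dict, edges: list) -> str:
    """jobs maps job name -> submit file; edges is a list of (parent, child) pairs."""
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs.items()]
    lines += [f"PARENT {parent} CHILD {child}" for parent, child in edges]
    return "\n".join(lines) + "\n"


dag_text = render_dag(
    jobs={"prepare": "prepare.sub", "analyze": "analyze.sub", "summarize": "summarize.sub"},
    edges=[("prepare", "analyze"), ("analyze", "summarize")],
)
print(dag_text)
# On a submit node, save the text (e.g. as mydag.dag) and run:
#   condor_submit_dag mydag.dag
```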

Help resources

chtc.cs.wisc.edu

chtc@cs.wisc.edu

research.cs.wisc.edu/htcondor/tutorials/

CHTC office hours: Wednesday 9:30-11:30am
