Condor Project Computer Sciences Department University of Wisconsin-Madison Running Interpreted Jobs

Running Interpreted Jobs

Feb 23, 2016


aneko

Transcript
Page 1: Running Interpreted Jobs

Condor Project
Computer Sciences Department
University of Wisconsin-Madison

Running Interpreted Jobs

Page 2: Running Interpreted Jobs

www.cs.wisc.edu/Condor

Overview

› Many folks running Matlab, R, etc.

› Interpreters complicate Condor jobs

› Let’s talk about best practices.

Page 3: Running Interpreted Jobs


What’s R?

#!/usr/bin/R
X <- c(5, 7, 9)
cat (X)

What could possibly go wrong?

Page 4: Running Interpreted Jobs


Submit file

universe = vanilla
executable = foo.r
output = output_file
error = error_file
log = log
queue

Page 5: Running Interpreted Jobs


What’s so hard?

#!/usr/bin/R

What if /usr/bin/R isn’t there?

#!/usr/bin/env R isn’t good enough: Condor doesn’t set the PATH for a Condor job.

Page 6: Running Interpreted Jobs


Pre-staging: One (not-so-good) solution

If you control the site, pre-stage R

#!/software/R/bin/R

› Fragile!

Page 7: Running Interpreted Jobs


Pre-staging: If you must… “test and advertise”

Use a Daemon ClassAd hook like:

STARTD_CRON_JOBLIST = R_INFO
STARTD_CRON_R_INFO_PREFIX =
STARTD_CRON_R_INFO_EXECUTABLE = \
    $(STARTD_CRON_MODULES)/r_info
STARTD_CRON_R_INFO_PERIOD = 1h
STARTD_CRON_R_INFO_MODE = periodic
STARTD_CRON_R_INFO_RECONFIG = false
STARTD_CRON_R_INFO_KILL = true
STARTD_CRON_R_INFO_ARGS =

Page 8: Running Interpreted Jobs


R_info script contents

#!/bin/sh
# Advertise has_r only if the R directory exists and R actually runs.
if [ -d /path/to/R/bin ] && /path/to/R/bin/R --version > /dev/null 2>&1
then
    echo "has_r = true"
fi

What about multiple installations of R?
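One possible way to handle multiple installations is to probe a list of candidate directories and advertise the first one that works. This is only a sketch: the candidate paths and the `r_path` attribute name are made up for illustration and are not part of the original hook.

```sh
#!/bin/sh
# Hypothetical sketch: probe several candidate R locations and advertise
# the first one that actually runs. Candidate paths are placeholders.

# find_r DIR...: print the first DIR whose bin/R exists and answers --version.
find_r() {
    for dir in "$@"; do
        if [ -x "$dir/bin/R" ] && "$dir/bin/R" --version > /dev/null 2>&1; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# advertise_r DIR...: emit ClassAd attribute lines for the hook's output.
# r_path is an illustrative attribute name, not a standard one.
advertise_r() {
    if r_home=$(find_r "$@"); then
        echo "has_r = true"
        echo "r_path = \"$r_home\""
    fi
}

# Example call with placeholder paths:
#   advertise_r /path/to/R /software/R
```

Nothing is printed when no candidate works, so the machine simply does not advertise has_r.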

Page 9: Running Interpreted Jobs


Pre-staging is bad

› Limits where your job can run
› Must be an administrator to set up
› Difficult to change
  – Pre-staged files can change unexpectedly
    • Upgrade, new system installation, disk problems, …

Page 10: Running Interpreted Jobs


Solution: take it with you

› Bundle up the whole runtime
› Transfer the bundle with the job
› Wrapper script unbundles and runs
› Downsides:
  – Extra time overhead to unbundle
  – Not so good for short* jobs
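Bundling the runtime could look roughly like the sketch below. It assumes the runtime lives in a single directory and still works after being moved (not every installation does); the function name and paths are illustrative.

```sh
#!/bin/sh
# make_bundle RUNTIME_DIR BUNDLE: pack the contents of RUNTIME_DIR so that
# unpacking in the job's scratch directory recreates ./bin, ./lib, ...
make_bundle() {
    runtime_dir=$1
    bundle=$2
    # -C changes into the runtime directory first, so archive paths are
    # relative (./bin/R rather than /software/R/bin/R).
    tar czf "$bundle" -C "$runtime_dir" .
}

# Example call with placeholder paths:
#   make_bundle /software/R runtime.tar.gz
```

Keeping the archive paths relative is what lets a wrapper unpack in its scratch directory and find ./bin/R.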

Page 11: Running Interpreted Jobs


Benefits

› Can run anywhere*:
  – Flocked, Campus Grids, OSG, etc.
› Each job can have own runtime version/configuration.

Page 12: Running Interpreted Jobs


Revised submit file

universe = vanilla
executable = wrapper.sh
output = output_file
error = error_file
transfer_input_files = runtime.tar.gz, foo.r
should_transfer_files = true
when_to_transfer_output = on_exit
log = log
queue

Page 13: Running Interpreted Jobs


wrapper.sh

#!/bin/sh
tar xzf runtime.tar.gz
./bin/R foo.r
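The wrapper on the slide assumes every step succeeds. A slightly more defensive variant, sketched here with a hypothetical `run_bundled` helper, stops early when the bundle cannot be unpacked and passes the interpreter's exit status back to Condor.

```sh
#!/bin/sh
# run_bundled BUNDLE SCRIPT: unpack BUNDLE in the current directory and
# run SCRIPT with the bundled interpreter, as wrapper.sh does.
run_bundled() {
    bundle=$1
    script=$2
    if ! tar xzf "$bundle"; then
        echo "failed to unpack $bundle" >&2
        return 1
    fi
    if [ ! -x ./bin/R ]; then
        echo "no ./bin/R in $bundle" >&2
        return 1
    fi
    # The interpreter's exit status becomes the function's status, so a
    # failing script shows up as a failing job.
    ./bin/R "$script"
}

# In wrapper.sh this would be called as:
#   run_bundled runtime.tar.gz foo.r
```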

Page 14: Running Interpreted Jobs


Downside: Those Huge Runtimes

› Full R, Matlab runtime 100 MB
  – Adds up when running thousands of jobs
› Trivia: How long to transfer 100 MB?
  – Is this really a problem?
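As a rough answer to the trivia question, the ideal transfer time is size times 8 divided by link speed. The sketch below assumes the runtime is 100 megabytes and uses two example link speeds that are not from the talk; it ignores protocol overhead and contention.

```sh
#!/bin/sh
# transfer_seconds SIZE_MB LINK_MBIT: ideal transfer time in seconds.
# 1 byte = 8 bits, so SIZE_MB megabytes is SIZE_MB * 8 megabits.
transfer_seconds() {
    awk -v mb="$1" -v mbit="$2" 'BEGIN { printf "%.1f\n", mb * 8 / mbit }'
}

# transfer_seconds 100 1000    # 100 MB over 1 Gbit/s   -> 0.8
# transfer_seconds 100 100     # 100 MB over 100 Mbit/s -> 8.0
# transfer_seconds 100000 1000 # 1000 jobs over 1 Gbit/s -> 800.0
```

A single transfer is quick; the thousand-job case is where the submit machine's link starts to matter.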

Page 15: Running Interpreted Jobs


Mitigating Huge Runtimes

1. First, trim the bundle down (identify unneeded files with strace)

2. Second, perhaps run > 1 task per job

3. Finally, cache with Squid
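Running more than one task per job spreads the unbundling cost over several scripts. A minimal sketch, assuming the bundle unpacks to ./bin/R as in the wrapper and using a hypothetical `run_tasks` helper:

```sh
#!/bin/sh
# run_tasks BUNDLE SCRIPT...: unpack the runtime once, then run each
# SCRIPT in turn. Succeeds only if every task succeeds.
run_tasks() {
    bundle=$1
    shift
    tar xzf "$bundle" || return 1
    failed=0
    for script in "$@"; do
        if ! ./bin/R "$script"; then
            echo "task $script failed" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example call with placeholder script names:
#   run_tasks runtime.tar.gz task1.r task2.r task3.r
```

The extra scripts would also need to be listed in transfer_input_files.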

Page 16: Running Interpreted Jobs


Users, not admins

Page 17: Running Interpreted Jobs

http://condor-wiki.cs.wisc.edu

Page 18: Running Interpreted Jobs


Using HTTP/Squid

  – Change wrapper to manually wget
  – Set env http_proxy to squid source
    • OSG_SQUID_LOCATION in OSG
    • Otherwise, set with Daemon ClassAd hooks and $$
  – Cut runtime.tar.gz from transfer_input_files, add
        wget --retry-connrefused --waitretry=10 your_http_server
    to the wrapper script (note the retries)
› Don’t use curl!
  – Or set -H pragma
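The download step in the wrapper might be sketched as follows. The `fetch_runtime` name is hypothetical, and the sketch assumes OSG_SQUID_LOCATION holds a host:port value; outside OSG it relies on http_proxy already being set by some other means.

```sh
#!/bin/sh
# fetch_runtime URL: download the runtime bundle over HTTP, through the
# site's Squid proxy when one is advertised in the environment.
fetch_runtime() {
    url=$1
    # Assumption: OSG_SQUID_LOCATION, when set, looks like host:port.
    if [ -n "${OSG_SQUID_LOCATION:-}" ]; then
        http_proxy="http://$OSG_SQUID_LOCATION"
        export http_proxy
    fi
    # Retry on refused connections, waiting up to 10 seconds between tries.
    wget --retry-connrefused --waitretry=10 "$url"
}

# In the wrapper, before unpacking:
#   fetch_runtime http://your_http_server/runtime.tar.gz || exit 1
#   tar xzf runtime.tar.gz
```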

Page 19: Running Interpreted Jobs


Matlab complications

› Licensing…
  – Octave (?)
  – Matlab compiler!
› Matlab parallel toolkit
  – HTPC

Page 20: Running Interpreted Jobs


Cross Platform submit

› Many grids > 1 platform:
  – Unix vs. Windows; 32 vs 64 bit
› Huge benefit of High Level language:
  – Write once, run, … well…
› Use Condor $$ to expand:

Page 21: Running Interpreted Jobs


executable = wrapper.$$(OPSYS).bat

› Condor will expand OPSYS to LINUX or WINNT<XX>

› Write both wrappers, make sure to wget correct runtime
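On the Unix side, picking the correct runtime could be as simple as mapping the machine architecture to a bundle name. This is a sketch only: the bundle file names are hypothetical, and the Windows wrapper would need its own equivalent.

```sh
#!/bin/sh
# runtime_for_arch ARCH: map a machine architecture (as printed by
# `uname -m`) to a bundle name. The file names are hypothetical.
runtime_for_arch() {
    case $1 in
        x86_64|amd64) echo "runtime-64.tar.gz" ;;
        i?86)         echo "runtime-32.tar.gz" ;;
        *)            echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# In the Linux wrapper:
#   bundle=$(runtime_for_arch "$(uname -m)") || exit 1
#   wget --retry-connrefused --waitretry=10 "http://your_http_server/$bundle"
```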

Page 22: Running Interpreted Jobs


Summary

› Many folks running lots of interpreted jobs
› Transferring the runtime along is beneficial, but requires setup
› Cross platform submits can be a huge win