The Campus Cluster
What is the Campus Cluster?
Batch job system
High throughput
High latency
Available resources: ~450 nodes
12 cores/node
24-96 GB memory per node
Shared high-performance filesystem
High-speed multi-node message passing
What isn't the Campus Cluster?
Not: an instantly available computation resource
You can wait up to 4 hours for a node
Not: high-I/O friendly
Network disk access can hurt performance
Getting Set Up
Getting started
Request an account: https://campuscluster.illinois.edu/invest/user_form.html
Connecting:
ssh to taub.campuscluster.illinois.edu
Use your NetID and AD password
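For example, from your own machine (netid is a placeholder for your NetID):
$ ssh netid@taub.campuscluster.illinois.edu
Then enter your AD password at the prompt.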
Where to put data
Home directory (~/): backed up; currently no quota (in the future, tens of GB)
/scratch (~10 TB): use for temporary data
Scratch data is currently deleted after ~3 months
Available on all nodes
No backup
/scratch.local (~100 GB): local to each node, not shared across the network
Beware that other users may fill the disk
/projects/VisionLanguage/ (~15 TB): backed up
Keep things tidy by creating a directory for your NetID
Current filesystem best practices (should improve for Cluster v2):
Try to do batch writes to one large file
Avoid many little writes to many little files
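For instance, to set up a personal directory under the project space (netid is a placeholder):
[iendres2 ~]$ mkdir -p /projects/VisionLanguage/netid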
Backup = Snapshots
(Just learned this yesterday) Snapshots are taken daily
Not intended for disaster recovery: they are stored on the same disk as the data
Intended for accidental deletes/overwrites, etc.
Backed-up data can be accessed at:
/gpfs/ddn_snapshot/.snapshots/<date>/<path>
e.g. recover an accidentally deleted file in your home directory:
/gpfs/ddn_snapshot/.snapshots/2012-12-24/home/iendres2/christmas_list
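Restoring is then just a copy out of the snapshot tree, using the slide's example path:
[iendres2 ~]$ cp /gpfs/ddn_snapshot/.snapshots/2012-12-24/home/iendres2/christmas_list ~/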
Moving data to/from cluster
Only option right now is sftp/scp
SSHFS lets you mount a directory from remote machines
Haven't tried this, but it might be useful
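For example, with scp (filenames and paths are illustrative):
$ scp data.tar.gz netid@taub.campuscluster.illinois.edu:~/
$ scp netid@taub.campuscluster.illinois.edu:/scratch/netid/results.mat .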
Modules
Manages the environment; typically used to add software to your path.
To get the latest version of Matlab:
[iendres2 ~]$ module load matlab/7.14
To find modules such as vim, svn:
[iendres2 ~]$ module avail
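To see what is currently loaded, the standard module subcommand is:
[iendres2 ~]$ module list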
Useful Startup Options
Appended to the end of my .bashrc:
# Make default permissions the same for user and group; useful on a joint project
umask u=rwx,g=rwx
# Safer alternative: don't allow group writes
umask u=rwx,g=rx
# Load common modules
module load vim
module load svn
module load matlab
Submitting Jobs
Queues
Primary (VisionLanguage)
Nodes we own (currently 8)
Jobs can last 72 hours
We have priority access
Secondary (secondary)
Anyone else's idle nodes (~500)
Jobs can only last 4 hours; they are automatically killed after that
Not unusual to wait 1-2 hours for a job to begin running
Scheduler
Typically behaves as first come, first served
Priority scheduling is claimed, but we don't know how it works
Types of job
Batch jobs
No graphics; runs and completes without user interaction
Interactive jobs
Brings a remote shell to your terminal
X-forwarding available for graphics
Both wait in the queue the same way
Scheduling jobs
Batch jobs:
[iendres2 ~]$ qsub job_script
The job_script defines the parameters of the job and the actual command to run (details on job scripts to follow)
Interactive jobs:
[iendres2 ~]$ qsub -q <queue> -I -l walltime=00:30:00,nodes=1:ppn=12
Include -X for X-forwarding (details on -l parameters to follow)
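For instance, a concrete interactive request on the secondary queue with X-forwarding might look like (values illustrative):
[iendres2 ~]$ qsub -q secondary -I -X -l walltime=00:30:00,nodes=1:ppn=12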
Configuring Jobs
Basics
Parameters of a job are defined by a bash script that contains PBS directives followed by the commands to execute:

#PBS -q VisionLanguage
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
cd ~/workdir/
echo This is job number ${PBS_JOBID}

Going through it line by line:
#PBS -q: the queue to use, either VisionLanguage or secondary
#PBS -l nodes=1:ppn=12: the number of nodes is 1 unless you are using MPI or other distributed programming; processors per node is always 12, because the smallest computation unit is a physical node, which has 12 cores (with current hardware)*
*Some queues are configured to allow multiple concurrent jobs per node, but this is uncommon
#PBS -l walltime: the maximum time the job will run; it is killed if it exceeds this (72:00:00 for the primary queue, 04:00:00 for the secondary queue)
Bash commands are allowed anywhere in the script and will be executed on the scheduled worker node after all PBS directives are handled
There are some reserved variables that the scheduler fills in once the job is scheduled (see `man qsub` for more variables).
Scheduler variables (from the manpage):
PBS_O_HOST: the name of the host upon which the qsub command is running
PBS_SERVER: the hostname of the pbs_server to which qsub submits the job
PBS_O_QUEUE: the name of the original queue to which the job was submitted
PBS_O_WORKDIR: the absolute path of the current working directory of the qsub command
PBS_ARRAYID: each member of a job array is assigned a unique identifier (see -t)
PBS_ENVIRONMENT: set to PBS_BATCH to indicate a batch job, or to PBS_INTERACTIVE to indicate an interactive job (see the -I option)
PBS_JOBID: the job identifier assigned to the job by the batch system
PBS_JOBNAME: the job name supplied by the user
PBS_NODEFILE: the name of the file containing the list of nodes assigned to the job (for parallel and cluster systems)
PBS_QUEUE: the name of the queue from which the job is executed
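As a quick sketch of how these get used (queue, walltime, and echo text are illustrative, not from the original deck):
#PBS -q secondary
#PBS -l nodes=1:ppn=12
#PBS -l walltime=01:00:00
# Run from the directory qsub was invoked in, rather than hard-coding a path
cd ${PBS_O_WORKDIR}
echo "Job ${PBS_JOBID} (${PBS_JOBNAME}) running in queue ${PBS_QUEUE}"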
Monitoring Jobs
[iendres2 ~]$ qstat
Sample output:
JOBID            JOBNAME           USER      WALLTIME  STATE  QUEUE
333885[].taubm1  r-afm-average     hzheng8   0         Q      secondary
333899.taubm1    test6             lee263    03:33:33  R      secondary
333900.taubm1    cgfb-a            dcyang2   09:22:44  R      secondary
333901.taubm1    cgfb-b            dcyang2   09:31:14  R      secondary
333902.taubm1    cgfb-c            dcyang2   09:28:28  R      secondary
333903.taubm1    cgfb-d            dcyang2   09:12:44  R      secondary
333904.taubm1    cgfb-e            dcyang2   09:27:45  R      secondary
333905.taubm1    cgfb-f            dcyang2   09:30:55  R      secondary
333906.taubm1    cgfb-g            dcyang2   09:06:51  R      secondary
333907.taubm1    cgfb-h            dcyang2   09:01:07  R      secondary
333908.taubm1    ...conp5_38.namd  harpole2  0         H      cse
333914.taubm1    ktao3.kpt.12      chandini  03:05:36  C      secondary
333915.taubm1    ktao3.kpt.14      chandini  03:32:26  R      secondary
333916.taubm1    joblammps         daoud2    03:57:06  R      cse

States:
Q: Queued, waiting to run
R: Running
H: Held, by user or admin; won't run until released (see qhold, qrls)
C: Closed, finished running
E: Error; this usually doesn't happen and indicates a problem with the cluster

grep is your friend for finding specific jobs
(e.g. qstat -u iendres2 | grep R gives all of my running jobs)
Managing Jobs
qalter, qdel, qhold, qmove, qmsg, qrerun, qrls, qselect, qsig, qstat
Each takes a jobid plus some arguments
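For example (jobid taken from the qstat output above):
[iendres2 ~]$ qdel 333899.taubm1    # kill a job
[iendres2 ~]$ qhold 333900.taubm1   # hold a queued job
[iendres2 ~]$ qrls 333900.taubm1    # release it again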
Problem: I want to run the same job with multiple parameters, where:
param1 = {a, b, c}
param2 = {1, 2, 3}

#PBS -q VisionLanguage
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
cd ~/workdir/
./script

Solution: Create a wrapper script to iterate over the params
Problem 2: I can't pass parameters into my job script
Solution 2: Hack it!
We can pass parameters via the jobname and delimit them using the "-" character (or whatever you want):

#PBS -q VisionLanguage
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
# Pass parameters via jobname:
export IFS="-"
i=1
for word in ${PBS_JOBNAME}; do
  echo $word
  arr[i]=$word
  ((i++))
done
# Stuff to execute
echo Jobname: ${arr[1]}
cd ~/workdir/
echo ${arr[2]} ${arr[3]}
Submit with:
qsub -N job-param1-param2 job_script
qsub's -N parameter sets the job name
Output would be:
Jobname: job
param1 param2
Back to the original problem of running the same job with multiple parameters. Now loop!

#!/bin/bash
param1=({a,b,c})
param2=({1,2,3}) # or {1..3}
for p1 in ${param1[@]}; do
  for p2 in ${param2[@]}; do
    qsub -N job-${p1}-${p2} job_script
  done
done
Problem 3: My job isn't multithreaded, but needs to run many times

#PBS -q VisionLanguage
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
cd ~/workdir/
./script ${idx}

Solution: Run 12 independent processes on the same node so 11 CPUs don't sit idle (see the script below)
#PBS -q VisionLanguage
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
cd ~/workdir/
# Run 12 jobs in the background
for idx in {1..12}; do
  ./script ${idx} & # Your job goes here (keep the ampersand)
  pid[idx]=$!       # Record the PID
done
# Wait for all the processes to finish
for idx in {1..12}; do
  echo waiting on ${pid[idx]}
  wait ${pid[idx]}
done
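A shorter variant, if per-process bookkeeping isn't needed: a bare wait with no arguments blocks until every background child has exited.
for idx in {1..12}; do
  ./script ${idx} &
done
wait # returns once all 12 background processes finish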
Matlab and The Cluster
Simple Matlab Sample
#PBS -q VisionLanguage
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
cd ~/workdir/
matlab -nodisplay -r "matlab_func(); exit;"
Matlab Sample: Passing Parameters
#PBS -q VisionLanguage
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
cd ~/workdir/
param=1
param2=\'string\' # Escape string parameters
matlab -nodisplay -r "matlab_func(${param}, ${param2}); exit;"
Compiling Matlab Code
Doesn't use any Matlab licenses once compiled
Compiles Matlab code into a standalone executable
Constraints:
Code can't call addpath
Functions called by eval, str2func, or other implicit methods must be explicitly identified, e.g. for eval('do_this') to work, you must also include %#function do_this
To compile (within Matlab):
>> addpath('everything that should be included')
>> mcc -m function_to_compile.m
isdeployed() is useful for modifying behavior for compiled applications
(returns true if the code is running as the compiled version)
Running Compiled Matlab Code
Requires the Matlab Compiler Runtime (MCR):
>> mcrinstaller % This will point you to the installer and help install it
                % Make note of the installed path, MCRPATH (e.g. /mcr/v716/)
Compiling generates two files:
function_to_compile and run_function_to_compile.sh
To run:
[iendres2 ~]$ ./run_function_to_compile.sh MCRPATH param1 param2 ... paramk
Params will be passed into the Matlab function as usual, except they will always be strings
Useful trick:
function function_to_compile(param1, param2, ..., paramk)
if(isdeployed)
    param1 = str2num(param1);
    % param2 expects a string
    paramk = str2num(paramk);
end
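Putting the pieces together, a job script for the compiled executable might look like this (queue, MCR path, and parameters are illustrative):
#PBS -q secondary
#PBS -l nodes=1:ppn=12
#PBS -l walltime=04:00:00
cd ~/workdir/
# Compiled code needs no Matlab license; pass the MCR install path first
./run_function_to_compile.sh /mcr/v716/ param1 param2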
Parallel For Loops on the Cluster
Not designed for multiple nodes on a shared filesystem:
Race condition from concurrent writes to ~/.matlab/local_scheduler_data/
Easy fix: redirect the directory to /scratch.local
1. Setup (done once, before submitting jobs):
[iendres2 ~]$ ln -s /scratch.local/tmp/USER/matlab/local_scheduler_data ~/.matlab/local_scheduler_data
(Replace USER with your netid)
2. Wrap the matlabpool function to make sure the tmp data exists:

function matlabpool_robust(varargin)
if(matlabpool('size') > 0)
    matlabpool close
end
% Make sure the directories exist and are empty for good measure
system('rm -rf /scratch.local/tmp/USER/matlab/local_scheduler_data');
system(sprintf('mkdir -p /scratch.local/tmp/USER/matlab/local_scheduler_data/R%s', version('-release')));
% Run it:
matlabpool(varargin{:});

Warning: /scratch.local may get filled up by other users, in which case this will fail.
Best Practices
Interactive sessions
Don't leave idle sessions open; it ties up the nodes
Job arrays
Still working out kinks in the scheduler; I managed to kill the whole cluster
Disk I/O
Minimize I/O for best performance
Avoid small reads and writes due to metadata overhead
Maintenance
Preventive maintenance (PM) on the cluster is generally scheduled on the third Wednesday of each month, from 8 a.m. to 8 p.m. Central Time. The cluster will be returned to service earlier if maintenance finishes ahead of schedule.