Computing Workshop for Users of NCAR’s SCD machines
Christiane Jablonowski ([email protected])
NCAR ASP/SCD, 31 January 2006
ML Mesa Lab, Chapman Room; video conference facilities: FL EOL Atrium and CG1 3150
Overview
– Current machine architectures at NCAR (SCD)
– Some basics on parallel computing
– Batch queuing systems at NCAR
– GAU resources & how to obtain a GAU account
– Insights into GAU charges
– The Mass Storage System
– How to monitor the GAUs
– Some practical tips on benchmarks, debugging tools, restarts…
Computer architectures
SCD’s machines are UNIX-based parallel computing architectures
Two types:
– Hybrid (shared and distributed memory) machines: processors on a node have direct access to memory (shared memory); nodes are connected via the network (distributed memory)
MPI example
Processors communicate via messages
MPI Example
Initialize & finalize MPI in your program via function/subroutine calls to the MPI library. Examples include: MPI_Init, MPI_Comm_rank, MPI_Comm_size, MPI_Finalize
Example from previous page in C notation (unoptimized):
Important to note: such an operation (computing a global sum) is very common, therefore MPI provides a highly optimized function, also called a ‘reduction operation’ MPI_Reduce (…) that can replace the example above
Example: domain decompositions for MPI
Each color represents a processor
OpenMP Example
Parallel loops via compiler directives (here: in Fortran notation). Before the program is called, set:
setenv OMP_NUM_THREADS #proc
Add compiler directives in your code:
!$OMP PARALLEL DO
DO i = 1, n
  a(i) = b(i) + c(i)
END DO
!$OMP END PARALLEL DO
(Figure: the master thread forks into a team of threads for the parallel loop, then joins back into the master thread)
Assume n=1000 & #proc=4: The loop will be split into 4 ‘threads’ that run in parallel with loop indices 1…250, 251…500, 501…750, 751…1000
SCD’s machines
Bluesky (web page)
– ‘Oldest’ machine at NCAR (2002)
– Lots of user experience at NCAR, easy access to help
– CAM/CCSM/WRF are set up for this architecture (Makefiles)
– Batch queuing system LoadLeveler, short interactive runs possible
– Batch queues are listed under
Bluevista (web page)
– Newest machine on the floor (Jan. 2006)
– CAM/CCSM/WRF are (probably) set up for this architecture
– Batch queuing system LSF (Load Sharing Facility)
– Queue names different from bluesky: premium, regular, economy,
ASP GAU account number: 54042108 (also project number)
– Needs to be specified in the batch job scripts
– The ASP account number is not your default account number
Therefore: everybody needs a second (default) GAU account:
– divisional GAU account
– so-called University account (small request form for 1500 GAUs: http://www.cisl.ucar.edu/resources/compServ.html); these GAUs do not expire every month, one-time allocation
The second GAU account should be used for the accumulating MSS charges
– automatic when using CAM / CCSM’s MSS option
GAU charges on SCD’s supercomputers
You are charged GAUs for how much time you use a processor (on bluesky, bluevista, lightning, tempest)
On bluesky, there are actually two formulas:
– Shared-node usage:
GAUs charged = CPU hours used × computer factor × class charging factor
– Dedicated-node usage:
GAUs charged = wallclock hours used × number of nodes used × number of processors in that node × computer factor × class charging factor
(Slides on GAU charges: modified from an earlier presentation by George Bryan, NCAR MMM)
“Number of nodes used” and “Number of processors in that node”

“CPU hours used” and “Wallclock hours used”
A measure of how long you “used” a processor. NOTE: this includes all the time you were allocated the use of a processor, whether you actually used it or not.
Example: you used two 8-processor nodes on bluesky. The job started at 1:00 PM and finished at 2:30 PM.
You are charged for 1.5 hrs
“Computer factor”
A measure of how powerful a computer is
– Bluesky: 0.24
– Bluevista: 0.5
– Lightning: 0.34
This “levels the playing field”
“Class charging factor”
Tied to queuing system: “How quickly do you want your results, and how much are you willing to pay for it?”
Current setting on all SCD supercomputers:
– Premium = 1.5 (highest priority, fastest turnaround)
– Regular = 1.0
– Economy = 0.5
– Standby = 0.1 (lowest priority, slow turnaround)
Example
Recall dedicated-node usage on bluesky:
GAUs charged = wallclock hours used × number of nodes used × number of processors in that node × computer factor × class charging factor
1.5 hours using two 8-processor nodes, bluesky regular queue:
GAUs used = 1.5 × 2 × 8 × 0.24 × 1.0 = 5.76 GAUs
In the premium queue, this would be 8.64 GAUs
In the standby queue, this would be 0.576 GAUs
Recommendations: Queuing systems
Check the queue before you submit any job:
– If the queue is not busy, try using the standby or economy queues
– The queue tends to be “emptier” evenings, weekends, and holidays
A job will start sooner when you specify a wallclock limit in the job script (the scheduler tries to ‘squeeze in’ short jobs)
The fewer processors you request, the sooner you start
Use the premium queue sparingly:
– Short debug jobs (there is also a special debug queue on lightning)
– When that conference paper is due
Recommendations: # of processors vs. run times
If you are using more processors, you might wait longer in the queue, but usually the actual runtime of your job is reduced
Caveat: it usually costs more GAUs. Example: you run the same job twice:
– Using 8 processors, the job ran in 24 hours– Using 64 processors, the job ran in 4 hours
– 1st example used 46 GAUs– 2nd example used 61 GAUs
The Mass Storage System
MSS: Mass storage system (disks and cartridges) for your big data sets
MSS connected to the SCD machines, sometimes also to divisional computers
MSS users have directories like mss:/LOGIN_NAME/
Quick online reference (mss commands): http://www.cisl.ucar.edu/docs/mss/mss-commandlist.html
You are charged GAUs for using the MSS
The GAU equation for the MSS is more complicated:
GAUs charged = 0.0837 R + 0.0012 A + N (0.1195 W + 0.2050 S)
where:
– R = gigabytes read
– W = gigabytes created or written
– A = number of disk drive or tape cartridge accesses
– S = data stored, in gigabyte-years
– N = number of copies of the file: 1 if economy reliability selected; 2 if standard reliability selected
MSS Charges
Recommendations: The MSS
MSS charges seem small, but they add up!
Examples: FY04 MSS usage
– ACD: 24,000 of 60,000 GAUs
– CGD: 94,500 of 181,000 GAUs
– HAO: 22,000 of 122,000 GAUs
– MMM: 34,000 of 139,000 GAUs
– RAP: 32,000 of 35,000 GAUs
Recommendations: The MSS
Recommendation for ASP users:
– use an account in your home division or your so-called ‘university’ account (1500 GAUs for postdocs, you need to apply) for MSS charges
– leave ASP GAUs for supercomputing
GAU Usage Strategy: 30-day and 90-day averages
The allocation actually works through 30-day and 90-day averages
Limits: 120% for 30-day use, 105% for 90-day use
It is helpful to spread usage out evenly
How to check GAU usage:
– Type “charges” on the command line of a supercomputer
– Check the “daily summary” output (next page)
– SCD Portal: look for the link on SCD’s main page
Web page: http://www.cisl.ucar.edu/dbsg/dbs/ASP/
Sample output:
ASP 30 Day Percent = 57.0 %   ASP 90 Day Percent = 48.3 %
30 Day Allocation  = 3850     90 Day Allocation  = 11550
30 Day Use         = 2193     90 Day Use         = 5575
90 DAY ST: 01-NOV-05   30 DAY ST: 31-DEC-05   LAST DAY: 29-JAN-06
SCD Portal
Online tool that helps you monitor the GAU charges and the current machine status (e.g. batch queues); the display can be customized
Information on the machine status requires a setup command on roy.scd.ucar.edu via the crypto-card access: just enter ‘scdportalkey hostname’ (e.g. lightning) after logging on with the crypto-card
At this time (Jan/31/2006) the GAU charges on bluevista are not itemized; this will be included in the next release in Spring 2006
Other IBM resources
Sources of information on the IBM machines (on bluesky, from the command line; batchview also works on bluevista & lightning):
– batchview for an overview of jobs with their rankings
– llq for a list of all submitted jobs, no ranking
– spinfo: queue limits, memory quotas on the home file system and the temporary file system /ptmp
– Useful IBM LoadLeveler keywords in the script:
#@ account_no = 54042108  -> ASP account
#@ ja_report  = yes       -> job report (see next page)
IBM Job Report
Operating System                 : blackforest AIX51
User Name (ID)                   : cjablono (7568)
Group Name (ID)                  : ncar (100)
Account Name                     : 54042108
Job Name                         : bf0913en.26921
Job Sequence Number              : bf0913en.26921
Job Starts                       : 12/20/04 17:56:33
Job Ends                         : 12/20/04 23:26:34
Elapsed Time (Wall-Clock * #CPU) : 633632 s
Number of Nodes (not_shared)     : 8
Number of CPUs                   : 32
Number of Steps                  : 1
IBM Job Report (continued)
Charge Components
Wall-clock Time                     : 5:30:01
Wall-clock CPU hours                : 176.00889 hrs
Multiplier for com_ec Queue         : 0.50
Charge before Computer Factor       : 88.00444 GAUs
Multiplier for computer blackforest : 0.10
Charged against Allocation          : 8.80044 GAUs
Project GAUs Allocated              : 5000.00 GAUs
Project GAUs Used, as of 12/16/04   : 1889.20 GAUs
Division GAUs 30-Day Average        : 103.3%
Division GAUs 90-Day Average        : 58.6%
How to increase the efficiency
Get a feel for the GAUs for long jobs: benchmark the application on the target machine
– Run a short but relevant test problem and measure the run time (wall-clock time) via MPI commands (function MPI_WTIME) or UNIX timing commands like time or timex (output formats are shell-dependent)
– Vary the number of processors to assess the scaling
– If the application scales poorly, avoid using a large number of processors (a waste of GAUs); instead use a smaller number with numerous restarts
– Make sure your job fits into the queue (finishes before the max. time is up)
Use compiler options, especially the optimization options
In case of programming problems: the Totalview debugger can save you days, weeks or even months
On the IBMs: compile your program with the compiler options -g -qfullpath -d
Restarts
Restart files are important for long simulations
– Queue limits are up to 6 wallclock hours (a hard limit; the job fails afterwards), so a restart then becomes necessary
– Get information on the queue limits (SCD web page) and select the job’s integration time accordingly
– Restarts are built into CAM/CCSM/WRF and must only be activated
– Restarts for other user applications probably must be programmed