Using Kure and Topsail Mark Reed Grant Murphy Charles Davis ITS Research Computing

Dec 18, 2015


Cassandra Mason
Using Kure and Topsail

Mark Reed, Grant Murphy, Charles Davis
ITS Research Computing

Outline

Compute Clusters
• Topsail
• Kure
Logging In
File Spaces
User Environment and Applications, Compiling
Job Management

Logistics

Course Format
Lab Exercises
Breaks
UNC Research Computing
• http://its.unc.edu/research
Getting started Topsail page
• http://help.unc.edu/6214
Getting started Kure page
• http://help.unc.edu/ccm3_015682

What is a compute cluster?
What exactly is Topsail? Kure?

What is a compute cluster?

Some Typical Components
Compute Nodes
Interconnect
Shared File System
Software
• Operating System (OS)
• Job Scheduler/Manager
Mass Storage

Compute Cluster Advantages

fast interconnect, tightly coupled
aggregated compute resources
large (scratch) file spaces
installed software base
scheduling and job management
high availability
data backup

Initial Topsail Cluster

Initially: 1040 CPU Dell Linux Cluster
• 520 dual socket, single core nodes
Infiniband interconnect
Intended for capability research
Housed in ITS Franklin machine room
Fast and efficient for large computational jobs

Topsail Upgrade 1

Topsail upgraded to 4,160 CPU
• replaced blades with dual socket, quad core
Intel Xeon 5345 (Clovertown) Processors
• Quad-Core with 8 CPU/node
Increased number of processors, but decreased individual processor speed (was 3.6 GHz, now 2.33 GHz)
Decreased energy usage and necessary resources for cooling system
Summary: slower clock speed, better memory bandwidth, less heat, quadrupled the core count
• Benchmarks tend to run at the same speed per core
• Topsail shows a net ~4X improvement
• Of course, this number is VERY application dependent

Topsail – Upgraded blades

52 Chassis: Basis of node names
• Each holds 10 blades -> 520 blades total
• Nodes = cmp-chassis#-blade#
Old Compute Blades: Dell PowerEdge 1855
• 2 single core Intel Xeon EM64T 3.6 GHz procs
• 800 MHz FSB
• 2 MB L2 Cache per socket
• Intel NetBurst MicroArchitecture
New Compute Blades: Dell PowerEdge 1955
• 2 quad core Intel 2.33 GHz procs
• 1333 MHz FSB
• 4 MB L2 Cache per socket
• Intel Core 2 MicroArchitecture
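The blade and core totals quoted on the upgrade slides follow directly from the chassis layout. A minimal shell sketch of that arithmetic is below; the variable names and the sample node name are illustrative only and are not part of any cluster tooling.

```shell
# Counts taken from the slides above.
chassis=52
blades_per_chassis=10
sockets_per_blade=2
cores_per_socket=4   # quad core after the first upgrade

blades=$((chassis * blades_per_chassis))
cores=$((blades * sockets_per_blade * cores_per_socket))
echo "blades: $blades"
echo "cores:  $cores"

# Node names follow cmp-chassis#-blade#; these numbers are only an example.
node_name="cmp-12-3"
echo "example node name: $node_name"
```

Setting cores_per_socket to 1 reproduces the 1040 CPU figure of the initial cluster.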

Topsail Upgrade 2

Most recent Topsail upgrade (Feb/Mar '09)
Refreshed much of the infrastructure
Improved IBRIX filesystem
Replaced and improved Infiniband cabling
Moved cluster to ITS-Manning building
• Better cooling and UPS

Top 500 History

The Top 500 list comes out twice a year
• ISC conference in June
• SC conference in Nov
Topsail debuted at 74 in June 2006
Peaked at 25 in June 2007
Still in the Top 500

Current Topsail Architecture

Login node: 8 CPU @ 2.3 GHz Intel EM64T, 12 GB memory
Compute nodes: 4,160 CPU @ 2.3 GHz Intel EM64T, 12 GB memory
Shared disk: 39 TB IBRIX Parallel File System
Interconnect: Infiniband 4x SDR
64-bit Linux Operating System

Multi-Core Computing

Processor Structure on Topsail
• 500+ nodes
• 2 sockets/node
• 1 processor/socket
• 4 cores/processor (Quad-core)
• 8 cores/node

http://www.tomshardware.com/2006/12/06/quad-core-xeon-clovertown-rolls-into-dp-servers/page3.html

Multi-Core Computing

The trend in High Performance Computing is towards multi-core or many-core computing.
More cores at slower clock speeds for less heat
Now, dual and quad core processors are becoming common.
Soon 64+ core processors will be common
• And these may be heterogeneous!

The Heat Problem

Taken From: Jack Dongarra, UT

More Parallelism

Taken From: Jack Dongarra, UT

Infiniband Connections

Connection comes in single (SDR), double (DDR), and quad data rates (QDR).
• Topsail is SDR.
Single data rate is 2.5 Gbit/s in each direction per link.
Links can be aggregated - 1x, 4x, 12x.
• Topsail is 4x.
Links use 8B/10B encoding (10 bits carry 8 bits of data), so the useful data transmission rate is four-fifths the raw rate. Thus single, double, and quad data rates carry 2, 4, or 8 Gbit/s respectively per 1x link.
Data rate for Topsail is 8 Gbit/s (4x SDR), i.e. about 1 GB/s.
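The 4x SDR figure can be checked with a few lines of shell arithmetic. This is only a sketch of the calculation described on the slide; rates are held in Mbit/s so that integer arithmetic suffices, and the variable names are illustrative.

```shell
# Signalling rate of one SDR lane, in Mbit/s (2.5 Gbit/s).
raw_per_lane=2500
lanes=4                      # Topsail uses 4x links

# 8B/10B encoding: 8 data bits for every 10 bits on the wire.
raw_mbit=$((raw_per_lane * lanes))
useful_mbit=$((raw_mbit * 8 / 10))
useful_mbyte=$((useful_mbit / 8))

echo "raw:    $raw_mbit Mbit/s"
echo "useful: $useful_mbit Mbit/s (about $useful_mbyte MB/s)"
```

Doubling or quadrupling raw_per_lane gives the corresponding DDR and QDR figures for the same link width.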

Topsail Network Topology

Infiniband Benchmarks

Point-to-point (PTP) intranode communication on Topsail for various MPI send types
Peak bandwidth:
• 1288 MB/s

Minimum Latency (1-way):
• 3.6 µs

Infiniband Benchmarks

Scaled aggregate bandwidth for MPI Broadcast on Topsail
Note good scaling throughout the tested range (from 24 to 1536 cores)

Kure

The newest, “latest and greatest” compute cluster in RC

Named after the beach in North Carolina

It’s pronounced like the Nobel prize winning physicist and chemist, Madame Curie

Kure Compute Cluster

Heterogeneous Research Cluster
Hewlett Packard Blades
79 Compute Nodes, mostly
• Xeon 5560 2.8 GHz
• Nehalem Microarchitecture
• Dual socket, quad core
• 48 GB memory
• over 600 cores
• some higher memory nodes
Infiniband 4x QDR
Priority usage for patrons
• Buy in is cheap
Storage
• Scratch space same as Emerald
• No AFS home

Kure Cont.

The current configuration of Kure is mostly homogeneous but it will become increasingly heterogeneous as patrons and others add to it.
Most login nodes are 48 GB but there are currently four high memory nodes
• 2 nodes each with 128 GB of memory
• 2 nodes each with 96 GB of memory

Topsail/Kure Comparison

Topsail
• homogeneous
• 4000+ cores
• 2.33 GHz cores, Intel Core microarch.
• 12 GB memory/node
• IB 4x SDR interconnect

Kure
• heterogeneous
• 600+ cores
• 2.8 GHz cores, Intel Nehalem microarch.
• 48 GB memory/node
• IB 4x QDR interconnect

Login to Topsail/Kure

Use ssh to connect:
• ssh topsail.unc.edu
• ssh kure.unc.edu
SSH Secure Shell with Windows
• see http://shareware.unc.edu/software.html
For use with X-Windows Display:
• ssh -X topsail.unc.edu or ssh -X kure.unc.edu
• ssh -Y topsail.unc.edu or ssh -Y kure.unc.edu
Off-campus users (i.e. domains outside of unc.edu) must use a VPN connection

File Spaces

Topsail File Space

Home directories
• /ifs1/home/<onyen>
• home directories over 15 GB are not backed up
Scratch Space
• /ifs1/scr/<onyen>
• over 39 TB of scratch space
• run jobs with large output in this space
Mass Storage
• ~/ms

Kure File Space

Home directories
• /nas02/home/<a>/<b>/<onyen>
  a = first letter of onyen, b = second letter of onyen
• hard limit of 15 GB
Scratch Space – still evolving
• /nas – to be upgraded to 15 TB
• /largefs – to be upgraded to 30 TB
• run jobs with large output in these spaces
Mass Storage
• ~/ms
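The Kure home directory layout can be illustrated by building the path from an onyen. The onyen "jdoe" below is a made-up placeholder; on the cluster itself the HOME variable already holds this path, so the sketch only serves to show how the two letter levels are derived.

```shell
# Hypothetical onyen used only for illustration.
onyen="jdoe"

a=$(printf '%s' "$onyen" | cut -c1)   # first letter of the onyen
b=$(printf '%s' "$onyen" | cut -c2)   # second letter of the onyen
home_dir="/nas02/home/$a/$b/$onyen"
echo "$home_dir"
```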

Mass Storage

“To infinity … and beyond” - Buzz Lightyear

long term archival storage

access via ~/ms

looks like ordinary disk file system – data is actually stored on tape

“limitless” capacity

data is backed up

For storage only, not a work directory (i.e. don’t run jobs from here)

if you have many small files, use tar or zip to create a single file for better performance

Sign up for this service on onyen.unc.edu
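The advice to bundle many small files before archiving can be sketched as follows. The directory and file names are invented for the example, and it runs in a temporary directory so it is safe to try anywhere; the final copy into ~/ms is shown only as a comment because it applies on the cluster.

```shell
# Work in a throwaway directory so the example is safe to run anywhere.
workdir=$(mktemp -d)
mkdir "$workdir/results"
for i in 1 2 3; do
    echo "sample data $i" > "$workdir/results/run_$i.dat"
done

# Bundle the whole directory into one compressed archive.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive to confirm what it contains.
file_count=$(tar -tzf "$workdir/results.tar.gz" | grep -c '\.dat$')
echo "archived $file_count data files"

# On the cluster, the single archive would then be copied to mass storage:
#   cp "$workdir/results.tar.gz" ~/ms/
```

Storing one archive rather than many small files means the tape-backed system handles a single object on both write and retrieval.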

User Environment and Applications, Compiling Code
Modules

Modules

The user environment is managed by modules
Modules modify the user environment by modifying and adding environment variables such as PATH or LD_LIBRARY_PATH
Typically you set these once and leave them
Note there are two module settings, one for your current environment and one to take effect on your next login (e.g. batch jobs running on compute nodes)

Common Module Commands

module avail
• module avail apps
module help
module list
module add
module rm

Login version
module initlist
module initadd
module initrm

For more on modules, see http://help.unc.edu/CCM3_006660

Parallel Jobs with MPI

There are three implementations of the MPI standard installed:
• mvapich
• mvapich2 (currently only on Topsail)
• openmpi
Performance is similar for all three, and all three run on the IB fabric.
Mvapich is the default.
Openmpi and mvapich2 have more of the MPI-2 features implemented.

Compiling MPI programs

Use the MPI wrappers to compile your program
• mpicc, mpiCC, mpif90, mpif77
• the wrappers will find the appropriate include files and libraries and then invoke the actual compiler
• for example, mpicc will invoke either gcc or icc depending upon which module you have loaded

Compiling on Topsail/Kure

Serial Programming
• Intel Compiler Suite for Fortran77, Fortran90, C and C++ - Recommended by Research Computing
  icc, icpc, ifort
• GNU
  gcc, g++, gfortran
Parallel Programming
• MPI (see previous page)
• OpenMP
  Compiler tag: -openmp for Intel, -fopenmp for GNU
  Must set OMP_NUM_THREADS in submission script
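A submission script that sets OMP_NUM_THREADS might look like the sketch below, which writes the script to a file and prints it. Several details are assumptions rather than site policy: the executable name myOpenMPExe and the file name openmp_job.bsub are placeholders, the bash shebang is one choice among several, and requesting 8 slots with span[ptile=8] is simply one way to keep all threads on a single 8-core node.

```shell
# Write a batch file for an 8-thread OpenMP run on one 8-core node.
# "myOpenMPExe" is a placeholder for your own executable, built for example
# with "icc -openmp" or "gcc -fopenmp" as noted on the slide.
cat > openmp_job.bsub <<'EOF'
#!/bin/bash
#BSUB -q week
#BSUB -o out.%J
#BSUB -n 8
#BSUB -R "span[ptile=8]"
export OMP_NUM_THREADS=8
./myOpenMPExe
EOF

cat openmp_job.bsub

# Submit on the cluster with:
#   bsub < openmp_job.bsub
```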

Debugging - Totalview

If you are debugging code there is a powerful commercial debugger, totalview
• See http://help.unc.edu/CCM3_021717
• parallel and serial code
• Fortran/C/C++
• GUI for source level control
• too many features to list!

Job Scheduling and Management

What does a Job Scheduler and batch system do?

Manage Resources
• allocate user tasks to resource
• monitor tasks
• process control
• manage input and output
• report status, availability, etc
• enforce usage policies

Job Scheduling Systems

Allocates compute nodes to job submissions based on user priority, requested resources, execution time, etc.
Many types of schedulers
• Load Sharing Facility (LSF) – Used by Topsail/Kure
• IBM LoadLeveler
• Portable Batch System (PBS)
• Sun Grid Engine (SGE)

LSF

All Research Computing clusters use LSF to do job scheduling and management
LSF (Load Sharing Facility) is a (licensed) product from Platform Computing
• Fairly distribute compute nodes among users
• enforce usage policies for established queues
  most common queues: int, now, week, month
• RC uses Fair Share scheduling, not first come, first served (FCFS)
LSF commands typically start with the letter b (as in batch), e.g. bsub, bqueues, bjobs, bhosts, …
• see man pages for much more info!

Simplified view of LSF

bsub -n 64 -a mvapich -q week mpirun myjob

[Diagram, in order:]
1. user logged in to login node submits job
2. job routed to queue (jobs queued: job_J, job_F, myjob, job_7)
3. job dispatched to run on available host which satisfies job requirements

Running Programs on Topsail

Upon ssh to Topsail/Kure, you are on the Login node.

Programs SHOULD NOT be run on Login node.

Submit programs to one of the many, many compute nodes.

Submit jobs using Load Sharing Facility (LSF) via the bsub command.

Common batch commands

bsub – submit jobs
bqueues – view info on defined queues
• bqueues -l week
bkill – stop/cancel submitted job
bjobs – view submitted jobs
• bjobs -u all
bhist – job history
• bhist -l <jobID>

Common batch commands

bhosts – status and resources of hosts (nodes)
bpeek – display output of running job
Use man pages to get much more info!
• man bjobs
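Most of the commands above take a job ID, which bsub reports when a job is accepted. The sketch below extracts the ID from a confirmation line of the form standard LSF prints; the sample line and job number are made up so the snippet runs without LSF, and the variable names are illustrative.

```shell
# Example of the confirmation line bsub prints; on the cluster you would use
#   submit_msg=$(bsub -q week -o out.%J myexe)
submit_msg='Job <12345> is submitted to queue <week>.'

# Pull the numeric job ID out of the first <...> pair.
jobid=$(printf '%s\n' "$submit_msg" | sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p')
echo "job ID: $jobid"

# With the ID in hand, the commands above apply to that job:
#   bjobs "$jobid"      # current status
#   bpeek "$jobid"      # output so far
#   bhist -l "$jobid"   # detailed history
#   bkill "$jobid"      # cancel it
```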

Submitting Jobs: bsub Command

Submit Jobs - bsub
• Run large jobs out of scratch space, smaller jobs can run out of your home space
bsub [-bsub_opts] executable [-exec_opts]
Common bsub options:
• -o <filename>
  -o out.%J
• -q <queue name>
  -q week
• -R "resource specification"
  -R "span[ptile=8]"
• -n <number of processes>
  used for parallel, MPI jobs
• -a <application specific esub>
  -a mvapich (used on MPI jobs)

Two methods to submit jobs:

bsub example: submit the executable job, myexe, to the week queue and redirect output to the file out.<jobID> (default is to mail output)
Method 1: Command Line
• bsub -q week -o out.%J myexe
Method 2: Create a file (details to follow) called, for example, myexe.bsub, and then submit that file. Note the redirect symbol, <
• bsub < myexe.bsub

Method 2 cont.

The file you submit contains all the bsub options you want, so for this example myexe.bsub will look like this:
#BSUB -q week
#BSUB -o out.%J
myexe
This is actually a shell script, so the top line could be the normal #!/bin/csh, etc., and you can run any commands you would like.
• if this doesn't mean anything to you then never mind :)
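Because the batch file is an ordinary shell script, it can carry a shebang line and extra commands alongside the #BSUB directives. The sketch below writes such a file and submits it only if bsub is present; the bash shebang and the added date command are illustrative choices, not requirements.

```shell
# Create the batch file; options use plain ASCII hyphens.
cat > myexe.bsub <<'EOF'
#!/bin/bash
#BSUB -q week
#BSUB -o out.%J
# Any ordinary shell commands can go here, for example:
date
myexe
EOF

# Submit only where LSF is available (i.e. on a cluster login node).
if command -v bsub >/dev/null 2>&1; then
    bsub < myexe.bsub
else
    echo "bsub not found; run this on a Topsail or Kure login node"
fi
```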

Parallel Job example

Batch Command Line Method
bsub -q week -o out.%J -n 64 -a mvapich mpirun myParallelExe

Batch File Method
bsub < myexe.bsub
where myexe.bsub will look like this
#BSUB -q week
#BSUB -o out.%J
#BSUB -a mvapich
#BSUB -n 64
mpirun myParallelExe
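When the process count is combined with the span[ptile=8] resource option listed earlier, the number of nodes a job occupies follows from a simple division. The sketch below computes it and generates the matching batch file; adding the -R line is an optional assumption on top of the slide's example, and the file name myParallel.bsub is hypothetical.

```shell
nprocs=64          # value passed to -n
ptile=8            # processes per node, as in -R "span[ptile=8]"

# Round up in case nprocs is not a multiple of ptile.
nodes=$(( (nprocs + ptile - 1) / ptile ))
echo "$nprocs processes at $ptile per node -> $nodes nodes"

# Generate the batch file with the chosen values filled in.
cat > myParallel.bsub <<EOF
#BSUB -q week
#BSUB -o out.%J
#BSUB -a mvapich
#BSUB -n $nprocs
#BSUB -R "span[ptile=$ptile]"
mpirun myParallelExe
EOF
cat myParallel.bsub
```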

Some Topsail Queues

Queue    Time Limit   Jobs/User   CPU/Job
int      2 hrs        128         ---
debug    2 hrs        64          ---
day      24 hrs       512         4-128
week     1 week       512         4-128
512cpu   4 days       512         32-512
128cpu   4 days       512         32-128
32cpu    2 days       512         4-32
chunk    4 days       512         Batch Jobs

• For access to the 512cpu queue the scalability must be demonstrated

Some Kure Queues

Queue    Time Limit   Jobs/User
int      10 hrs       2
debug    5 minutes    32
bigmem   1 week       8
week     1 week       -
patrons  none         -

Most users have a limit of 32 job slots unless they have been granted extra slots.
Queues are always subject to change and probably will change as Kure production ramps up. Use the bqueues command to find the current status.

Common Error 1

If job immediately dies, check err.%J file
err.%J file has error:
• Can't read MPIRUN_HOST
Problem: MPI environment settings were not correctly applied on compute node
Solution: Include mpirun in bsub command

Common Error 2

Job immediately dies after submission
err.%J file is blank
Problem: ssh passwords and keys were not correctly set up at initial login to Topsail
Solution:
• cd ~/.ssh/
• mv id_rsa id_rsa-orig
• mv id_rsa.pub id_rsa.pub-orig
• Log out of Topsail
• Log in to Topsail and accept all defaults

Interactive Jobs

To run long shell scripts on Topsail or Kure, use the int queue
bsub -q int -Ip /bin/bash
• This bsub command provides a prompt on a compute node
• Can run program or shell script interactively from compute node

Specialty Scripts

There are specialty scripts provided on Kure for user convenience.
Batch scripts
• bmatlab, bsas, bstata
X-window scripts
• xmatlab, xsas, xstata
Interactive scripts
• imatlab, istata

MPI/OpenMP Training

Courses are taught throughout the year by Research Computing
• http://learnit.unc.edu/workshops
• http://help.unc.edu/CCM3_008194
See schedule for next course
• MPI
• OpenMP

Further Help with Topsail/Kure

More details can be found on the Getting Started help documents:
• http://help.unc.edu/?id=6214 - Topsail
• http://help.unc.edu/ccm3_015682 - Kure
• http://keel.isis.unc.edu/wordpress/ - ON CAMPUS
For assistance with Topsail/Kure, please contact the ITS Research Computing group
• Email: [email protected]
• Phone: 919-962-HELP
• Submit help ticket at http://help.unc.edu
For immediate assistance, see manual pages
• man <command>