
Jan 11, 2016


NERSC Users’ Group, Oct. 3, 2005

NUG Training 10/3/2005

• Logistics
  – Morning only: coffee and snacks
  – Additional drinks $0.50 in refrigerator in small kitchen area; can easily go out to get coffee during 15-minute breaks
  – Parking garage vouchers at reception desk on second floor

• Lunch
  – On your own, but can go out in groups


Today’s Presentations

• Jacquard Introduction

• Jacquard Nodes and CPUs

• High Speed Interconnect and MVAPICH

• Compiling

• Running Jobs

• Software overview

• Hands-on

• Machine room tour


Overview of Jacquard

Richard Gerber

NERSC User [email protected]

NERSC User’s Group

October 3, 2005

Oakland, CA


Presentation Overview

• Cluster overview

• Connecting

• Nodes and processors

• Node interconnect

• Disks and file systems

• Compilers

• Operating system

• Message passing interface

• Batch system and queues

• Benchmarks and application performance


Status

• Status Update

Jacquard has been experiencing node failures. While this problem is being worked on we are making Jacquard available to users in a degraded mode. About 200 computational nodes are available, one login node, and about half of the storage nodes that support the GPFS file system. Expect lower than usual I/O performance. Because we may still experience some instability, users will not be charged until Jacquard is returned to full production.


Introduction to Jacquard

• Named in honor of inventor Joseph Marie Jacquard, whose loom was the first machine to use punch cards to control a sequence of operations.

• Jacquard is a 640-CPU Opteron cluster running a Linux operating system.

• Integrated, delivered, and supported by Linux Networx

• Jacquard has 320 dual-processor nodes available for scientific calculations. (Not dual-core processors.)

• The nodes are interconnected with a high-speed InfiniBand network.

• Global shared file storage is provided by a GPFS file system.


Jacquard

http://www.nersc.gov/nusers/resources/jacquard/


Jacquard Characteristics

Processor type                                  Opteron 2.2 GHz
Processor theoretical peak                      4.4 GFlops/sec
Processors per node                             2
Number of application nodes/processors          320 / 640
System theoretical peak (computational nodes)   2.8 TFlops/sec
Physical memory per node (usable)               6 (3-5) GBytes
Number of spare application nodes               4
Number of login nodes                           4
Node interconnect                               InfiniBand
Global shared disk                              GPFS: 30 TBytes usable
Batch system                                    PBS Pro
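
The two peak figures in the table follow from the clock rate and the processor count. A short check, assuming each Opteron completes two floating-point operations per clock cycle (an assumption not stated on the slide, though it is consistent with 4.4 GFlops/sec at 2.2 GHz):

```latex
\begin{align*}
P_{\text{proc}}   &= 2.2\ \text{GHz} \times 2\ \text{flops/cycle} = 4.4\ \text{GFlops/sec} \\
P_{\text{system}} &= 640 \times 4.4\ \text{GFlops/sec} = 2816\ \text{GFlops/sec} \approx 2.8\ \text{TFlops/sec}
\end{align*}
```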


Jacquard’s Role

• Jacquard is intended for codes that do not scale well on Seaborg.

• Hope to relieve Seaborg backlog.

• Typical job expected to be in the concurrency range of 16-64 nodes.

• Applications typically run 4X Seaborg speed. Jobs that cannot scale to large parallel concurrency should benefit from faster CPUs.


Connecting to Jacquard

• Interactive shell access is via SSH.
  ssh [-l login_name] jacquard.nersc.gov

• Four login nodes for compiling and launching parallel jobs. Parallel jobs do not run on login nodes.

• Globus file transfer utilities can be used.

• Outbound network services are open (e.g., ftp).

• Use hsi for interfacing with HPSS mass storage.


Nodes and processors

• Each Jacquard node has 2 processors that share 6 GB of memory. OS/network/GPFS uses ~1 (?) GB of that.

• Each processor is a 2.2 GHz AMD Opteron.

• Processor theoretical peak: 4.4 GFlops/sec

• Opteron is an advanced 64-bit processor that is becoming widely used in HPC.
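
The memory figures above bound how large a per-task working set can be. A minimal sketch, assuming one MPI task per processor and taking the slide's rough ~1 GB system overhead at face value; the function names are illustrative only:

```python
# Rough per-task memory budget on a 2-processor node.
# Figures come from the slide; the ~1 GB overhead is marked uncertain there.

NODE_MEMORY_GB = 6.0      # physical memory per node
SYSTEM_OVERHEAD_GB = 1.0  # OS/network/GPFS share (approximate)
TASKS_PER_NODE = 2        # assumption: one MPI task per processor


def memory_per_task_gb(node_gb=NODE_MEMORY_GB,
                       overhead_gb=SYSTEM_OVERHEAD_GB,
                       tasks=TASKS_PER_NODE):
    """Return the memory (in GB) left for each task after system overhead."""
    if tasks < 1:
        raise ValueError("tasks must be at least 1")
    return (node_gb - overhead_gb) / tasks


def max_doubles(gb):
    """Number of 8-byte doubles that fit in `gb` gigabytes (1 GB = 2**30 bytes)."""
    return int(gb * 2**30) // 8


if __name__ == "__main__":
    per_task = memory_per_task_gb()
    print(f"about {per_task:.1f} GB per task")
    print(f"room for roughly {max_doubles(per_task):,} doubles")
```

Running a single task per node roughly doubles the per-task budget, at the cost of leaving one processor idle.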


Node Interconnect

• Nodes are connected by an InfiniBand high speed network from Mellanox.

• Adapters and switches from Mellanox

• Low latency: ~7 µs vs. ~25 µs on Seaborg

• Bandwidth ~ 2X Seaborg

• “Fat tree”
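
Latency matters most for small messages and bandwidth for large ones. A minimal sketch of the usual latency-plus-bandwidth cost model, using the latencies quoted above; the absolute bandwidth values are placeholders chosen only to preserve the ~2X ratio, not measured figures:

```python
# Simple point-to-point message cost model: time = latency + bytes / bandwidth.


def transfer_time_us(nbytes, latency_us, bandwidth_mb_per_s):
    """Estimated time in microseconds to send `nbytes` bytes."""
    # 1 MB/s is 1 byte per microsecond, so bytes / (MB/s) gives microseconds.
    return latency_us + nbytes / bandwidth_mb_per_s


# Latencies are from the slide; both bandwidths are hypothetical.
JACQUARD = {"latency_us": 7.0, "bandwidth_mb_per_s": 600.0}
SEABORG = {"latency_us": 25.0, "bandwidth_mb_per_s": 300.0}


def speedup(nbytes):
    """Ratio of modeled Seaborg time to modeled Jacquard time."""
    return transfer_time_us(nbytes, **SEABORG) / transfer_time_us(nbytes, **JACQUARD)


if __name__ == "__main__":
    for size in (8, 1024, 1024 * 1024):
        print(f"{size:>8} bytes: modeled ratio {speedup(size):.2f}")
```

Under this model the advantage approaches the latency ratio (25/7) for tiny messages and the bandwidth ratio (2) for very large ones.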


Disks and file systems

• Home, scratch, and project directories are in GPFS, a global file system from IBM.

• $SCRATCH environment variable is defined to contain path to a user’s personal scratch space.

• 30 TBytes total usable disk
  – 5 GByte space, 15,000 inode quota in $HOME per user
  – 50 GByte space, 50,000 inode quota in $SCRATCH per user

• $SCRATCH gives better performance, but may be purged if space is needed
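
Because the quotas limit inode counts as well as bytes, a directory tree with many small files can hit its limit early. A minimal sketch that tallies both under $HOME or $SCRATCH; it assumes 1 GByte = 2**30 bytes and sums apparent file sizes, so its numbers may differ from the file system's own quota accounting:

```python
import os

# Per-user limits quoted on the slide: (bytes, inodes).
QUOTAS = {
    "HOME": (5 * 2**30, 15_000),
    "SCRATCH": (50 * 2**30, 50_000),
}


def usage(path):
    """Return (total_bytes, inode_count) for everything under `path`."""
    total_bytes = 0
    inodes = 0
    for root, dirs, files in os.walk(path):
        inodes += len(dirs) + len(files)
        for name in files:
            try:
                total_bytes += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total_bytes, inodes


def report(var):
    """Compare usage under $HOME or $SCRATCH with the slide's quota."""
    path = os.environ.get(var)
    if not path or not os.path.isdir(path):
        return f"${var} is not set or is not a directory"
    used_bytes, used_inodes = usage(path)
    max_bytes, max_inodes = QUOTAS[var]
    return (f"${var}: {used_bytes / 2**30:.2f} of {max_bytes / 2**30:.0f} GB, "
            f"{used_inodes} of {max_inodes} inodes")
```

Calling `print(report("SCRATCH"))` on a system where `$SCRATCH` is defined prints one summary line; walking a large tree can take a while.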


Project directories

• Project directories are coming (some are already here).

• Designed to facilitate group sharing of code and data.

• Can be repo- or arbitrary group-based

• /home/projects/group
  – For sharing group code

• /scratch/projects/group
  – For sharing group data and binaries

• Quotas TBD


Compilers

• High performance Fortran/C/C++ compilers from Pathscale.

• Fortran compiler: pathf90

• C/C++ compiler: pathcc, pathCC

• MPI compiler scripts use Pathscale compilers “underneath” and have all MPI -I, -L, -l options already defined:
  – mpif90
  – mpicc
  – mpicxx
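
Since the wrapper scripts already carry the MPI include and library options, a compile line only needs the wrapper, an optimization level, and the source file. A minimal sketch that picks a wrapper by file extension and builds the command; the extension mapping and the function name are assumptions made for illustration, and only the generic -O, -g, and -o flags are used:

```python
import os

# Wrapper scripts named on the slide, keyed by source-file extension.
# The extension mapping itself is an assumption made for this sketch.
MPI_WRAPPERS = {
    ".f90": "mpif90",
    ".f": "mpif90",
    ".c": "mpicc",
    ".cpp": "mpicxx",
    ".cc": "mpicxx",
}


def compile_command(source, opt_level=3, debug=False):
    """Build the argument list for compiling one MPI source file."""
    stem, ext = os.path.splitext(source)
    try:
        wrapper = MPI_WRAPPERS[ext]
    except KeyError:
        raise ValueError(f"no MPI wrapper known for {source!r}")
    cmd = [wrapper, f"-O{opt_level}"]
    if debug:
        cmd.append("-g")
    # Name the executable after the source file, without its directory.
    cmd += ["-o", os.path.basename(stem), source]
    return cmd
```

For example, `" ".join(compile_command("hello.f90"))` yields `mpif90 -O3 -o hello hello.f90`, which could be passed to `subprocess.run` on a machine where the wrappers are installed.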


Operating system

• Jacquard is running Novell SUSE Linux Enterprise Server 9

• Has all the “usual” Linux tools and utilities (gcc, GNU utilities, etc.)

• It was the first “enterprise-ready” Linux for Opteron.

• Novell (indirectly) provides support and product lifetime assurances (5 yrs).


Message passing interface

• MPI implementation is known as “MVAPICH.”

• Based on MPICH from Argonne with additions and modifications from LBNL for InfiniBand. Developed and supported ultimately by Mellanox/Ohio State group.

• Provides standard MPI and MPI/IO functionality.


Batch system

• Batch scheduler is PBS Pro from Altair

• Scripts not much different from LoadLeveler: #@ -> #PBS

• Queues for interactive, debug, premium charge, regular charge, low charge.

• Configured to run jobs using 1-128 nodes (1-256 CPUs).
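
A batch script ties these limits together: the node count determines the task count at 2 CPUs per node. A minimal sketch that generates a simple script and enforces the 1-128 node range; the `nodes=...:ppn=...` resource syntax, the `mpirun -np` launch line, and the queue name used in the example are assumptions that may not match Jacquard's actual configuration:

```python
CPUS_PER_NODE = 2     # dual-processor nodes
MAX_NODES = 128       # largest job the queues are configured for


def pbs_script(job_name, queue, nodes, walltime, executable):
    """Return the text of a simple PBS batch script for an MPI job."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    ntasks = nodes * CPUS_PER_NODE
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",
        f"#PBS -q {queue}",
        # Resource-request syntax varies between PBS versions and sites.
        f"#PBS -l nodes={nodes}:ppn={CPUS_PER_NODE},walltime={walltime}",
        # PBS sets PBS_O_WORKDIR to the directory the job was submitted from.
        "cd $PBS_O_WORKDIR",
        # The launcher name and options are assumptions; check the site docs.
        f"mpirun -np {ntasks} {executable}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `pbs_script("ft", "debug", 8, "00:30:00", "./ft")` requests 8 nodes and launches 16 tasks. Where a LoadLeveler script used `#@` keyword lines, the equivalent requests appear here as `#PBS` option lines.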


Performance and benchmarks

• Applications typically run about 4x Seaborg speed, some more, some less

• NAS Parallel Benchmarks (64-way) are ~ 3.5-7 times Seaborg

• Three applications the author has examined (“-O3 out of the box”):
  – CAM 3.0 (climate): 3.5 x Seaborg
  – GTC (fusion): 4.1 x Seaborg
  – Paratec (materials): 2.9 x Seaborg


User Experiences

• Positives
  – Shorter wait in the queues
  – Linux; many codes already run under Linux
  – Good performance for 16-48 node jobs; some codes scale better than on Seaborg
  – Opteron is fast


User Experiences

• Negatives
  – Fortran compiler is not common, so some porting issues.
  – Small disk quotas.
  – Unstable at times.
  – Job launch doesn’t work well (can’t pass ENV variables).
  – Charge factor.
  – Big endian I/O.
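
The “big endian I/O” item presumably refers to binary files written on Seaborg, whose IBM POWER processors store numbers big-endian, while the Opteron is little-endian. A minimal sketch of handling such data portably by stating the byte order explicitly; it assumes a raw stream of 8-byte floats with no Fortran record markers:

```python
import struct


def read_big_endian_doubles(data):
    """Decode raw bytes holding big-endian IEEE 754 doubles into floats."""
    if len(data) % 8:
        raise ValueError("length is not a multiple of 8 bytes")
    count = len(data) // 8
    # ">" forces big-endian regardless of the machine running the code.
    return list(struct.unpack(f">{count}d", data))


def write_big_endian_doubles(values):
    """Encode floats as big-endian doubles, matching the layout read above."""
    return struct.pack(f">{len(values)}d", *values)
```

Reading the same bytes with the machine's native order on a little-endian system would silently produce wrong values rather than an error, which is why the order is spelled out.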


Today’s Presentations

• Jacquard Introduction

• Jacquard Nodes and CPUs

• High Speed Interconnect and MVAPICH

• Compiling

• Running Jobs

• Software overview

• Hands-on

• Machine room tour


Hands On

• We have a special queue “blah” with 64 nodes reserved.

• You may work on your own code.

• Try building and running test code
  – Copy to your directory and untar /scratch/scratchdirs/ragerber/NUG.tar
  – 3 NPB parallel benchmarks: ft, mg, sp
  – Configure in config/make.def
  – make ft CLASS=C NPROCS=16
  – Sample PBS scripts in run/
  – Try new MPI version, opt levels, -g, IPM