STG WW Blue Gene & HPC Benchmark Centers
Tutorial: Introduction to the Blue Gene Facility in Rochester, Minnesota
Carlos P Sosa
Chemistry and Life Sciences Group
Advanced Systems Software Development
Rochester, MN
Advanced Systems Software Dev.
Rochester Blue Gene Center Team
Cindy Mestad, Certified PMP®, STG WW Blue Gene & HPC Benchmark Centers
Steve M Westerbeck, System Administrator, STG WW Blue Gene & HPC Benchmark Centers
Chemistry and Life Sciences Applications Team
Carlos P Sosa, Chemistry and Life Sciences Applications, Advanced Systems Software Development
Preface
This tutorial provides a brief introduction to the environment of the IBM Blue Gene facilities in Rochester, Minnesota.
Customers should be mindful of their own security issues
The following points should be considered:
► Sharing of userids is not an accepted practice, in order to maintain proper authentication controls
► Additional encryption of data and source code on the filesystem is encouraged
► Housekeeping procedures on your assigned frontend node and filesystem are recommended
► Report any security breaches or concerns to the Rochester Blue Gene System Administration
► Changing permissions on user generated files for resource sharing is the responsibility of the individual user
► Filesystem cleanup at the end of the engagement is the responsibility of the customer
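The point about permissions on user-generated files can be sketched with standard POSIX tools; the file name below is hypothetical:

```shell
# Minimal sketch: restrict a user-generated file to owner read/write and
# group read before sharing, removing world access entirely.
touch results.dat          # hypothetical user-generated file
chmod 640 results.dat      # owner rw, group r, no access for others
ls -l results.dat
```

Group membership on the shared frontend node determines who can actually read the file, so choosing the right group is part of the same responsibility.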
1. Blue Gene Hardware Overview
Blue Gene System Modularity
How is BG/P Configured?
[Diagram: Service & Front End (Login) Nodes and File Servers on a 10GbE functional network; 1GbE service network; Storage Subsystem; software stack: SLES10, DB2, XLF, XLC/C++, GPFS, ESSL, TWS, LoadLeveler (LL)]
Hierarchy
Compute nodes are dedicated to running user applications, and almost nothing else – simple compute node kernel (CNK)
I/O nodes run Linux and provide a more complete range of OS services – files, sockets, process launch, debugging, and termination
Service node performs system management services (e.g., heartbeating, monitoring errors) – largely transparent to application/system software
Looking inside Blue Gene
BG/P Application-Specific Integrated Circuit (ASIC) Diagram
Virtual Node Mode
► All four cores run one MPI process each
► No threading
► Memory / MPI process = ¼ node memory
► MPI programming model
Dual Node Mode
► Two cores run one MPI process each
► Each process may spawn one thread on a core not used by the other process
► Memory / MPI process = ½ node memory
► Hybrid MPI/OpenMP programming model
SMP Node Mode
► One core runs one MPI process
► The process may spawn threads on each of the other cores
► Memory / MPI process = full node memory
► Hybrid MPI/OpenMP programming model
[Diagram: per-mode layout of Core 0–Core 3 within a node, showing MPI processes (P), memory (M), threads (T), and the shared memory address space for each job mode]
What’s new? Blue Gene/P Job Modes Allow Flexible Use of Node Memory
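Selecting one of the three job modes is typically done at launch time. The sketch below assumes the BG/P mpirun accepts a -mode flag with VN, DUAL, or SMP values; the partition name and executable are hypothetical, so the commands are only assembled and printed, not executed:

```shell
# Assemble (not execute) hypothetical mpirun invocations for each job mode.
# The -mode flag values and partition name R000 are assumptions; real runs
# require a Blue Gene/P front-end node.
for mode in VN DUAL SMP; do
  echo "mpirun -partition R000 -mode $mode -np 4 -exe ./a.out"
done > modes.txt
cat modes.txt
```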
Blue Gene Integrated Networks
– Torus
► Interconnects all compute nodes
► Used for point-to-point communication
– Collective
► Interconnects compute and I/O nodes
► One-to-all broadcast functionality
► Reduction operations functionality
– Barrier
► Compute and I/O nodes
► Low-latency barrier across system (< 1 usec for 72 racks)
► Used to synchronize timebases
– 10Gb Functional Ethernet
► I/O nodes only
– 1Gb Private Control Ethernet
► Provides JTAG, i2c, etc., access to hardware; accessible only from the Service Node system
► Boot, monitoring, and diagnostics
– Clock network
► Single clock source for all racks
HPC Software Tools for Blue Gene – IBM Software Stack
XL (FORTRAN, C, and C++) compilers
► Externals preserved
► Optimized for specific BG functions
► OpenMP support
LoadLeveler scheduler
► Same externals for job submission and system query functions
► Backfill scheduling to achieve maximum system utilization
GPFS parallel file system
► Provides high-performance file access, as in current pSeries and xSeries clusters
► Runs on I/O nodes and disk servers
ESSL/MASSV libraries
► Optimization library and intrinsics for better application performance
► Serial static library supporting 32-bit applications
► Callable from FORTRAN, C, and C++
MPI library
► Message passing interface library, based on MPICH2, tuned for the Blue Gene architecture
Other Software Support
Parallel file systems
► Lustre at LLNL, PVFS2 at ANL
Job schedulers
► SLURM at LLNL, Cobalt at ANL
► Altair PBS Pro, Platform LSF (for BG/L only)
► Condor HTC (porting for BG/P)
Parallel debuggers
► Etnus TotalView (for BG/L as of now, porting for BG/P)
► Allinea DDT and OPT (porting for BG/P)
Libraries
► FFT library – tuned functions by TU-Vienna
► VNI (porting for BG/P)
bcssh gateway:
► /codhome/userid directories on bcssh are limited to 300GB (shared, no quota)
– Used for transferring files in and out of the environment
Frontend node:
► /home directories have 10GB for all users, no quotas
► The /gpfs file system is 400GB in size; there are no quotas, as the file space is shared between all users on that frontend node
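Since /gpfs space is shared and uncapped, users are expected to track their own footprint. A minimal housekeeping sketch, using a scratch directory as a stand-in for a user's /gpfs work area (the real path would be assigned per user):

```shell
# Housekeeping sketch: measure space used under a work directory, then
# remove it at the end of the engagement. A temp dir stands in for the
# real per-user area under /gpfs (hypothetical path).
work=$(mktemp -d)
dd if=/dev/zero of="$work/scratch.bin" bs=1024 count=64 2>/dev/null
du -sh "$work"     # how much of the shared 400GB space this area occupies
rm -rf "$work"     # end-of-engagement cleanup is the customer's responsibility
```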
4. Compilers for Blue Gene
IBM Compilers
Compilers for Blue Gene are located on the front-end node in /opt/ibmcmp
Fortran:
► /opt/ibmcmp/xlf/bg/11.1/bin/bgxlf
► /opt/ibmcmp/xlf/bg/11.1/bin/bgxlf90
► /opt/ibmcmp/xlf/bg/11.1/bin/bgxlf95
C:
► /opt/ibmcmp/vac/bg/9.0/bin/bgxlc
C++:
► /opt/ibmcmp/vacpp/bg/9.0/bin/bgxlC
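Typical cross-compile invocations using the paths above might look like the following sketch. The optimization flags and source file names are assumptions, and the compilers exist only on the Blue Gene front-end node, so the commands are printed rather than executed:

```shell
# Sketch: invoking the XL cross-compilers for Blue Gene (paths from above).
# -O3/-qarch/-qtune flags and source names are illustrative assumptions.
XLF=/opt/ibmcmp/xlf/bg/11.1/bin/bgxlf90
XLC=/opt/ibmcmp/vac/bg/9.0/bin/bgxlc
echo "$XLF -O3 -o hello_f hello.f90"
echo "$XLC -O3 -o hello_c hello.c"
```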
GNU Compilers
The standard GNU compilers and libraries, which are also located on the frontend node, will NOT produce Blue Gene-compatible binary code. The standard GNU compilers can only be used for utility or frontend code development that your application may require.
GNU compilers (Fortran, C, C++) for Blue Gene are located in /opt/blrts-gnu/
Fortran:
► /opt/gnu/powerpc-bgp-linux-gfortran
C:
► /opt/gnu/powerpc-bgp-linux-gcc
C++:
► /opt/gnu/powerpc-bgp-linux-g++
Using the GNU compilers for Blue Gene is not recommended, as the IBM XL compilers offer significantly higher performance. The GNU compilers do, however, offer more flexible support for features such as inline assembly.
5. MPI on Blue Gene
MPI Library Location
The MPI implementation on Blue Gene is based on MPICH2 from Argonne National Laboratory
Include files mpi.h and mpif.h are at the following location:
► -I/bgsys/drivers/ppcfloor/comm/include
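A compile line pointing the XL C compiler at those headers might look like the sketch below. The /bgsys tree exists only on the Blue Gene front-end, and the source name is hypothetical, so the command is assembled and printed rather than run:

```shell
# Sketch: compile an MPI source against the Blue Gene MPICH2 headers.
# Link flags are omitted; only the include path comes from the tutorial.
MPI_INC=/bgsys/drivers/ppcfloor/comm/include
echo "/opt/ibmcmp/vac/bg/9.0/bin/bgxlc -I$MPI_INC -o mpi_hello mpi_hello.c"
```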
6 & 7. Compilation and Execution on Blue Gene
Copying Executables and Input
Step 1: Copy input files and executables to a shared directory
► Place data and executables in a directory under /gpfs
-np : Number of processors to be used; must fit in the available partition
-partition : A partition of the Blue Gene rack on which a given executable will execute, e.g., R000
-cwd : The current working directory; generally used to specify where any input and output files are located
-exe : The actual binary program which the user wishes to execute
mpirun section: specific to the application
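Putting the flags above together, a standalone mpirun invocation could look like this sketch; the process count, partition, working directory, and executable are all hypothetical, so the command is written to a script and displayed rather than executed:

```shell
# Sketch: assemble a standalone mpirun command from the flags described
# above (all values are illustrative placeholders).
cat > run.sh <<'EOF'
mpirun -np 32 -partition R000 -cwd /gpfs/userid/run1 -exe ./a.out
EOF
cat run.sh
```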
mpirun Standalone Versus mpirun in LL Environment
• Comparison between mpirun and the LoadLeveler llsubmit command
job_type and requirements tags must ALWAYS be specified as listed above
If the above command file listing were contained in a file named my_job.cmd, the job would then be submitted to the LoadLeveler queue using llsubmit my_job.cmd.
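A hedged sketch of such a my_job.cmd follows. The keyword spellings and values are assumptions based on typical LoadLeveler Blue Gene setups, not a verified listing from this facility; site-specific classes and requirements will differ:

```
# @ job_name     = my_job
# @ job_type     = bluegene
# @ requirements = (Machine == "$(host)")
# @ output       = my_job.$(jobid).out
# @ error        = my_job.$(jobid).err
# @ queue
```

The job_type and requirements lines correspond to the tags that, as noted above, must always be specified.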
Blue Gene – Monitoring Jobs: bgstatus
Monitor the status of jobs executing on Blue Gene
► $ bgstatus
Blue Gene – Monitoring Jobs: lljobq