BLUE GENE Sunitha M. Jenarius
What is Blue Gene
A massively parallel supercomputer using tens of thousands of embedded PowerPC processors supporting a large memory space
With standard compilers and a message passing environment
Why the name “Blue Gene”?
“Blue”: The corporate color of IBM
“Gene”: The intended use of the Blue Gene clusters
– Computational biology, specifically protein folding
History
Dec ’99: IBM Research announced a US $100M effort to build a petaflop-scale supercomputer.
Two goals of the Blue Gene project:
– Massively parallel machine architecture and software
– Bio-molecular simulation, advanced by orders of magnitude
Nov ’01: Partnership with Lawrence Livermore National Laboratory (LLNL)
and this resulted in …
Results
Linpack Top 500 Supercomputers
Blue Gene Projects
Four Blue Gene projects:
– Blue Gene/L
– Blue Gene/C
– Blue Gene/P
– Blue Gene/Q
Blue Gene/L
The first computer in the Blue Gene series
On Sept. 29, 2004, IBM announced that a Blue Gene/L prototype had overtaken NEC's Earth Simulator as the fastest computer on the Linpack benchmark
Final configuration was launched in October 2005
Blue Gene/L - Unsurpassed Performance
Designed to deliver the most performance per kilowatt of power consumed
Theoretical peak performance of 360 TFLOPS
Final Configuration (Oct. ‘05) scores over 280 TFLOPS sustained on the Linpack benchmark.
Nov 14, ‘06, at Supercomputing 2006, Blue Gene/L was awarded the winning prize in all HPC Challenge Classes of awards.
Blue Gene/L Architecture
Can be scaled up to 65,536 compute or I/O nodes, with 131,072 processors
Each node is a single ASIC with associated DRAM memory chips
Each ASIC has two 700 MHz IBM PowerPC processors
PowerPC processors
– Low-frequency, low-power embedded processors, superior in performance per watt to today's high-frequency, high-power microprocessors by a factor of 2 or more
Blue Gene/L Architecture contd…
– Double-pipeline, double-precision Floating Point Unit
– A cache sub-system with built-in DRAM controller
Node CPUs are not cache coherent with one another
FPUs and CPUs are designed for low power consumption
– Using transistors with low leakage current
– Local clock gating
– Putting the FPU or CPU/FPU pair to sleep
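The peak figure quoted earlier can be roughly reproduced from the node counts and clock rate on these slides. The sketch below is back-of-the-envelope arithmetic only; it assumes (not stated on the slides) that each processor's double-pipeline FPU retires 4 floating point operations per cycle, i.e. two fused multiply-adds. The function names are hypothetical.

```python
# Rough peak-performance estimate for the full Blue Gene/L configuration.
NODES = 65_536            # maximum node count from the slides
CPUS_PER_NODE = 2         # two PowerPC processors per ASIC
CLOCK_HZ = 700e6          # 700 MHz
FLOPS_PER_CYCLE = 4       # ASSUMPTION: 2 FPU pipelines x 1 fused multiply-add each


def peak_tflops(nodes=NODES, cpus=CPUS_PER_NODE,
                clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in TFLOPS: every FPU pipeline busy on every cycle."""
    return nodes * cpus * clock_hz * flops_per_cycle / 1e12


def linpack_efficiency(sustained_tflops=280.0, quoted_peak_tflops=360.0):
    """Fraction of the quoted peak that the Linpack run sustained."""
    return sustained_tflops / quoted_peak_tflops


if __name__ == "__main__":
    print(f"estimated peak: {peak_tflops():.1f} TFLOPS")
    print(f"Linpack efficiency vs. quoted peak: {linpack_efficiency():.1%}")
```

Under that assumption the estimate comes out near 367 TFLOPS, in line with the roughly 360 TFLOPS quoted, and the 280 TFLOPS Linpack result corresponds to a little under 78% of the quoted peak.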
Blue Gene/L Architecture contd…
System Overview (figure; label: 1024 nodes)
Blue Gene/L Architecture contd…
1 rack holds 1024 nodes or 2048 processors
Nodes optimized for low power consumption
ASIC based on system-on-a-chip technology
– Large numbers of low-power system-on-a-chip nodes allow it to outperform commodity clusters while saving on power
– Aggressive packaging of processors, memory and interconnect
– Power efficient and space efficient
– Allows for latencies and bandwidths that are significantly better than those for nodes typically used in ASC scale supercomputers
Blue Gene/L Networks
Each node is attached to 3 main parallel communication networks
– 3D Torus network: peer-to-peer between compute nodes
– Collective network: collective and global communication
– Ethernet network: I/O and management (such as access to any node for configuration, booting and diagnostics)
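To make the 3D torus idea concrete, the sketch below computes a node's six nearest neighbours and the minimum hop count between two nodes when every dimension wraps around. The 64 x 32 x 32 shape is an assumption chosen so that the product equals 65,536 nodes; the function names are hypothetical, and this is not the machine's actual routing logic.

```python
# Illustrative 3D torus addressing; not the real Blue Gene/L router.
TORUS_DIMS = (64, 32, 32)   # ASSUMPTION: one shape whose product is 65,536


def torus_neighbors(coord, dims=TORUS_DIMS):
    """Return the six neighbours (+/-1 in x, y, z) with wraparound."""
    neighbors = []
    for axis in range(3):
        for step in (1, -1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # wrap at the edges
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, dims=TORUS_DIMS):
    """Minimum hops between a and b, going either way around each ring."""
    return sum(min((p - q) % n, (q - p) % n) for p, q, n in zip(a, b, dims))


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0)))
    print(torus_hops((0, 0, 0), (63, 0, 0)))    # neighbours via the wraparound link
    print(torus_hops((0, 0, 0), (32, 16, 16)))  # farthest node in this shape
```

The wraparound links are what distinguish a torus from a plain mesh: the worst-case distance along each dimension is half the dimension size rather than the full size.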
Blue Gene/L System Software
System software supports efficient execution of parallel applications
Compiler support for the double FPU (DFPU) in C, C++ and Fortran
Compute nodes use a minimal operating system called “BlueGene/L compute node kernel”
– A lightweight, single-user operating system
– Supports execution of a single dual-threaded application compute process
– Kernel provides a single, static virtual address space to one running compute process
– Because of the single-process nature, no context switching is required
Blue Gene/L System Software contd…
To allow multiple programs to run concurrently
– Blue Gene/L system can be partitioned into electronically isolated sets of nodes
– The number of nodes in a partition must be a positive integer power of 2
– To run a program, reserve a partition
– No other program can use the partition until the current program finishes
– With so many nodes, component failures are inevitable. The system is able to electrically isolate faulty hardware to allow the machine to continue to run
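The partition rules above can be sketched in a few lines. This is a toy model, not the Blue Gene/L control system: the `Partition` class and its methods are hypothetical, and "positive integer power of 2" is read here as 2^k with k >= 1.

```python
# Toy model of exclusive, power-of-two partitions (hypothetical names).
def is_valid_partition_size(num_nodes):
    """True if num_nodes is 2**k for some integer k >= 1."""
    return num_nodes >= 2 and (num_nodes & (num_nodes - 1)) == 0


class Partition:
    def __init__(self, num_nodes):
        if not is_valid_partition_size(num_nodes):
            raise ValueError(f"{num_nodes} is not a positive integer power of 2")
        self.num_nodes = num_nodes
        self.owner = None  # name of the program holding the reservation

    def reserve(self, program):
        """Give one program exclusive use of the partition."""
        if self.owner is not None:
            raise RuntimeError(f"partition busy with {self.owner!r}")
        self.owner = program

    def release(self):
        """Free the partition once the current program is done."""
        self.owner = None


if __name__ == "__main__":
    part = Partition(512)
    part.reserve("protein_folding_job")
    try:
        part.reserve("another_job")      # rejected while the first job runs
    except RuntimeError as err:
        print(err)
    part.release()
    part.reserve("another_job")          # succeeds after release
```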
Blue Gene/L System Software contd…
Parallel programming model
– Message passing: supported through an implementation of MPI
– Only a subset of POSIX calls is supported
– Green threads are also used to simulate local concurrency
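As an illustration of what "green threads" means in general, the sketch below runs several tasks cooperatively in user space, with no help from the operating system scheduler. It uses Python generators as stand-ins for green threads and a simple round-robin loop; it is a generic teaching example, not how Blue Gene/L implements them, and all names are hypothetical.

```python
# Cooperative (green-thread style) scheduling with generators.
from collections import deque


def worker(name, steps, log):
    """A task that records its progress and yields control after each step."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler voluntarily


def run_round_robin(tasks):
    """Run tasks in turn until all have finished; no kernel threads involved."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished; drop it from the queue
        queue.append(task)      # otherwise, schedule it again


if __name__ == "__main__":
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)
```

Because switching happens only at explicit yield points, this style fits a kernel that does no context switching of its own, which is the setting the slides describe for the compute node kernel.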
Blue Gene/C
Sister project to Blue Gene/L
Renamed to Cyclops64
Massively parallel, supercomputer-on-a-chip cellular architecture
Cellular architecture gives the programmer the ability to run large numbers of concurrent threads within a single processor.
Blue Gene/P
Architecturally similar to Blue Gene/L
Expected to operate around one petaflop
Expected around 2008
Blue Gene/Q
Last known supercomputer in the Blue Gene series
Expected to reach 3-10 petaflops
Resources
Wikipedia.org
IBM website (www.03.ibm.com/servers/deepcomputing/bluegene.html)
www.supercomp.org/sc2002/paperpdfs/pap.pap207.pdf