Who uses a supercomputer anyway?
Jim Greer
Mar 26, 2015
Start of the Digital Computer: ENIAC
Built in 1943-45 at the Moore School of the University of Pennsylvania by John Mauchly and J. Presper Eckert for the war effort.
The Electronic Numerical Integrator And Computer (ENIAC) was one of the first general-purpose electronic digital computers.
Programming the Computer
Early computers were programmed by wiring cables and flipping switches.
MANIAC
The MANIAC had a memory of 1K 40-bit words.
Multiplication took a millisecond.
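For scale, converting those figures:

$$1024 \times 40\ \text{bits} = 40\,960\ \text{bits} = 5\,120\ \text{bytes} \approx 5\ \text{kB},$$

and one multiplication per millisecond is only $10^3$ multiplications per second.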
The Journal of Chemical Physics, Volume 21, Number 6, June 1953
Equation of State Calculations by Fast Computing Machines
Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, and Augusta H. Teller, Los Alamos Scientific Laboratory, Los Alamos, New Mexico, and Edward Teller, Department of Physics, University of Chicago, Chicago, Illinois
(Received March 6, 1953)
A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two-dimensional rigid-sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four-term virial coefficient expansion.
This algorithm by Metropolis et al., from 1953, has been cited as among the top 10 algorithms having the "greatest influence on the development and practice of science and engineering in the 20th century."
The machine is long out of date, but the methods and scientific approach remain relevant today.
Basis for simulated annealing
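As an illustration of the core idea, a minimal Metropolis sampler in modern Python (using a one-dimensional double-well potential as a stand-in, not the paper's 2D rigid-sphere system):

```python
import math
import random

def metropolis_sample(energy, x0, beta, steps, step_size=0.5):
    """Generic Metropolis sampler: propose a random move and accept it
    with probability min(1, exp(-beta * dE))."""
    x, e = x0, energy(x0)
    samples = []
    for _ in range(steps):
        x_new = x + random.uniform(-step_size, step_size)   # trial move
        e_new = energy(x_new)
        # Downhill moves are always accepted; uphill moves are accepted
        # with the Boltzmann probability exp(-beta * dE).
        if e_new <= e or random.random() < math.exp(-beta * (e_new - e)):
            x, e = x_new, e_new
        samples.append(x)
    return samples

# Example: a particle in a double-well potential at inverse temperature beta
samples = metropolis_sample(lambda x: (x**2 - 1.0)**2, x0=0.0,
                            beta=5.0, steps=100_000)
print(sum(samples) / len(samples))   # near 0 by symmetry of the two wells
```

Simulated annealing uses exactly this acceptance rule while gradually increasing beta, so the walker settles into low-energy configurations.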
Year  Computer      Power (W)  Performance (adds/s)  Memory (kB)  Price (USD)
1951  UNIVAC I        124,500             1,900           48      $1,000,000
1964  IBM S360         10,000           500,000           64      $1,000,000
1965  PDP-8               500           330,000            4      $16,000
1976  Cray-1           60,000       166,000,000       32,768      $4,000,000
1981  IBM PC              150           240,000          256      $3,000
1991  HP 9000             500        50,000,000       16,384      $7,400
2005  IBM notebook         20     1,000,000,000      512,000      $1,900
What do the machines look like nowadays?
http://news.bbc.co.uk/1/hi/technology/6128066.stm
When HPCx first came into service in 2002 it was one of the top 10 fastest supercomputers in the world. Despite an upgrade in 2004, it has since slipped to 59th place in the Top 500 supercomputers list.
At the moment the most potent machine in the world is IBM's Blue Gene/L at the Lawrence Livermore National Laboratory in California, where it is used to ensure that the US nuclear weapons stockpile remains safe and reliable.
"… only machine to have pushed through the 100 teraflop barrier, performs a staggering 280.6 trillion calculations per second …" (367 teraflops peak)
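The two figures are sustained Linpack performance versus theoretical peak; the ratio

$$\frac{280.6}{367} \approx 0.76$$

means the machine sustains roughly three quarters of its peak rate on the benchmark.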
Seymour Cray - plumber
Beowulf is perhaps the best-known type of parallel-processing cluster today. Donald Becker and Thomas Sterling designed the first Beowulf prototype in 1994 for NASA. It consisted of 16 486-DX4 processors connected by channel-bonded Ethernet. The next Beowulf clusters were built around 16 Pentium Pro (P6) 200-MHz processors connected by Fast Ethernet adapters and switches.
Beowulf is a design for high-performance parallel computing clusters on inexpensive personal computer hardware. Originally developed at NASA, Beowulf systems are now deployed worldwide, chiefly in support of scientific computing.
A Beowulf cluster is a group of usually identical PCs running a Free and Open Source Software (FOSS) Unix-like operating system, such as Linux or BSD. They are networked into a small TCP/IP LAN, and have libraries and programs installed which allow processing to be shared among them.
There is no particular piece of software that defines a cluster as a Beowulf. Commonly used parallel-processing libraries include MPI (Message Passing Interface) and PVM (Parallel Virtual Machine). Both of these permit the programmer to divide a task among a group of networked computers and collect the results of the processing. - Wikipedia
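As a small sketch of that divide-and-collect pattern, here is an MPI example using the mpi4py package (the script estimates pi by splitting a numerical integral across processes; the file name in the run command is just for illustration, e.g. `mpiexec -n 4 python pi_mpi.py`):

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # this process's ID within the cluster job
size = comm.Get_size()      # total number of cooperating processes

# Each process integrates its own interleaved slice of f(x) = 4/(1+x^2)
# over [0,1]; the total approximates pi.
n = 1_000_000
local_sum = 0.0
for i in range(rank, n, size):        # divide the task among processes
    x = (i + 0.5) / n
    local_sum += 4.0 / (1.0 + x * x)

# Collect the partial results back onto rank 0
pi = comm.reduce(local_sum / n, op=MPI.SUM, root=0)
if rank == 0:
    print(f"pi ~ {pi}")
```

The same program runs unchanged on one laptop or on hundreds of cluster nodes; only the `-n` launch argument changes.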
From PlayStation to Supercomputer for $50,000
New York Times, 2003-05-26 | By John Markoff
As perhaps the clearest evidence yet of the computing power of sophisticated but inexpensive video-game consoles, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign has assembled a supercomputer from an army of Sony PlayStation 2's. The resulting system, with components purchased at retail prices, cost a little more than $50,000. The center's researchers believe the system may be capable of a half trillion operations a second, well within the definition of supercomputer, although it may not rank among the world's 500 fastest supercomputers.
Interconnect: lowest measured latency (smaller is better)
PathScale InfiniPath: 1.31 microseconds
Cray RapidArray: 1.63 microseconds
Quadrics: 4.89 microseconds
NUMAlink: 5.79 microseconds
Myrinet: 19.00 microseconds
Gigabit Ethernet: 42.23 microseconds
Fast Ethernet: 603.15 microseconds
Source: HPC Challenge, November 2005.
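Figures like these are typically obtained with a ping-pong microbenchmark: two processes bounce a tiny message back and forth and halve the round-trip time. A minimal sketch, again assuming mpi4py (message size and repetition count are arbitrary choices; run with exactly 2 processes in mind):

```python
from mpi4py import MPI
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
reps = 10_000
msg = bytearray(1)          # 1-byte message, so timing is dominated by latency

comm.Barrier()
t0 = time.perf_counter()
if rank < 2:                # ping-pong between ranks 0 and 1 only
    for _ in range(reps):
        if rank == 0:
            comm.Send(msg, dest=1)
            comm.Recv(msg, source=1)
        else:
            comm.Recv(msg, source=0)
            comm.Send(msg, dest=0)
t1 = time.perf_counter()

if rank == 0:
    # One iteration is a full round trip (two messages),
    # so one-way latency is half the per-iteration time.
    print(f"one-way latency ~ {(t1 - t0) / reps / 2 * 1e6:.2f} microseconds")
```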
Latency and Bandwidth
Commodity processors → the interconnect "makes" the supercomputer
TOP 10 Sites for June 2006
Rank  Site               Country        System                 Family   # Processors
 1    DOE/NNSA/LLNL      United States  Blue Gene/L            IBM      131,072
 2    IBM Watson         United States  Blue Gene/L            IBM       40,960
 3    DOE/NNSA/LLNL      United States  ASC Purple (p-series)  IBM       12,208
 4    NASA/Ames          United States  Altix                  SGI       10,160
 5    CEA                France         Tera-10 (SMP cluster)  Bull       8,704
 6    Sandia Nat. Lab.   United States  PowerEdge              Dell       9,024
 7    Tokyo Inst. Tech.  Japan          Sun Fire               NEC/Sun   10,368
 8    FZ Juelich         Germany        Blue Gene/L            IBM       16,384
 9    Sandia Nat. Lab.   United States  XT3                    Cray      10,880
10    Earth Simulator    Japan          NEC Vector             NEC        5,120
www.top500.org
Scientific Computing
Newton’s equation (with damping)
Diffusion or heat equation
Fourier transform
Schrödinger equation
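In their standard textbook forms (the formulas here are reconstructions, not taken from the slide), these are:

$$m\,\ddot{x} = F(x,t) - \gamma\,\dot{x} \qquad \text{(Newton, with damping)}$$

$$\frac{\partial u}{\partial t} = D\,\nabla^2 u \qquad \text{(diffusion / heat)}$$

$$\hat{f}(k) = \int_{-\infty}^{\infty} f(x)\,e^{-ikx}\,dx \qquad \text{(Fourier transform)}$$

$$i\hbar\,\frac{\partial \psi}{\partial t} = \hat{H}\,\psi \qquad \text{(Schrödinger)}$$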
Computational Needs: Biology & Bioinformatics
• Genome assembly: >10 TeraFlops sustained, to keep up with expected sequencing rates; 300 TB of trace files per genome
• Protein structure prediction: >100 TeraFlops per protein set in one microbial genome; Petabytes
• Classical molecular dynamics: 100 TeraFlops per DNA-protein interaction; 10s of Petabytes
• First-principles molecular dynamics: 1 PetaFlops per reaction in an enzyme active site; 100s of Petabytes
• Simulations of biological networks: >1 TeraFlops for simple correlation analyses of small biological networks; 1000s of Petabytes
Grand Challenges
Grand Challenges are the leading problems in science and engineering that can be solved only with the help of the fastest, most powerful computers. They address issues of great societal impact, such as biomedicine, the environment, economic competitiveness, and national defense.
Examples of Grand Challenge problems
Forecasting weather, predicting global climate change, and modeling pollution dispersion
Determining molecular, atomic, and nuclear structures
Understanding the structure of biological macromolecules
Improving communication networks for research and education
Developing and understanding the nature of new materials
Building more energy-efficient cars and airplanes
Understanding how galaxies are formed
Designing new pharmaceuticals
Blood flow in the heart, computed from the Navier-Stokes equations. NIH
Brain Chemistry: a bilayer sandwich of lipids, with nitrogen (blue) and phosphorus (gold) heads facing outward on both sides of filamentary tails (gray). Patti Reggio, University of North Carolina, Greensboro; Diane Lynch, Kennesaw State University
This thin-slice snapshot through the simulation volume, about 3 million light years thick by 4.5 billion light years on each side, shows the filamental structure of dark-matter clusters. Brightness corresponds to density.
Paul Bode and Jeremiah Ostriker of Princeton University
The infant universe hatching from its structureless shell. This map represents Edmund Bertschinger's simulations on the CRAY T3D at Pittsburgh. This map shows negative (blue) and positive (red) fluctuations of 0.0002 degrees K. The simulation assumed a mixed hot and cold dark matter model with 5 eV neutrino mass.
Von Mises stress on the F-16 structure (increasing from dark blue to light blue to red) during an aeroelastic simulation.
Mach contours and streamlines for an aeroelastic simulation at Mach 0.9.
Kelvin K. Droegemeier, University of Oklahoma at Norman.
Animation of a simulated earthquake in the San Fernando Valley. Color depicts the peak magnitude of ground displacement. The simulation covered a 54 × 33 kilometer area, superimposed here on a satellite view of topography, to a depth of 15 km. Los Angeles can be seen in the southeast corner.
Carnegie Mellon University
Oxygen and Temperature in Turbine Combustion
These snapshots from a simulation of turbine combustion show how oxygen decreases (previous slide) and temperature increases (this slide) as markers of combustion in the turbine flow. DOE and Westinghouse
Adsorption of molecules at self-assembled monolayers (SAMs) by theoretical/simulation modelling
[Flowchart: a parallel iterative eigensolver. From an initial guess, each parallel task performs matrix generation, matrix diagonalization, and vector screening; the screened vectors are merged and tested for convergence. If not converged, vector expansion feeds new vectors back into matrix generation; if converged, stop.]
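As a hedged sketch of that iterate-until-convergence structure, with simple power iteration standing in for the matrix generation, diagonalization, screening, and expansion stages (and without the parallelism):

```python
import numpy as np

def iterate_to_convergence(A, tol=1e-10, max_iter=1000):
    """Power iteration: repeat (apply matrix -> normalise -> test
    convergence) until the dominant eigenvector stops changing."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(A.shape[0])     # the "initial guess"
    v /= np.linalg.norm(v)
    for _ in range(max_iter):
        w = A @ v                           # apply the matrix
        w /= np.linalg.norm(w)
        if np.linalg.norm(w - v) < tol:     # the "convergence?" test
            break
        v = w                               # feed the vector back in
    # (assumes the dominant eigenvalue is positive, so the sign is stable)
    return w, w @ A @ w                     # eigenvector, Rayleigh quotient

A = np.diag([3.0, 2.0, 1.0])
vec, val = iterate_to_convergence(A)
print(val)   # ~3.0, the dominant eigenvalue
```

In a real parallel eigensolver, each stage of the loop would be distributed across processors, which is exactly what the flowchart's three side-by-side pipelines depict.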
Parallel computing allows you to do calculations in new ways …
Conclusions …
• Lotsa uses for supercomputers
• Commodity processors, but a lot of ‘em
• Very fast customised interconnects
• Single winning architecture not decided
• Power remains a big concern: back to plumbing