Jan 01, 2016
1
Dr. Virendrakumar (Virendra) C. Bhavsar
Professor
Faculty of Computer Science
University of New Brunswick (UNB)
Fredericton, Canada
Supercomputing
2
Outline
• Definitions
• Applications
• Hardware
• Software
• Current Status
• University of New Brunswick
• Future
3
Computing
Supercomputing
- A supercomputer is a computer that is at the frontline of current processing capacity, particularly speed of calculation.
High Performance Computing (HPC)/High Productivity Computing
- Supercomputing is a subset of HPC
Parallel Computing
- Many calculations are carried out simultaneously
10**6 = Million, 10**9 = Billion, 10**12 = Trillion
Definitions
4
10**10 Neurons, 10**4 Fan-in
- Wires much slower than chips
- Millions of times more volume
10**14 Inputs (connection strengths)
10**12 Connection strengths can affect processing in 5 msec
Lower bound on the computational power of brain
~ 10**10 neurons, 10 spikes/sec, 10**14 connections
~10**15 operations/sec or 10**18 bits/sec
Human Brain
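The lower-bound arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch that assumes one operation per connection per spike; that assumption, and the function name, are illustrative and not from the slide.

```python
# Back-of-envelope estimate using only the figures quoted on the slide.
NEURONS = 10**10        # neurons in the brain
FAN_IN = 10**4          # inputs (connections) per neuron
SPIKES_PER_SEC = 10     # average firing rate per neuron

def brain_ops_per_sec(neurons=NEURONS, fan_in=FAN_IN, rate=SPIKES_PER_SEC):
    """Connections times firing rate (assumes one operation per connection per spike)."""
    connections = neurons * fan_in   # 10**14 connections
    return connections * rate        # 10**15 operations/sec

print(brain_ops_per_sec())
```

Under this assumption, the quoted 10**18 bits/sec would correspond to roughly 10**3 bits per operation; the slide does not state that factor explicitly.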
5
65K Processors, 5 CM-2 = 1.8 x 10**13 bits/sec
10**5 times slower than brain
Connection Machine CM-2
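The "10**5 times slower" figure is an order-of-magnitude ratio of the two quoted rates. A small sketch of that comparison, using only the numbers on the slides (the function names are hypothetical):

```python
import math

BRAIN_BITS_PER_SEC = 10**18   # lower bound quoted for the brain
CM2_BITS_PER_SEC = 1.8e13     # figure quoted for the CM-2 configuration

def slowdown(brain=BRAIN_BITS_PER_SEC, machine=CM2_BITS_PER_SEC):
    """How many times slower the machine is than the brain estimate."""
    return brain / machine

def order_of_magnitude(x):
    """Nearest power of ten, as an exponent."""
    return round(math.log10(x))

print(slowdown(), order_of_magnitude(slowdown()))
```

The exact ratio is a little under 6 x 10**4, which rounds to 10**5 as an order of magnitude.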
Early Computers
1950: 5,000 operations/sec; 1970-71: 1 Million Operations/sec
7
1974 – 1 MHz clock
1988 – 40 MHz
2002 – 2 GHz
2009 – P4 3.0 GHz, Quad-core 2.66 GHz
Intel Montecito chip: 1.72 Billion transistors
NVidia 280 series GPU: 1.4 Billion transistors
- Circuit complexity doubles every 18 months
- Computing power at a given cost doubles every 18 months
- Processor clock rates: 40% increase/year + more instr./cycle
- DRAM access times: 10% improvement/year, so caches are required
Advances in Microprocessor Technology
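To see why the gap between processor and memory speed forces the use of caches, the quoted rates can be compounded over several years. This is a rough sketch using the slide's 18-month doubling and the 40% and 10% annual rates; the function names are illustrative.

```python
def growth_factor(years, doubling_months=18):
    """Total growth after `years` if capacity doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

def processor_memory_gap(years, cpu_rate=0.40, dram_rate=0.10):
    """Ratio by which processor speed outpaces DRAM speed after `years`."""
    return ((1 + cpu_rate) / (1 + dram_rate)) ** years

print(growth_factor(15))          # fifteen years of 18-month doublings
print(processor_memory_gap(10))   # gap accumulated over a decade
```

Fifteen years of 18-month doublings gives a factor of 2**10 = 1024, while the processor-memory gap grows about elevenfold per decade. The latter is a lower bound, since the 40% figure covers clock rate only and not the additional instructions per cycle.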
8
Grand Challenge Applications
- cannot be solved in a reasonable amount of time with today's computers
- Environment, Ecosystems, Molecular Engineering, Cognition, Weapon Design, Artificial Intelligence
(near) Real-Time Applications
- Military/Defense Applications
- Space
- Financial Forecasting; Live data (e.g. online stock market data)
Applications
9
(near) Real-Time Applications
- Software as a Service (SaaS) delivery model
- ATMs, online banking
Data Intensive Applications
- Walmart – inventory management
- Data Mining
Applications
10
Computational Modeling and Simulation
- Science, Engineering, Social Sciences, …
- Parameter sweep applications
Animation and Movies
Applications
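A parameter sweep runs the same model over every combination of input values. The sketch below shows the idea with a hypothetical stand-in model (`simulate`) and made-up parameter names; a real sweep would call an actual simulation code.

```python
from itertools import product

def simulate(viscosity, velocity):
    """Hypothetical stand-in model: returns a simple ratio of its inputs."""
    return velocity / viscosity

def parameter_sweep(model, grid):
    """Run `model` once for every combination of values in `grid`."""
    names = list(grid)
    results = {}
    for values in product(*(grid[name] for name in names)):
        results[values] = model(**dict(zip(names, values)))
    return results

grid = {"viscosity": [0.5, 1.0, 2.0], "velocity": [1.0, 10.0]}
results = parameter_sweep(simulate, grid)
print(len(results), "independent runs")
```

Because each run is independent of the others, the runs can be distributed across processors with no communication between them.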
11
Compute Intensive Applications
Massive Data applications
Applications
12
Capability Computing
- Using the maximum computing power to solve a large problem in the shortest amount of time
Capacity computing
- Using efficient cost-effective computing power to solve
- somewhat large problems
- many small problems
Applications
13
Cooling
Speed of Light
Compute Bound Problems
I/O Bound Problems
Supercomputer Design Challenges
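The speed of light becomes a design constraint because a signal can only travel a short distance within one clock cycle. A minimal sketch of that calculation, using the vacuum speed of light as an upper bound (signals in real wires are slower):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # in vacuum; an upper bound for any signal

def distance_per_cycle_cm(clock_hz):
    """Farthest a signal could travel during one clock cycle, in centimetres."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 100

# 3.2 GHz is the PowerXCell 8i clock rate quoted later in this talk.
print(distance_per_cycle_cm(3.2e9))
```

At 3.2 GHz this is under 10 cm per cycle, which is one reason components must be packed densely, and dense packing in turn makes cooling harder.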
14
Pipelining and Vector Processing
Parallel and Distributed Processing
Liquid Cooling
Non-Uniform Memory Access
Striped Disks (RAID)
Parallel File System
Supercomputer Technologies
15
- Intrinsic parallelism
- Design of parallel algorithms
- Analysis of parallel algorithms
Parallel and Distributed Algorithms
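One common starting point for analysing a parallel algorithm is Amdahl's law, which bounds speedup by the fraction of work that must stay serial. The slide does not name a specific model, so this is offered only as a standard illustration:

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

def efficiency(parallel_fraction, processors):
    """Speedup per processor; 1.0 means perfect scaling."""
    return amdahl_speedup(parallel_fraction, processors) / processors

print(amdahl_speedup(0.95, 1024))
print(efficiency(0.95, 1024))
```

With 95% of the work parallelizable, 1024 processors give a speedup of less than 20, which is why the intrinsic parallelism of a problem matters as much as the processor count.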
16
PVM and MPI – Loosely connected clusters
OpenMP for Shared Memory Machines
Programming
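Message-passing programs typically give each process a block of the data, compute locally, and then combine the partial results. The sketch below simulates that pattern in a single Python process; it does not call MPI, PVM or OpenMP, and the names `block_range` and `simulated_parallel_sum` are illustrative.

```python
def block_range(rank, size, n):
    """Half-open index range [lo, hi) owned by `rank` out of `size` ranks."""
    base, extra = divmod(n, size)
    lo = rank * base + min(rank, extra)      # first `extra` ranks get one more item
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi

def simulated_parallel_sum(data, size):
    """Sum `data` as `size` ranks would: local partial sums, then a reduction."""
    partials = []
    for rank in range(size):                 # each iteration stands in for one process
        lo, hi = block_range(rank, size, len(data))
        partials.append(sum(data[lo:hi]))    # local work, no communication needed
    return sum(partials)                     # stands in for the reduction step

print(simulated_parallel_sum(list(range(100)), 7))
```

In an MPI program the final step would correspond to a reduction such as `MPI_Reduce`; in OpenMP a reduction clause on a parallel loop plays a similar role.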
17
Compilers
Limited success
Automatic Parallelization
Application Checkpointing
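Application checkpointing means the program periodically saves its own state so that a long run can resume after a failure instead of starting over. A minimal sketch, assuming a JSON file as the checkpoint format and a simulated crash (`stop_after`); both are illustrative choices, not taken from the talk.

```python
import json
import os
import tempfile

def run(total_steps, checkpoint_path, every=100, stop_after=None):
    """Accumulate a running sum, saving state every `every` steps.

    `stop_after` simulates a crash: the function returns None at that step.
    """
    state = {"step": 0, "acc": 0}
    if os.path.exists(checkpoint_path):          # resume from the last checkpoint
        with open(checkpoint_path) as f:
            state = json.load(f)
    while state["step"] < total_steps:
        state["acc"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, checkpoint_path)  # swap in the complete file
        if stop_after is not None and state["step"] >= stop_after:
            return None                          # simulated failure
    return state["acc"]

def demo():
    """Crash at step 250, then restart and finish from the step-200 checkpoint."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "checkpoint.json")
        assert run(1000, path, stop_after=250) is None
        return run(1000, path)

print(demo())
```

Writing to a temporary file and renaming it means a crash during the save does not leave a half-written checkpoint behind.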
18
Roadrunner applications
- National Security
- Planet: Earth and Environmental Sciences
e.g. ground water modeling
- Health: Biology, Chemistry, Life Sciences
- Science: Engineering, Technology
- Universe: Astronomy, Space, Astrophysics
-- Modeling the decay of the US nuclear arsenal
Current Supercomputer
19
Roadrunner
Los Alamos National Laboratory, Los Alamos, NM, USA
- >1 Petaflop: one quadrillion, i.e. a million billion (10**15), floating-point operations/sec (FLOPS)
- 1.71 Petaflop peak
- Weight – 500,000 pounds
- Power – 4 Megawatts
- Space – 6000 square feet
- Cabling – 57 miles
Current Supercomputer
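The performance and power figures above imply an energy efficiency for the machine. The sketch below works only from the slide's own numbers, taking ">1 Petaflop" as exactly 10**15 for a lower bound; the function name is illustrative.

```python
SUSTAINED_FLOPS = 1e15   # ">1 Petaflop" taken as exactly 1e15 (lower bound)
PEAK_FLOPS = 1.71e15     # quoted peak
POWER_WATTS = 4e6        # quoted power, 4 Megawatts

def mflops_per_watt(flops, watts):
    """Floating-point operations per second delivered per watt, in MFLOPS/W."""
    return flops / watts / 1e6

print(mflops_per_watt(SUSTAINED_FLOPS, POWER_WATTS))
print(mflops_per_watt(PEAK_FLOPS, POWER_WATTS))
```

These values follow from the quoted figures only and are not measured efficiencies.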
20
Roadrunner (Installation Year – 2008)
Los Alamos National Lab, USA
~ 3,250 compute nodes
- Compute node: two AMD Opteron dual-core microprocessors
- Each Opteron core: internally attached to one of four enhanced Cell microprocessors
- Enhanced Cell: performs double-precision arithmetic faster and can access more memory than the original Cell in a PlayStation 3. The entire machine has almost 13,000 Cells and half as many dual-core Opterons.
- Interconnection Network: off-the-shelf Infiniband
Current Supercomputer
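The processor counts on this slide are consistent with each other, as a quick tally shows. This sketch uses the approximate node count from the slide and assumes one Cell per Opteron core, as the slide describes.

```python
NODES = 3250                 # "~3,250 compute nodes"
OPTERONS_PER_NODE = 2        # two dual-core Opterons per node
CORES_PER_OPTERON = 2
CELLS_PER_OPTERON_CORE = 1   # each Opteron core is attached to one Cell

def roadrunner_counts(nodes=NODES):
    """Derive chip and core totals from the per-node configuration."""
    opterons = nodes * OPTERONS_PER_NODE
    opteron_cores = opterons * CORES_PER_OPTERON
    cells = opteron_cores * CELLS_PER_OPTERON_CORE
    return {"opterons": opterons, "opteron_cores": opteron_cores, "cells": cells}

print(roadrunner_counts())
```

The result, about 13,000 Cells and 6,500 Opterons, matches the "almost 13,000 Cells and half as many dual-core Opterons" stated above.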
21
Roadrunner (Installation Year – 2008)
DOE/NNSA/LANL
System Family - IBM Cluster
System Model - BladeCenter QS22 Cluster
Computer - BladeCenter QS22/LS21 Cluster, PowerXCell 8i 3.2 GHz / Opteron DC 1.8 GHz, Voltaire Infiniband
Operating System - Linux
Interconnect – Infiniband
Processor - PowerXCell 8i 3200 MHz (12.8 GFlops)
Current Supercomputer
22
Hardware: Building Blocks
• Building blocks – processors, memory, interconnection networks
• Processors
• Memory – main and secondary storage
• Interconnection networks
23
Hardware: Architectures
• Taxonomy: SISD, SIMD, MISD and MIMD
• Shared Memory Processing versus Distributed Memory Processing
• Symmetric Multi-Processing (SMP) versus Non-Uniform Memory Access (NUMA)
• Processors
• Clusters
24
Special Purpose Supercomputers
• Specially Programmed FPGA chips
• Custom VLSI Chips
• Reconfigurable Computing
• GPUs (Graphics Processing Units)
25
University of New Brunswick
High Performance Computing and Networking @ University of New Brunswick
“People, Research, Excellence”
ACEnet: Atlantic Computational Excellence Network
Hosting sites:
Member sites:
ACEnet
Atlantic Canada is a distributed environment
$30 million initiative
Waterways make networking solutions difficult (e.g. Cabot Strait)
ACEnet
World-class HPC facilities
Behave as a single, regionally distributed “computational power grid”
Create and operate sophisticated collaboration facilities to bind together geographically dispersed research communities.
Advanced Computational Research Lab (ACRL) Infrastructure
UNB Biology: Gary Saunders
UNB Chemistry: Scott Brownridge, Larry Calhoun, Ghislain Deslongchamps, Friedrich Grein
UNB Computer Science: Eric Aubanel, Virendra Bhavsar, Brad Nickerson, Ruth Shaw
UNB Text Processing Centre: Alan Burk, David Gants
UNB Geodesy: Petr Vanícek, Richard Langley
UNB Mathematics: Keith De’Bell, Abraham Punnen
UNB Mechanical Engineering: Mohammad Bagher Ayani, David Bonham, Andrew Gerber, Marwan Hassan, Esam Hussein
UNB Physics: Dr. Eugene K Ho, Dr. Zong-Chao Yan, Dr. Li-Hong Xu
UNB Forestry: Evelyn Richards
UNB Biomedical: Kevin Englehart
DAL Physics: Andrew Rutenberg
MTA Chemistry: Stacey Wetmore
MUN Computer Science: Dwight Kuo
Sick Kids Hospital, Toronto: Regis Pomes, Ching-Hsing Yu, Len Zaifman
StFX Computer Science: Laurence Yang
UofCalgary Computer Science: Peter Tieleman, Justin MacCallum
UdeM Environmental Studies: Yves Gagnon
UdeM Computer Science: Jalal Almhana
UPEI Physics: Sheldon Opps, James Polson
UofT Computer Science: Hue Sun Chan, Maria Sabaye Moghaddam
Major Users
ACEnet at UNB
Fundy: Sun cluster, AMD Opteron, 632 cores
ACEnet: 3324 cores
Internet connectivity > 2 Gbps at UNB
Collaboration Grid
Collaboration gear across Atlantic Canada
Lecture rooms equipped so ACEnet sites can share seminars and participate remotely
ACEnet cafés at each site sharing continuous video feeds
Desktop level collaboration equipment for personal communication
Access Grid streams tens to hundreds of Mbps across the CANARIE network
ACEnet
My Research Work
Special Purpose Computers for Military Applications
Design and development of MICRON and PLEXUS
Parallel Monte Carlo Algorithms
Graphics and Visualization
PaGrid
Artificial Intelligence – artificial neural networks, e-Business
Bioinformatics – Canadian Potato Genome project
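As a generic illustration of why Monte Carlo methods parallelize well (not a reconstruction of the specific algorithms from this research), each worker can draw its own random samples and only the final counts need to be combined. The sketch below estimates pi this way in a single process; the seeding scheme is a simplification, since distinct seeds give different streams but do not guarantee statistical independence.

```python
import random

def count_hits(seed, samples):
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)        # separate generator for each worker
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

def estimate_pi(workers=4, samples_per_worker=50_000, base_seed=2016):
    """Combine per-worker hit counts; each call to count_hits could run in parallel."""
    hits = [count_hits(base_seed + w, samples_per_worker) for w in range(workers)]
    return 4.0 * sum(hits) / (workers * samples_per_worker)

print(estimate_pi())
```

The workers never communicate until the final sum, so the work scales to many processors with very little overhead.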
Future
IBM Cyclops64 – supercomputer on a chip
C-DAC initiative for 2010 – petaflop machine
NCSA, USA 2011 – petaflop machine
NASA, SGI and Intel Pleiades – 10 petaflop by 2012
1 Exaflop (10**18 flops) by 2019
Human brain neural simulations – 10 exaflop by 2025
2-week Full Weather modeling – 1 zettaflop (10**21 flops) by 2030
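The projections above imply a particular growth rate. As a rough sketch, the doubling time needed to go from Roadrunner's 1 petaflop in 2008 to 1 exaflop in 2019 can be computed directly; the function name is illustrative and the endpoints are the ones quoted in this talk.

```python
import math

def doubling_time_years(start_flops, end_flops, years):
    """Doubling time implied by growing from start_flops to end_flops in `years`."""
    return years / math.log2(end_flops / start_flops)

# 1 petaflop (Roadrunner, 2008) to 1 exaflop (projected, 2019)
print(doubling_time_years(1e15, 1e18, 2019 - 2008))
```

A thousandfold increase in 11 years requires doubling roughly every 13 months, which is faster than the 18-month doubling quoted earlier for computing power at a given cost.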