1 Introduction: HPC goes mainstream Chokchai Box Leangsuksun Associate Professor, Computer Science Louisiana Tech University [email protected]
Dec 15, 2015
1
Introduction: HPC goes mainstream
Chokchai Box Leangsuksun
Associate Professor, Computer ScienceLouisiana Tech [email protected]
April 18, 2023
2
Outline
• Why HPC is critical technology ?
• Conclusion
April 18, 2023
3
Why HPC?
• High Performance Computing – Parallel , Supercomputing– Enabled by multiple high speed CPUs, networking, software etc –
fastest possible solution – Technologies that help solving non-trivial tasks including scientific,
engineering, medical, business entertainment and etc.
• Time to insights, Time to discovery, Times to markets
• BTW, HPC is not GRID!!!.
April 18, 2023
4
HPC Applications and Major Industries
• Finite Element Modeling – Auto/Aero
• Fluid Dynamics – Auto/Aero, Consumer Packaged Goods Mfgs,
Process Mfg, Disaster Preparedness (tsunami)
• Imaging– Seismic & Medical
• Finance – Banks, Brokerage Houses (Regression Analysis,
Risk, Options Pricing, What if, …)
• Molecular Modeling – Biotech and Pharmaceuticals
Complex Problems, Large Datasets, Long RunsComplex Problems, Large Datasets, Long Runs
This slide is from Intel presentation “Technologies for Delivering Peak Performance on HPC and Grid Applications”
April 18, 2023
5
Life Science Problem – an example of Protein Folding
• Take a computing year (in serial mode) to do molecular dynamics simulation for a protein folding problem
•Excerpted from IBM David Klepacki’s The future of HPC•Petaflop = a thousand trillion floating point operations per second
April 18, 2023
6
Disaster Preparedness - example
• Project LEAD– Severe Weather prediction
(Tornado) – OU leads. • HPC & Dynamically
adaptation to weather forecast
• Professor Seidel’s LSU CCT – Hurricane Route Prediction– Emergency Preparedness – Show Movie – HPC-enabled
Simulation
April 18, 2023
7
Did you know that Playstation 3 is a HPC/Supercomputer?
• 9 cores/CPUs in one chip. • Future gaming software is no longer graphic or multimedia only• This diagram is from an article from IBM Cell processor & compiler challenge
April 18, 2023
8
No Free Lunch (mainstream CPUs)
• CPU speed – plateaus 3-4 Ghz
• More cores in a single chip– Dual core is now– Multicore is imminent
• Traditional Applications won’t get a free rides
• Conversion to parallel computing (HPC, MT)
3-4 Ghz cap
This diagram is from “no free lunch article in DDJ
April 18, 2023
9
Cancer Gene-mining
• Unsuccessful on a uni-processor
• Our approach– Novel parallel gene-mining
algorithms– Input from microarray– Retain accuracy – Significantly speed up
(superlinear)
• IBM P5 supercomputer (128 node PPC).
0
20
40
60
80
100Bladder
Breast
Leukemia
Lung
Colorectal
Lymphoma
Melanoma
Ovary
Pancreas
Prostate
Renal
Mesothelioma
OvaMarker based Selection GeneSetMine based Selection
Time to run the algorithm, keeping number of nodes fixed
0
200
400
600
800
1000
1200
13 39 65 91
Number of processors
Tim
e ta
ken(
in s
ecs)
April 18, 2023
10
Significant indicators – why HPC now?
• No Free lunch in CPU speed up (Intel or AMD)– In past 1-2 years, CPU speed was flatten at 3+ Ghz– More CPUs in one chip – Dual core, multi-core chips– Traditional software won’t take advantage of these new processors – Personal/Desktop Supercomputing.
• Many real problems are highly computational intensive.– NSA uses supercomputing to do data mining– DOE – fusion, plasma, energy related (including weaponry).– Help solving many other important areas (nanotech, life science etc.)
• Giants recently sneeze out HPC– Bush’s state of union speech – 3 main S&T focus of which Supercomputing is one
of them– Bill Gates’ keynote speech at SC05 – MS goes after HPC
• Google search engine - 100,000 nodes• Playstation 3 is a personal supercomputing platform• Hollywood (Entertainment) is HPC-bound (Pixar – more than 3000 CPUs to
render animation)
“I propose to double the federal commitment to the most critical basic research programs in the physical sciences over the next 10 years. This funding will support the work of America's most creative minds as they explore promising areas such as nanotechnology, supercomputing, and alternative energy sources.”
Gorge W. Bush, 2005
April 18, 2023
11
HPC preparedness
• Build work forces that understand HPC paradigm & its applications– HPC/Grid Curriculum in IT/CS/CE/ICT– Offer HPC-enabling tracks to other disciplinary
(engineering, life science, physic, computational chem, business etc..)
– Training business community (e.g. HPC for enterprise ; Fluent certification, HA SLA certification)
– Bring awareness to public
• .
April 18, 2023
12
Introduction to Parallel computing
• Need more computing power– Improve the operating speed of processors & other
components• constrained by the speed of light, thermodynamic laws, &
the high financial costs for processor fabrication
– Connect multiple processors together & coordinate their computational efforts
• parallel computers• allow the sharing of a computational task among multiple
processors
April 18, 2023
13
How to Run Applications Faster ?
• There are 3 ways to improve performance:– Work Harder– Work Smarter– Get Help
• Computer Analogy–Using faster hardware
–Optimized algorithms and techniques used to solve computational tasks
–Multiple computers to solve a particular task
April 18, 2023
14
Era of Computing
– Rapid technical advances• the recent advances in VLSI technology
• software technology– OS, PL, development methodologies, & tools
• grand challenge applications have become the main driving force
– Parallel computing• one of the best ways to overcome the speed bottleneck
of a single processor
• good price/performance ratio of a small cluster-based parallel computer
April 18, 2023
15
HPC Level-setting
Definitions
• High performance computing is:– Computing that demands more than a single high-
market-volume workstation or server can deliver
• HPC is based on concurrency:– Concurrency: computing in which multiple tasks are
active at the same time
• Parallel computing occurs when you use concurrency to:– Solve bigger problems
– Solve a fixed-size problem in less time
April 18, 2023
16
HPC Level-setting
Hardware for Parallel Computing
§§SIMD has failed as a way to organize large-scale computers with multiple processors. It has SIMD has failed as a way to organize large-scale computers with multiple processors. It has succeeded, however, as a mechanism to increase instruction-level parallelism in modern succeeded, however, as a mechanism to increase instruction-level parallelism in modern microprocessors (in Intelmicroprocessors (in Intel®® MMX MMX™™ technology). technology).
Symmetric Multiprocesso
r (SMP)
Symmetric Multiprocesso
r (SMP)
Non-uniform Memory
Architecture (NUMA)
Non-uniform Memory
Architecture (NUMA)
Massively Parallel
Processor (MPP)
Massively Parallel
Processor (MPP)
Commodity
Cluster
Commodity
Cluster
Single Instruction Multiple Data (SIMD)§
Single Instruction Multiple Data (SIMD)§ Multiple Instruction
Multiple Data (MIMD)
Multiple Instruction Multiple Data (MIMD)
Parallel ComputersParallel Computers
Shared Address Space Disjoint Address Space
Distributed Computing
Distributed Computing
April 18, 2023
17
Scalable Parallel Computer Architectures
• MPP – A large parallel processing system with a shared-nothing
architecture– Consist of several hundred nodes with a high-speed
interconnection network/switch– Each node consists of a main memory & one or more processors
• Runs a separate copy of the OS
• SMP– 2-64 processors today– Shared-everything architecture– All processors share all the global resources available– Single copy of the OS runs on these systems
April 18, 2023
18
Scalable Parallel Computer Architectures• CC-NUMA
– a scalable multiprocessor system having a cache-coherent nonuniform memory access architecture
– every processor has a global view of all of the memory
• Distributed systems– considered conventional networks of independent computers– have multiple system images as each node runs its own OS– the individual machines could be combinations of MPPs, SMPs, clusters,
& individual computers
• Clusters– a collection of workstations of PCs that are interconnected by a high-
speed network– work as an integrated collection of resources – have a single system image spanning all its nodes
April 18, 2023
19
Cluster Computer and its Architecture
• A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource
• A node– a single or multiprocessor system with memory, I/O facilities, & OS – generally 2 or more computers (nodes) connected together– in a single cabinet, or physically separated & connected via a LAN– appear as a single system to users and applications – provide a cost-effective way to gain features and benefits
April 18, 2023
20
Cluster Computer Architecture
April 18, 2023
21
Beowulf
Head Node
Switch
Client Node1 Client Node2
Head Node
Compute nodes
•Login•Compile•Submit job
•Run tasks
April 18, 2023
22
Prominent Components of Cluster Computers (I)
• Multiple High Performance Computers– PCs– Workstations– SMPs (CLUMPS)– Distributed HPC Systems leading to
Metacomputing
April 18, 2023
23
Prominent Components of Cluster Computers (II)
• State of the art Operating Systems– Linux (Beowulf)
– Microsoft NT (Illinois HPVM)
– SUN Solaris (Berkeley NOW)
– IBM AIX (IBM SP2)
– HP UX (Illinois - PANDA)
– Mach (Microkernel based OS) (CMU)
– Cluster Operating Systems (Solaris MC, SCO Unixware, MOSIX (academic project)
– OS gluing layers (Berkeley Glunix)
April 18, 2023
24
Prominent Components of Cluster Computers (III)
• High Performance Networks/Switches– Ethernet (10Mbps), Fast Ethernet (100Mbps), – InfiniteBand (1-8 Gbps)– Gigabit Ethernet (1Gbps)– SCI (Dolphin - MPI- 12micro-sec latency)– ATM– Myrinet (1.2Gbps)– Digital Memory Channel– FDDI
April 18, 2023
25
Prominent Components of Cluster Computers (IV)
• Network Interface Card–Myrinet has NIC–InfiniteBand (HBA)–User-level access support
April 18, 2023
26
Prominent Components of Cluster Computers (VI)
• Cluster Middleware– Single System Image (SSI)– System Availability (SA) Infrastructure
• Hardware – DEC Memory Channel, DSM (Alewife, DASH), SMP Techniques
• Operating System Kernel/Gluing Layers– Solaris MC, Unixware, GLUnix
• Applications and Subsystems– Applications (system management and electronic forms)– Runtime systems (software DSM, PFS etc.)– Resource management and scheduling software (RMS)
• CODINE, LSF, PBS, NQS, etc.
April 18, 2023
27
Prominent Components of Cluster Computers (VII)
• Parallel Programming Environments and Tools– Threads (PCs, SMPs, NOW..)
• POSIX Threads• Java Threads
– MPI• Linux, NT, on many Supercomputers
– PVM– Software DSMs (Shmem)– Compilers
• C/C++/Java• Parallel programming with C++ (MIT Press book)
– RAD (rapid application development tools)• GUI based tools for PP modeling
– Debuggers– Performance Analysis Tools– Visualization Tools
April 18, 2023
28
Prominent Components of Cluster Computers (VIII)
• Applications– Sequential
– Parallel / Distributed (Cluster-aware app.)• Grand Challenging applications
– Weather Forecasting
– Quantum Chemistry
– Molecular Biology Modeling
– Engineering Analysis (CAD/CAM)
– ……………….
• PDBs, web servers,data-mining
April 18, 2023
29
Key Operational Benefits of Clustering
• High Performance• Expandability and Scalability• High Throughput• High Availability
April 18, 2023
30
Divide and Conquer
• Says 1 CPU– 1,000,000 elements– Numerical processing for 1
element = .1 secs– One computer will take
100,000 secs = 27.7 hrs
• Says 100 CPUs– .27 hr ~ 16 mins
April 18, 2023
31
Parallel Computing
• A big application is divided into Multiple tasks
• Total computation time– Computing time– Communication time
April 18, 2023
32
Summary
• HPC helps accelerates Time to insights, time to discovery and time to Market for challenging problems
• Divide and Conquer – Computing vs communication time
• Cluster computing is a predominant HPC system