Confidence in Engineering Simulation: The Next 10 Years of CAE in Mexico
nafems.org/americas
May 23rd | Mexico City

The Effect of HDR InfiniBand and In-Network Computing on CAE Simulations

Gerardo Cisneros-Stoianowski

HPC-AI Advisory Council

Jul 20, 2020




The HPC-AI Advisory Council

• World-wide HPC non-profit organization

• More than 400 member companies / universities / organizations

• Bridges the gap between HPC-AI usage and its potential

• Provides best practices and a support/development center

• Explores future technologies and future developments

• Leading edge solutions and technology demonstrations


HPC Advisory Council Members


HPC-AI Advisory Council Cluster Center (Examples)

• Supermicro / Foxconn 32-node cluster

– Dual Socket Intel® Xeon® Gold 6138 CPU @ 2.00 GHz

• Dell™ PowerEdge™ R730/R630 36-node cluster

– Dual Socket Intel® Xeon® 16-core CPUs E5-2697A V4 @ 2.60 GHz

• IBM S822LC POWER8 8-node cluster

– Dual Socket IBM POWER8 10-core CPUs @ 2.86 GHz

– GPU: NVIDIA Kepler K80 GPUs


Multiple Applications Best Practices Published


Data as a Resource

20th Century 21st Century


From CPU-Centric to Data-Centric Data Centers

(Figure labels: Everything, CPU, Network)


From CPU-Centric to Data-Centric Data Centers

(Figure: In-CPU Computing vs. In-Network Computing; layers shown: Workload, Communication Framework (MPI), Network Functions)


Exponential Data Growth Everywhere

• Cloud and Web 2.0

• Big Data

• Enterprise

• Business Intelligence

• HPC

• Storage

• Security

• Machine Learning

• Internet of Things

Source: IDC


In-Network Computing

• CPU-Centric (Onload)

– Must Wait for the Data

– Creates Performance Bottlenecks

• Data-Centric (Offload)

– Analyze Data as it Moves!

– Higher Performance and Scale

(Figure labels: Onload Network, In-Network Computing, CPU, GPU, IPU)


SHARP - Scalable Hierarchical Aggregation and Reduction Protocol

• Reliable Scalable General Purpose Primitive

– In-network Tree based aggregation mechanism

– Large number of groups

– Multiple simultaneous outstanding operations

• Applicable to Multiple Use-cases

– HPC Applications using MPI / SHMEM

– Distributed Machine Learning applications

• Scalable High Performance Collective Offload

– Barrier, Reduce, All-Reduce, Broadcast and more

Topology (Physical Tree)
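
The tree-based aggregation idea above can be illustrated with a small host-side simulation. This is only a sketch in plain Python, not SHARP itself or any vendor API: the function name `tree_allreduce`, the `radix` parameter, and the sample values are hypothetical. Each pass of the loop stands for one layer of switches combining the partial results of its children, so the number of reduction steps grows with the depth of the tree rather than with the number of hosts.

```python
from operator import add
from typing import Callable, List, Tuple


def tree_allreduce(values: List[float], radix: int = 2,
                   op: Callable[[float, float], float] = add
                   ) -> Tuple[List[float], int]:
    """Reduce `values` up a tree of the given radix, then broadcast the result.

    Returns the per-host results and the number of tree levels traversed
    on the way up. Hypothetical illustration; not the SHARP API.
    """
    if not values:
        raise ValueError("need at least one host value")
    if radix < 2:
        raise ValueError("radix must be at least 2")

    level = list(values)
    depth = 0
    # Each pass models one layer of switches: every switch combines
    # the partial results arriving from up to `radix` children.
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), radix):
            group = level[i:i + radix]
            acc = group[0]
            for v in group[1:]:
                acc = op(acc, v)
            next_level.append(acc)
        level = next_level
        depth += 1

    # The root now holds the final result; handing it back to every host
    # gives the all-reduce semantics (all hosts see the same value).
    return [level[0]] * len(values), depth


if __name__ == "__main__":
    results, depth = tree_allreduce([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], radix=2)
    print(results[0], depth)
```

Raising `radix` in this sketch shortens the tree for the same number of hosts, which is the intuition behind aggregating inside high-radix switches instead of on the hosts.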


Micro Benchmark – MPI Allreduce Latency

• Oak Ridge National Laboratory – CORAL Summit Supercomputer


Micro Benchmark – MPI Allreduce Throughput


OpenFOAM

• The OpenFOAM® (Open Field Operation and Manipulation) CFD Toolbox is an open source CFD application that can simulate

– Complex fluid flows involving chemical reactions, turbulence, and heat transfer

– Solid dynamics

– Electromagnetics

– The pricing of financial options

• OpenFOAM support can be obtained from OpenCFD Ltd


OpenFOAM Performance (motorBike_160)

27%


OpenFOAM Scalability per Interconnect Technology


OpenFOAM Scalability

• University of Toronto Niagara Supercomputer

• Dragonfly+ InfiniBand EDR

• 91% Scalability
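
One common way to arrive at a scalability figure such as the 91% above is parallel efficiency: the measured speedup divided by the ideal (linear) speedup. The slide does not say how its number was computed, so the definition below is an assumption; the function name and the sample node counts and timings are hypothetical and are not measurements from the system above.

```python
def scaling_efficiency(base_nodes: int, base_time: float,
                       nodes: int, run_time: float) -> float:
    """Parallel efficiency relative to a baseline run (1.0 means linear scaling)."""
    if min(base_nodes, nodes) <= 0 or min(base_time, run_time) <= 0:
        raise ValueError("node counts and run times must be positive")
    speedup = base_time / run_time    # how much faster the larger run finished
    ideal = nodes / base_nodes        # speedup expected from perfect linear scaling
    return speedup / ideal


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    eff = scaling_efficiency(base_nodes=1, base_time=1000.0, nodes=16, run_time=68.7)
    print(f"{eff:.0%}")
```

Choosing the baseline matters: efficiency measured against a single node is usually less flattering than efficiency measured against the smallest multi-node run, so the baseline should be stated alongside the percentage.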


ANSYS Fluent MPI Performance

30%


LSTC LS-DYNA

• LS-DYNA

– A general purpose structural and fluid analysis simulation software package capable of simulating complex real world problems

– Developed by the Livermore Software Technology Corporation (LSTC)

• LS-DYNA used by

– Automobile

– Aerospace

– Construction

– Military

– Manufacturing

– Bioengineering


3cars Profiling - % of MPI Time


3cars Profiling – Communication Balance


3cars Profiling – Message Buffer Size


3cars Profiling – Memory Usage


3 Vehicle Collision (3cars)


3 Vehicle Collision (3cars)


Summary

• HPC cluster environments demand high connectivity throughput and low latency, together with low CPU overhead, network flexibility, and high efficiency

• Fulfilling these demands enables the maintenance of a balanced system that can achieve high application performance and high scaling

• With the increase in the number of CPU cores and application threads, there is a need to develop a new HPC cluster architecture: a data-focused architecture

• The Co-Design collaboration enables the development of In-Network Computing technology that breaks the performance and scalability barriers

• The OpenFOAM, ANSYS Fluent, and LS-DYNA applications were benchmarked for this study to demonstrate the significant advantages of HDR InfiniBand as well as linear scalability with In-Network Computing technology


2019 HPC-AI Advisory Council Activities

• HPC-AI Advisory Council

– More than 400 members, http://www.hpcadvisorycouncil.com/

– Application best practices, case studies

– Benchmarking center with remote access for users

– World-wide conferences

• 2019 Conferences

– USA (Stanford University) – February

– Switzerland (CSCS) – April

– Australia – August

– Spain (BSC) – September

– China (HPC China) – October

• 2019 Competitions

– APAC HPC-AI Competition – March

– China – 6th Annual RDMA Competition – May

– ISC Germany – 7th Annual Student Cluster Competition – June

• For more information – www.hpcadvisorycouncil.com



Thank You!
