Experiences with Large Scale Numerical Simulation
Lehrstuhl für Informatik 10 (Systemsimulation)
www10.informatik.uni-erlangen.de
Dundee, June 28, 2005
Ulrich Rüde ([email protected])
Material science and process technology: Metal Foams, Nano Technology
Biomedical Technology: The Inverse EEG problem
High End Computing: Trends in High End Numerical Computing, Parallel Hierarchical Hybrid Grids (HHG) for FE simulations, Parallel Lattice Boltzmann Methods for Free Surface Flow
Conclusions
Part I
Motivation: Computational Science and Engineering (CSE)
Motivation
“The panel's overarching finding is that a new age has dawned in scientific and engineering research …”
(from the “NSF report on Cyberinfrastructure”, Feb. 2003)
… this revolution is driven by
Simulation for Technology and Science
PITAC Report to the US President on
Computational science is now indispensable to the solution of complex problems in every sector, from traditional science and engineering domains to such key areas as national security, public health, and economic innovation. Advances in computing and connectivity make it possible to develop computational models and capture and analyze unprecedented amounts of experimental and observational data to address problems previously deemed intractable or beyond imagination.
The Two Principles of Science
Theory: Mathematical Models, Differential Equations, Newton
Experiments: Observation and prototypes
Virtual Experiments, Virtual Prototypes
Virtual Reality
Algorithmic Modelling for Physics, Chemistry;
Electrical, Mechanical, Chemical Engineering; Material Sciences
Bio- and Medical Sciences, …
CSE is a broad multidisciplinary area that encompasses applications in science/engineering, applied mathematics, numerical analysis, and computer science. Computer models and computer simulations have become an important part of the research repertoire, supplementing (and in some cases replacing) experimentation. Going from application area to computational results requires domain expertise, mathematical modeling, numerical analysis, algorithm development, software implementation, program execution, analysis, validation and visualization of results. CSE involves all of this.
CSE makes use of the techniques of applied mathematics and computer science for the development of problem-solving methodologies and robust tools which will be the building blocks for solutions to scientific and engineering problems of ever-increasing complexity. It differs from mathematics or computer science in that analysis and methodologies are directed specifically at the solution of problem classes from science and engineering, and will generally require a detailed knowledge or substantial collaboration from those disciplines. The computing and mathematical techniques used may be more domain specific, and the computer science and mathematics skills needed will be broader. It is more than a scientist or engineer using a canned code to generate and visualize results (skipping all of the intermediate steps).
SIAM's Definition of CSE (2): What is it NOT!
Part IIa
Metal Foams
In collaboration with the Institut für Werkstoffwissenschaften
Lehrstuhl Werkstoffkunde und Technologie der Metalle WTM (R.F. Singer, C. Körner)
Glass, Ceramics, Metals, Polymers
Structural Properties: stiffness, energy absorption, damping
Functional Properties: burner, shock absorber, heat exchanger, batteries
large, dynamic surface expansion
Examples of Foams
Towards Simulating Metal Foams
Bubble growth, coalescence, collapse, drainage, rheology, etc. are still poorly understood
• Simulation as a tool to better understand, control and optimize the process
The Lattice-Boltzmann Method
Based on cellular automata
Introduced by von Neumann around 1940
Famous: Conway’s Game of Life
Complex system with simple rules
Regular grid
Local rules specifying time evolution
Intrinsically parallel for model & simulation, similar to elliptic PDE solvers
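To make "regular grid" and "local rules" concrete, here is a minimal sketch of one generation of Conway's Game of Life, the cellular automaton named above. The periodic boundary and the names `life_step` and `blinker` are assumptions made for this illustration; they are not taken from the slides.

```python
def life_step(grid):
    """Advance Conway's Game of Life by one generation (periodic boundaries)."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Local rule: only the 8 nearest neighbours of a cell matter.
            n = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Birth with exactly 3 neighbours, survival with 2 or 3.
            new[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
    return new


# A vertical "blinker" flips to horizontal and back every two generations.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
after_one = life_step(blinker)
after_two = life_step(after_one)
```

Every cell is updated from the old grid only, so all cells could be processed at once; this is the sense in which such schemes are intrinsically parallel.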
The Lattice-Boltzmann Method
Weakly compressible approximation of the Navier-Stokes equations
Easy implementation
Applicable for small Mach numbers (< 0.1)
Easy to adapt, e.g. for
Complicated or time-varying geometries
Free surfaces
Additional physical and chemical effects
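The small-Mach-number restriction can be checked before a run. The sketch below assumes a standard lattice such as D2Q9 or D3Q19, for which the speed of sound in lattice units is 1/sqrt(3); the slides do not say which lattice is used, and the helper names are hypothetical.

```python
import math

# Lattice speed of sound in lattice units (assumption: D2Q9/D3Q19-type lattice).
CS_LATTICE = 1.0 / math.sqrt(3.0)


def lattice_mach(u_lattice):
    """Mach number of a flow speed given in lattice units (cells per time step)."""
    return abs(u_lattice) / CS_LATTICE


def is_low_mach(u_lattice, limit=0.1):
    """True if the speed respects the small-Mach-number limit quoted above."""
    return lattice_mach(u_lattice) < limit


slow_ok = is_low_mach(0.05)   # Ma is about 0.087
fast_ok = is_low_mach(0.10)   # Ma is about 0.173
```

Under this assumption a velocity of 0.05 lattice units stays below the limit, while 0.1 already exceeds it.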
The Lattice-Boltzmann Method
Real valued representation of particles
Discrete velocities and positions
Algorithm consists of two steps:
Stream
Collide
The Stream Step
Move particle distribution functions along corresponding velocity vector
Normalized time step, cell size, and particle speed
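A minimal sketch of the stream step, assuming a two-dimensional D2Q9 velocity set and periodic boundaries (neither is specified on the slide). With normalized time step and cell size, each distribution function moves exactly one cell along its own velocity vector.

```python
# D2Q9 velocity set (assumed for illustration): rest, 4 axis, 4 diagonal directions.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]


def stream(f):
    """Move every distribution f[i][y][x] one cell along velocity i (periodic)."""
    ny, nx = len(f[0]), len(f[0][0])
    f_new = [[[0.0] * nx for _ in range(ny)] for _ in VELOCITIES]
    for i, (cx, cy) in enumerate(VELOCITIES):
        for y in range(ny):
            for x in range(nx):
                f_new[i][(y + cy) % ny][(x + cx) % nx] = f[i][y][x]
    return f_new


# Two test "particles" on a 4 x 3 grid.
nx, ny = 4, 3
f0 = [[[0.0] * nx for _ in range(ny)] for _ in VELOCITIES]
f0[1][0][0] = 1.0   # moving in +x from cell (x=0, y=0)
f0[5][2][3] = 2.0   # moving diagonally from cell (x=3, y=2), wraps around
f1 = stream(f0)
total_mass = sum(v for plane in f1 for row in plane for v in row)
```

Streaming only copies values, so the total of all distribution functions is unchanged.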
The Collide Step
“Computes collisions” of particles in cell
Weigh equilibrium velocities and velocities from streaming depending on fluid viscosity
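One common way to realize this weighting is the single-relaxation-time (BGK) collision, sketched below for one cell of a D2Q9 lattice. The slides do not name the collision operator or the lattice, so both are assumptions, as are the function names. The viscosity (in lattice units) sets the relaxation time via tau = 3 * viscosity + 0.5, and the blending weight is omega = 1 / tau.

```python
# D2Q9 velocities and weights (assumed lattice; not stated on the slide).
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order equilibrium distributions for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(VELOCITIES, WEIGHTS):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def collide_cell(f, viscosity):
    """BGK collision in one cell: blend streamed values with the equilibrium."""
    omega = 1.0 / (3.0 * viscosity + 0.5)
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    feq = equilibrium(rho, ux, uy)
    return [(1.0 - omega) * fi + omega * fe for fi, fe in zip(f, feq)]


f_in = [0.5, 0.2, 0.1, 0.05, 0.05, 0.03, 0.03, 0.02, 0.02]
f_out = collide_cell(f_in, viscosity=0.1)
```

The collision changes the individual distribution functions but leaves the cell's mass and momentum untouched, which is what allows the scheme to approximate the Navier-Stokes equations.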
Collaborators: Univ. of Utah (Chris Johnson), Ovidius Univ. Constanta (C. Popa), Bart Vanrumste (Gent, Univ. of Canterbury, New Zealand), G. Greiner, F. Fahlbusch
Part IIIa
High Performance Computing
Trends in High End Computing
Moore's Law in Semiconductor Technology (F. Hossfeld)
[Figure: processor generations; surviving labels: 80468, Pentium Pro]
Information Density & Energy Dissipation (adapted by F. Hossfeld from C. P. Williams et al., 1998)
[Figure: Atoms/Bit and Energy/logic Operation [pico-Joules] versus Year, 1950 to 2020; annotations: kT, Semiconductor Technology, ≈ 2017]
Current Challenge: Parallelism on all levels and The Memory Wall
Parallel computing is easy, good (single) processor performance is difficult (B. Gropp, Argonne)
There has been no significant progress in High Performance Computing over the past 5 years (H. Simon, NERSC)
resolves geometry of problem domain
Patch-wise regular refinement
applied repeatedly to every cell of the coarse grid
generates nested grid hierarchies naturally suitable for geometric multigrid algorithms
New: Modify storage formats and operations on the grid to exploit the regular substructures
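As a rough illustration of regular refinement, the sketch below splits every triangle of a coarse 2D mesh into four by inserting edge midpoints. Triangles in 2D are chosen here only for brevity; the element types used by HHG are not described in this part of the transcript, and the function and variable names are hypothetical.

```python
def refine(vertices, triangles):
    """One level of regular refinement: every triangle becomes four."""
    vertices = list(vertices)   # coarse vertices keep their indices
    midpoint_of = {}            # edge (lo, hi) -> index of its midpoint vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint_of:   # shared edges get a single midpoint
            (xa, ya), (xb, yb) = vertices[a], vertices[b]
            vertices.append(((xa + xb) / 2.0, (ya + yb) / 2.0))
            midpoint_of[key] = len(vertices) - 1
        return midpoint_of[key]

    fine = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        fine += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, fine


# Coarse input grid: the unit square cut into two triangles.
coarse_vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
coarse_triangles = [(0, 1, 2), (0, 2, 3)]
v1, t1 = refine(coarse_vertices, coarse_triangles)
v2, t2 = refine(v1, t1)
```

Because each level only appends vertices, the coarse grid's vertices are a subset of every finer grid's, which gives the nested hierarchy that geometric multigrid relies on. Inside each coarse triangle the refined points form a regular pattern, and that regularity is what the modified storage formats can exploit.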
Common misconceptions
Hierarchical hybrid grids (HHG)
are not yet another block structured grid
• HHG are more flexible (unstructured, hybrid input grids)
are not yet another unstructured geometric multigrid package
• HHG achieve better performance: unstructured treatment of regular regions does not improve performance
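The difference between unstructured and structured treatment of a regular region can be sketched with a 1D model operator. This is an assumption chosen for brevity and does not show HHG's actual data structures. The same stencil (-1, 2, -1) is applied once through a general sparse (CSR) format with index arrays and once as a fixed stencil.

```python
def apply_csr(row_ptr, col_idx, values, x):
    """Unstructured treatment: sparse matrix-vector product in CSR format."""
    y = []
    for r in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            s += values[k] * x[col_idx[k]]   # indirect access through col_idx
        y.append(s)
    return y


def apply_stencil(stencil, x):
    """Structured treatment: one fixed stencil for all interior points."""
    left, centre, right = stencil
    return [left * x[i - 1] + centre * x[i] + right * x[i + 1]
            for i in range(1, len(x) - 1)]


def laplace_csr(n):
    """CSR arrays for the interior rows of the stencil (-1, 2, -1) on n points."""
    row_ptr, col_idx, values = [0], [], []
    for i in range(1, n - 1):
        col_idx += [i - 1, i, i + 1]
        values += [-1.0, 2.0, -1.0]
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, values


x = [float(i * i) for i in range(8)]
y_csr = apply_csr(*laplace_csr(len(x)), x)
y_stencil = apply_stencil((-1.0, 2.0, -1.0), x)
```

Both routes give the same result, but the CSR version stores three values and three column indices per row and reads `x` indirectly, while the stencil version stores three numbers in total and accesses memory with a fixed stride.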
Refinement example
[Figure sequence: Input Grid; Refinement Level One; Refinement Level Two; Structured Interior; Edge Interior]
Problems and Solutions
Problems with C++ on Hitachi
Only alpha version quality of C++
Excessive compile times
Poor code quality
Solution for gridlib-HHG
conservative C++
resorting to mixed language programming C++/F77 (after painful experience)
Results, Scaling, Efficiency (results by F. Hülsemann, Ben Bergen)
Brick-shaped Finite elements
Performance lousy on a single node!
Conditionals: 2,9 SLBM, 51 free surface LBM
Pentium 4: almost no degradation ~ 10%
SR 8000: enormous degradation (pseudo-vector, predictable jumps)
Part IV
Conclusions
Conclusions (1)
High performance simulation still requires “heroic programming”
Parallel Programming is easy, node performance is difficult (B. Gropp, Argonne)
Which architecture?
ASCI-type: custom CPU, massively parallel cluster of SMPs
• nobody has been able to show that these machines scale efficiently, except on a few very special applications and using enormous human effort
Earth-simulator-type: Vector CPU, as many CPUs as affordable
• impressive performance on vectorizable code, but need to check with more demanding data and algorithm structures
Hitachi Class: modified custom CPU, cluster of SMPs
• excellent performance on some codes, but unexpected slowdowns on others, too exotic to have a sufficiently large software base
Conclusions (2)
Which data structures?
structured (inflexible)
unstructured (slow)
HHG (high development effort, even prototype 50 K lines of code)
meshless … (useful in niches)
Where are we going?
the end of Moore’s law
nobody builds CPUs with numerical simulation requirements high on the list of priorities
petaflops: 100,000 processors and we can hardly handle 1000
It’s the locality - stupid!
the memory wall
• latency
• bandwidth
Distinguish between algorithms where control flow is
• data independent: latency hiding techniques (pipelining, prefetching, etc)
Bavarian Graduate School in Computational Engineering (with TUM, Jan. 2005)
Special International PhD program: Identification, Optimization and Optimal Control for Engineering Applications (with Bayreuth and Würzburg) starting Jan. 06