Flexible, Scalable Mesh and Data Management using PETSc DMPlex

M. Lange¹, M. Knepley², L. Mitchell³, G. Gorman¹

¹ AMCG, Imperial College London
² Computation Institute, University of Chicago
³ Computing, Imperial College London

June 16, 2015

M. Lange, M. Knepley, L. Mitchell, G. Gorman: DMPlex Mesh Management
Motivation
Unstructured Mesh Management
Parallel Mesh Distribution
Firedrake
Summary
Motivation

Mesh management
- Many tasks are common across applications: mesh input, partitioning, checkpointing, ...
- File I/O can become a severe bottleneck!

Mesh file formats
- Range of mesh generators and formats
¹ F. Rathgeber, D. Ham, L. Mitchell, M. Lange, F. Luporini, A. McRae, G. Bercea, G. Markall, and P. Kelly. Firedrake: Automating the finite element method by composing abstractions. Submitted to ACM TOMS, 2015
² A. Logg, K.-A. Mardal, and G. Wells. Automated Solution of Differential Equations by the Finite Element Method. Springer, 2012
Firedrake
Firedrake - Automated Finite Element computation
- Implements UFL¹
- Outer framework in Python
  - Run-time C code generation
  - PyOP2: Assembly kernel execution framework
- Domain topology from DMPlex
  - Mesh generation and file I/O
  - Derive discretisation-specific
¹ M. Alnæs, A. Logg, K. Ølgaard, M. Rognes, and G. Wells. Unified Form Language: A domain-specific language for weak formulations of partial differential equations. ACM Transactions on Mathematical Software (TOMS), 40(2):9, 2014
Firedrake - Data structures
- DMPlex encodes topology
  - Parallel distribution
  - Application ordering
- Section encodes discretisation
  - Maps DAG to solution DoFs
  - Generated via FIAT element¹
  - Derives PyOP2 indirection maps for assembly
- SF performs halo exchange
  - DMPlex derives SF from
¹ R. Kirby. FIAT, A new paradigm for computing finite element basis functions. ACM Transactions on Mathematical Software (TOMS), 30(4):502–516, 2004
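As a rough illustration of how a Section-like table can turn mesh topology into an indirection map, the sketch below stores a two-triangle mesh as a DAG of points, assigns each point an offset and a DoF count, and gathers the DoFs in each cell's closure. It is a toy in plain Python, not the PETSc or Firedrake API: the names `cone`, `closure`, `build_section` and `cell_dof_map` are hypothetical, and each row is simply sorted rather than following an element's local DoF numbering.

```python
# Toy mesh DAG (assumed layout, not taken from the slides):
# cells 0-1, vertices 2-5, edges 6-10. `cone` lists the boundary of each point.
cone = {
    0: [6, 7, 8],    # cell 0 is bounded by edges 6, 7, 8
    1: [8, 9, 10],   # cell 1 shares edge 8 with cell 0
    6: [2, 3], 7: [3, 4], 8: [4, 2],
    9: [2, 5], 10: [5, 4],
}
points = range(11)
cells, vertices, edges = range(0, 2), range(2, 6), range(6, 11)


def closure(point):
    """Return `point` and every point reachable through cones, without repeats."""
    seen, stack = [], [point]
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.append(p)
            stack.extend(reversed(cone.get(p, [])))
    return seen


def build_section(dofs_per_point):
    """Give every point an (offset, count) pair into one flat DoF array."""
    section, offset = {}, 0
    for p in points:
        n = dofs_per_point(p)
        section[p] = (offset, n)
        offset += n
    return section, offset


def cell_dof_map(section):
    """Indirection map: for each cell, the DoF indices found in its closure."""
    rows = []
    for c in cells:
        row = []
        for p in closure(c):
            off, n = section[p]
            row.extend(range(off, off + n))
        rows.append(sorted(row))
    return rows


# P1-like layout: one DoF per vertex.
p1_section, p1_ndofs = build_section(lambda p: 1 if p in vertices else 0)
p1_map = cell_dof_map(p1_section)

# P2-like layout: one DoF per vertex and one per edge.
p2_section, p2_ndofs = build_section(lambda p: 1 if p in vertices or p in edges else 0)
p2_map = cell_dof_map(p2_section)
```

Changing the discretisation only changes the function passed to `build_section`; the topology and the closure traversal stay the same, which is the separation the slide describes.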
PyOP2 - Kernel execution
- Run-time code generation
  - Intermediate representation
  - Kernel optimisation via AST¹
- Overlapping communication
  - Core: Execute immediately
  - Non-core: Halo-dependent
  - Halo: Communicate while computing over core
  - Imposes ordering constraint

[Figure: Partition 0, Partition 1]
¹ F. Luporini, A. Varbanescu, F. Rathgeber, G.-T. Bercea, J. Ramanujam, D. Ham, and P. Kelly. Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly. Accepted for publication, ACM Transactions on Architecture and Code Optimization, 2015
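The ordering constraint can be pictured with a small sequential sketch: cells that touch only locally owned data are listed first so they can be processed before halo values arrive, and halo-dependent cells follow. This is a simplified stand-in, not PyOP2 code; `order_cells`, `sum_over_cells` and the `finish_exchange` callback (which plays the role of waiting on a non-blocking halo exchange) are hypothetical names, and the two-way core/non-core split is an assumption made for brevity.

```python
def order_cells(cell_vertices, halo_vertices):
    """Split local cells into core and non-core, listing core cells first."""
    halo = set(halo_vertices)
    core = [c for c, vs in enumerate(cell_vertices) if halo.isdisjoint(vs)]
    non_core = [c for c, vs in enumerate(cell_vertices) if not halo.isdisjoint(vs)]
    return core + non_core, len(core)


def sum_over_cells(cell_vertices, values, halo_vertices, finish_exchange):
    """Accumulate vertex values per cell, touching halo data only after the exchange."""
    order, n_core = order_cells(cell_vertices, halo_vertices)
    total = 0.0
    # Core cells read only owned values, so this loop could overlap with communication.
    for c in order[:n_core]:
        total += sum(values[v] for v in cell_vertices[c])
    # Stand-in for completing the halo exchange.
    finish_exchange(values)
    # Non-core cells need the freshly received halo values.
    for c in order[n_core:]:
        total += sum(values[v] for v in cell_vertices[c])
    return total


# One partition of a 1D mesh: vertices 0-2 are owned, vertex 3 is a halo vertex.
cell_vertices = [(0, 1), (2, 3), (1, 2)]
values = [1.0, 2.0, 3.0, None]  # halo value not yet received


def finish_exchange(vals):
    vals[3] = 4.0  # pretend the neighbouring partition sent this value


order, n_core = order_cells(cell_vertices, [3])
total = sum_over_cells(cell_vertices, values, [3], finish_exchange)
```

Because the halo entry starts as `None`, the core loop would fail if it read halo data, which makes the ordering requirement visible in the toy example.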
Firedrake - RCM reordering
- Mesh renumbering
  - Improves cache coherency
  - Reverse Cuthill-McKee (RCM)
- Combine RCM with PyOP2 ordering¹
  - Filter cell reordering
  - Apply within PyOP2 classes
  - Add DoFs per cell (closure)
[Figure panels: columns Native, RCM; rows Sequential, Parallel]
¹ M. Lange, L. Mitchell, M. Knepley, and G. Gorman. Efficient mesh management in Firedrake using PETSc-DMPlex. Submitted to SISC Special Issue, 2015
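For readers unfamiliar with RCM, the following is a minimal plain-Python sketch of the textbook algorithm: a breadth-first traversal that starts from a low-degree node, visits neighbours in order of increasing degree, and reverses the result. It does not reproduce the DMPlex or Firedrake implementation. The helper `reorder_within_classes` is one possible reading of "apply within PyOP2 classes" (keep the class blocks in place and sort each block by RCM position); that reading, and all function names, are assumptions.

```python
from collections import deque


def reverse_cuthill_mckee(adjacency):
    """Return an RCM ordering; `adjacency` maps node -> list of neighbours."""
    degree = {n: len(nbrs) for n, nbrs in adjacency.items()}
    visited, order = set(), []
    # Start each connected component from a lowest-degree unvisited node.
    for start in sorted(adjacency, key=lambda n: (degree[n], n)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            # Queue unvisited neighbours by increasing degree.
            for nbr in sorted(adjacency[node], key=lambda n: (degree[n], n)):
                if nbr not in visited:
                    visited.add(nbr)
                    queue.append(nbr)
    order.reverse()
    return order


def bandwidth(adjacency, order):
    """Largest distance in `order` between two connected nodes."""
    position = {node: i for i, node in enumerate(order)}
    return max(
        (abs(position[a] - position[b]) for a in adjacency for b in adjacency[a]),
        default=0,
    )


def reorder_within_classes(rcm_order, classes):
    """Keep each class block where it is, but sort its entries by RCM position."""
    position = {node: i for i, node in enumerate(rcm_order)}
    return [sorted(block, key=position.__getitem__) for block in classes]


# A chain 3-0-4-1-2 whose labels are scrambled relative to its connectivity.
adjacency = {0: [3, 4], 1: [4, 2], 2: [1], 3: [0], 4: [0, 1]}
natural = [0, 1, 2, 3, 4]
rcm = reverse_cuthill_mckee(adjacency)
blocks = reorder_within_classes(rcm, [[0, 1, 4], [2, 3]])
```

On this chain the ordering places connected nodes next to each other, so data touched together sits closer together in memory, which is the motivation given on the slide.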
Firedrake Performance
Indirection cost of assembly loops:
- Cell integral: L = u * dx
- Facet integral: L = u('+') * dS
[Figure: 100 loops on P1. Time [sec] (log scale, 10^-2 to 10^0) against number of processors (1, 2, 6, 12, 24, 48, 96). Series: Cell integral, RCM; Facet integral, RCM; Cell integral, Native; Facet integral, Native.]

[Figure: 100 loops on P3. Time [sec] (log scale, 10^-2 to 10^0) against number of processors (1, 2, 6, 12, 24, 48, 96). Series: Cell integral, RCM; Facet integral, RCM; Cell integral, Native; Facet integral, Native.]
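To make the "indirection" in a cell-integral loop concrete, the sketch below evaluates the integral of a piecewise-linear function over a small triangle mesh by gathering vertex data through a cell-to-node map. It assumes that `u` is a P1 coefficient function, which is one reading of `L = u * dx`, and it is a plain-Python toy rather than generated Firedrake or PyOP2 code; `triangle_area` and `integrate_p1` are hypothetical names.

```python
def triangle_area(a, b, c):
    """Area of the triangle with 2D corner points a, b, c."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def integrate_p1(coords, cell_nodes, u):
    """Sum over cells of area times mean nodal value (exact for linear u)."""
    total = 0.0
    for nodes in cell_nodes:
        # Indirect access: every read goes through the cell -> node map.
        local_u = [u[n] for n in nodes]
        local_x = [coords[n] for n in nodes]
        total += triangle_area(*local_x) * sum(local_u) / 3.0
    return total


# Unit square split into two triangles, with u = x + y sampled at the vertices.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
cell_nodes = [(0, 1, 2), (0, 2, 3)]
u = [x + y for x, y in coords]
integral = integrate_p1(coords, cell_nodes, u)
```

How close together the entries of each `nodes` tuple lie in memory depends on the mesh numbering, so the cost of these gathers is what a renumbering such as RCM is meant to influence.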