-
INTRODUCTION TO FREEFEM WITH AN EMPHASIS ON PARALLEL COMPUTING
Pierre Jolivet
v2020.07
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial
-
INTRODUCTION
-
ACKNOWLEDGEMENTS I
◦ University of Tsukuba, Tokyo, Japan
◦ Institute of Mathematics, University of Seville, Spain
◦ CMAP, École Polytechnique, France
-
ACKNOWLEDGEMENTS II
◦ FreeFEM BDFL Frédéric Hecht
◦ contributions from Christopher M. Douglas
◦ HPDDM feedback
  • Pierre-Henri Tournier
  • Pierre Marchand
◦ PETSc/SLEPc feedback
  • Johann Moulin
  • Julien Garaud
-
PREREQUISITES (THE MORE THE BETTER)
◦ FreeFEM [Hecht 2012] https://github.com/FreeFem/FreeFem-sources
◦ Gmsh [Geuzaine and Remacle 2009] http://gmsh.info
◦ ParaView https://www.paraview.org
◦ MPI https://www.mpi-forum.org
◦ HPDDM [Jolivet, Hecht, Nataf, et al. 2013] https://github.com/hpddm/hpddm
◦ PETSc [Balay et al. 1997] https://gitlab.com/petsc/petsc
◦ SLEPc [Hernandez et al. 2005] https://gitlab.com/slepc/slepc
-
PROGRAM OF THE LECTURE
1. Introduction
2. Finite elements
3. FreeFEM
4. Shared-memory parallelism
5. Distributed-memory parallelism
6. Overlapping Schwarz methods
7. Substructuring methods
8. PETSc
9. SLEPc
10. Applications
-
FINITE ELEMENTS
-
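MODEL PROBLEM
\[ -\Delta u = f \text{ in } \Omega, \qquad u = g_D \text{ on } \Gamma_D, \qquad \partial_n u = g_N \text{ on } \Gamma_N \]
[Figure: domain Ω in the (x, y) plane, with boundary parts Γ_D and Γ_N]
◦ essential boundary conditions (Dirichlet, on Γ_D)
◦ natural boundary conditions (Neumann, on Γ_N)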
-
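VARIATIONAL FORMULATION
Green’s theorem
Find $u \in H^1(\Omega)$ such that
\[ \int_\Omega \nabla u \cdot \nabla v = \int_\Omega f v + \int_{\Gamma_N} g_N v, \quad \forall v \in H^1_{\Gamma_D}(\Omega) \]
\[ u = g_D \text{ on } \Gamma_D \]
◦ unknown function u
◦ test function v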
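The step from the model problem to this formulation can be written out as a short derivation. It is a sketch under the usual assumptions (not stated on the slide) that $\partial\Omega = \Gamma_D \cup \Gamma_N$ and that the test function $v$ vanishes on $\Gamma_D$, which is what the notation $H^1_{\Gamma_D}(\Omega)$ is taken to mean here.

```latex
\begin{align*}
  -\int_\Omega \Delta u \, v &= \int_\Omega f v
    && \text{multiply by } v \text{ and integrate} \\
  \int_\Omega \nabla u \cdot \nabla v - \int_{\partial\Omega} \partial_n u \, v &= \int_\Omega f v
    && \text{Green's theorem} \\
  \int_\Omega \nabla u \cdot \nabla v - \int_{\Gamma_N} g_N v &= \int_\Omega f v
    && v = 0 \text{ on } \Gamma_D,\ \partial_n u = g_N \text{ on } \Gamma_N \\
  \int_\Omega \nabla u \cdot \nabla v &= \int_\Omega f v + \int_{\Gamma_N} g_N v
\end{align*}
```

The Neumann datum enters through the boundary integral, hence "natural", while the Dirichlet datum has to be imposed on $u$ itself, hence "essential".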
-
MESH AND FINITE ELEMENTS
◦ $\Omega$ discretized by $\Omega_h$ with $n_h$ elements
◦ $u$ discretized by $u_h = \sum_{i=1}^{N_h} u_h(i) \varphi_i$, smooth w.r.t. $\Omega_h$
  $u_h \in H^1(\Omega_h) \iff u_h$ is continuous
◦ basis functions $\{\varphi_i\}_{i=1}^{N_h}$
-
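ASSEMBLY PROCEDURES
Matrix
\[ \forall (i, j) \in \llbracket 1; N_h \rrbracket^2, \quad A_{ij} = \int_{\Omega_h} \nabla \varphi_j \cdot \nabla \varphi_i \]
=⇒ only integrate in the intersection of both supports
Right-hand side
\[ \forall i \in \llbracket 1; N_h \rrbracket, \quad b_i = \int_{\Omega_h} f \varphi_i + \int_{\Gamma_N^h} g_N \varphi_i \]
=⇒ numerical integration using quadrature rules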
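The two assembly loops can be sketched in a few lines of plain Python. This is an illustration only, under assumptions that are not in the slides: a uniform mesh of the interval (0, 1), P1 (hat) basis functions, no Neumann term, and a two-point Gauss rule for the right-hand side. The function name `assemble_p1` is hypothetical.

```python
def assemble_p1(n_elements, f):
    """Assemble the P1 stiffness matrix and load vector on a uniform mesh of (0, 1)."""
    n_nodes = n_elements + 1
    h = 1.0 / n_elements
    A = [[0.0] * n_nodes for _ in range(n_nodes)]
    b = [0.0] * n_nodes
    # two-point Gauss rule on the reference element [0, 1]
    g = 0.5 / 3 ** 0.5
    points = (0.5 - g, 0.5 + g)
    weights = (0.5, 0.5)
    for e in range(n_elements):
        dofs = (e, e + 1)            # global indices of the two local basis functions
        grads = (-1.0 / h, 1.0 / h)  # constant gradients of the local hat functions
        for a in range(2):
            for c in range(2):
                # both supports contain element e, so this entry gets a contribution
                A[dofs[a]][dofs[c]] += grads[a] * grads[c] * h
            for xi, w in zip(points, weights):
                x = (e + xi) * h              # quadrature point in physical space
                phi = (1.0 - xi, xi)[a]       # value of the local hat function
                b[dofs[a]] += w * h * f(x) * phi
    return A, b


A, b = assemble_p1(4, lambda x: 1.0)
```

Entries `A[i][j]` with `|i - j| > 1` stay at zero because the supports of the corresponding hat functions do not intersect, which is the sparsity argument made on the slide.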
-
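ESSENTIAL BOUNDARY CONDITIONS
$\mathcal{G}_D$ subset of unknowns associated to Dirichlet BC ($\overline{\mathcal{G}}_D$ denotes the remaining unknowns)
◦ nonsymmetric elimination
\[ Ax = \begin{bmatrix} A_{\overline{\mathcal{G}}_D \overline{\mathcal{G}}_D} & A_{\overline{\mathcal{G}}_D \mathcal{G}_D} \\ 0 & I_{\mathcal{G}_D \mathcal{G}_D} \end{bmatrix} \begin{bmatrix} x_{\overline{\mathcal{G}}_D} \\ x_{\mathcal{G}_D} \end{bmatrix} = \begin{bmatrix} b_{\overline{\mathcal{G}}_D} \\ g_D \end{bmatrix} \]
◦ symmetric elimination
\[ Ax = \begin{bmatrix} A_{\overline{\mathcal{G}}_D \overline{\mathcal{G}}_D} & 0 \\ 0 & I_{\mathcal{G}_D \mathcal{G}_D} \end{bmatrix} \begin{bmatrix} x_{\overline{\mathcal{G}}_D} \\ x_{\mathcal{G}_D} \end{bmatrix} = \begin{bmatrix} b_{\overline{\mathcal{G}}_D} - A_{\overline{\mathcal{G}}_D \mathcal{G}_D} g_D \\ g_D \end{bmatrix} \]
◦ penalization
\[ Ax = \begin{bmatrix} A_{\overline{\mathcal{G}}_D \overline{\mathcal{G}}_D} & A_{\overline{\mathcal{G}}_D \mathcal{G}_D} \\ A_{\mathcal{G}_D \overline{\mathcal{G}}_D} & A_{\mathcal{G}_D \mathcal{G}_D} + 10^{30} I_{\mathcal{G}_D \mathcal{G}_D} \end{bmatrix} \begin{bmatrix} x_{\overline{\mathcal{G}}_D} \\ x_{\mathcal{G}_D} \end{bmatrix} = \begin{bmatrix} b_{\overline{\mathcal{G}}_D} \\ 10^{30} g_D \end{bmatrix} \]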
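The three treatments can be compared on a tiny system. The sketch below is an assumption-laden illustration, not FreeFEM code: it uses a 1D Laplacian on five nodes with zero load, Dirichlet values 1 and 3 at the two ends, and a small dense Gaussian elimination. All function names are hypothetical.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting, working on copies of A and b."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def nonsymmetric_elimination(A, b, dirichlet):
    A, b = [row[:] for row in A], b[:]
    for i, g in dirichlet.items():
        A[i] = [0.0] * len(b)   # Dirichlet row replaced by a row of the identity
        A[i][i] = 1.0
        b[i] = g
    return A, b


def symmetric_elimination(A, b, dirichlet):
    A, b = [row[:] for row in A], b[:]
    n = len(b)
    for i, g in dirichlet.items():
        for r in range(n):
            if r not in dirichlet:
                b[r] -= A[r][i] * g   # known column moved to the right-hand side
            A[r][i] = 0.0
        A[i] = [0.0] * n
        A[i][i] = 1.0
        b[i] = g
    return A, b


def penalization(A, b, dirichlet, tgv=1e30):
    A, b = [row[:] for row in A], b[:]
    for i, g in dirichlet.items():
        A[i][i] += tgv        # very large value on the diagonal
        b[i] = tgv * g
    return A, b


n, h = 5, 0.25
A = [[0.0] * n for _ in range(n)]
for e in range(n - 1):        # element contributions of the 1D Laplacian
    A[e][e] += 1 / h
    A[e + 1][e + 1] += 1 / h
    A[e][e + 1] -= 1 / h
    A[e + 1][e] -= 1 / h
b = [0.0] * n
dirichlet = {0: 1.0, n - 1: 3.0}
methods = (nonsymmetric_elimination, symmetric_elimination, penalization)
solutions = [solve(*method(A, b, dirichlet)) for method in methods]
```

In this setting all three approaches return the same linear profile between the two boundary values. Only the symmetric elimination modifies the right-hand side of the free unknowns, which is the point to be careful about when assembling matrix and right-hand side separately.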
-
FREEFEM
[LINK TO THE EXAMPLES]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3
-
HISTORY
1987 MacFem/PCFem by Pironneau
1992 FreeFem by Pironneau, Bernardi, Hecht, and Prud’homme
1996 FreeFem+ by Pironneau, Bernardi, and Hecht
1998 FreeFem++ by Hecht, Pironneau, and Ohtsuka
2008 version 3
2014 version 3.34 with distributed-memory parallelism
2018 moving to GitHub
2019 version 4, rebranded as FreeFEM
-
INSTALLATION
◦ packages available for most OSes
  • Windows
  • macOS
  • Debian
◦ compilation from the sources for more flexibility
  • custom PETSc/SLEPc installation
  • develop branch
◦ on the cloud
  • https://www.rescale.com
  • http://qarnot.com
=⇒ https://community.freefem.org
-
STANDARD COMPILATION PROCESS I
◦ remove any previous instance of FreeFEM
◦ install gcc/clang and gfortran
◦ make sure you have a working MPI implementation
-
STANDARD COMPILATION PROCESS II
> git clone https://github.com/FreeFem/FreeFem-sources
> cd FreeFem-sources
> git checkout develop
> autoreconf -i
> ./configure --enable-download --disable-iohdf5 --with-hdf5=no --prefix=${PWD}
> cd 3rdparty/ff-petsc
> make petsc-slepc
> cd -
> ./reconfigure
> make
-
BINARIES
◦ FreeFem++
◦ FreeFem++-mpi
◦ ffglut
◦ ffmedit
◦ ff-c++
Basic parameters
◦ -ns: script not printed
◦ -nw/-wg: graphical output deactivated
◦ -v 0: level of verbosity
-
mesh AND mesh3 [EXAMPLE2.EDP]
Structured meshes
◦ in 2D, square
◦ in 3D, cube
Unstructured meshes
◦ buildmesh
◦ interface with Gmsh, TetGen [Si 2013], and MMG
Online visualization
◦ mostly for debugging purposes
◦ prefer medit over plot in 3D
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example2.edp
-
fespace [EXAMPLE3.EDP]
◦ formal relationship between a mesh and a FE
◦ one accessible member .ndof
◦ may be used to define finite element functions
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example3.edp
-
varf AND on [EXAMPLE4.EDP]
◦ bilinear forms
◦ linear forms
◦ boundary conditions
◦ dynamic definition
◦ instantiated to assemble matrices or vectors
◦ qforder to change integration rules
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example4.edp
-
matrix AND set [EXAMPLE5.EDP]
◦ assemble a varf using a pair of fespaces
◦ variable sym to assemble the upper triangular part
◦ matrix–vector products
◦ specify a solver for linear systems with set
Essential boundary conditions
◦ tgv = -1 for nonsymmetric elimination
◦ tgv = -2 for symmetric elimination (careful about the RHS)
◦ tgv = 1e+30 for penalization
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example5.edp
-
real[int] AND OTHER ARRAYS [EXAMPLE6.EDP]
◦ real[int,int]
◦ real[int][int]
◦ complex[int]
◦ matrix[int], string[int]
◦ formal array of finite element functions
Common members and methods
◦ .n and .m
◦ .resize
◦ =, +=, /=
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example6.edp
-
ADDITIONAL PLUGINS [EXAMPLE7.EDP]
◦ keyword load
◦ load "something" =⇒ ff-c++ -auto something.cpp
◦ FreeFEM objects manipulated in C++/Fortran
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example7.edp
-
func [EXAMPLE7.EDP]
◦ user-defined functions
◦ may be passed to external codes
◦ useful for matrix-free computations
◦ LinearCG, EigenValue
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example7.edp
-
macro [EXAMPLE7.EDP]
◦ evaluated when parsing input files
◦ defined on the command line -DmacroName=value
◦ conditional statements using IFMACRO
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example7.edp
-
SUBMESH [EXAMPLE8.EDP]
◦ trunc optional parameter new2old
◦ + restrict to go from one fespace to another
◦ useful to avoid (costly) interpolations
◦ meshes in multiphysics, e.g., solid + fluid domains
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_3/example8.edp
-
SHARED-MEMORY PARALLELISM
[LINK TO THE EXAMPLES]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_4
-
MOTIVATION
[Figure: bar chart of time (s), axis from 0 to 80, for mesh generation, right-hand side assembly, matrix assembly, linear solve, and solution exporting]
Parallelism is key for performance
-
IMPACT ON THE FINITE ELEMENT METHOD [EXAMPLE1.EDP]
Shared-memory parallelism (pthreads, OpenMP)
◦ global address space
◦ minimize critical sections
◦ mostly suited to small-scale architectures
[Figure: mesh triangles grouped by color]
Color #1: 10 triangles
Color #2: 11 triangles
Color #3: 8 triangles
Color #4: 10 triangles
Color #5: 9 triangles
Color #6: 8 triangles
Color #7: 4 triangles
Color #8: 6 triangles
Color #9: 1 triangle
Color #10: 1 triangle
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_4/example1.edp
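One common way to obtain such groups is a greedy coloring in which two triangles sharing a vertex never get the same color, so that all triangles of one color can be assembled concurrently without a critical section. The sketch below is an assumption about how the colors on the slide could be produced, not the method used in the linked example. The helper names and the structured test mesh are hypothetical.

```python
def color_elements(elements):
    """Greedy coloring: elements sharing a vertex never receive the same color."""
    vertex_to_elements = {}
    for e, vertices in enumerate(elements):
        for v in vertices:
            vertex_to_elements.setdefault(v, []).append(e)
    colors = [-1] * len(elements)
    for e, vertices in enumerate(elements):
        # colors already taken by neighbors, i.e. elements touching one of our vertices
        used = {colors[n] for v in vertices
                for n in vertex_to_elements[v] if colors[n] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[e] = c
    return colors


def structured_triangles(n):
    """Triangulate the unit square split into n x n cells, two triangles per cell."""
    def idx(i, j):
        return i + j * (n + 1)
    triangles = []
    for j in range(n):
        for i in range(n):
            triangles.append((idx(i, j), idx(i + 1, j), idx(i + 1, j + 1)))
            triangles.append((idx(i, j), idx(i + 1, j + 1), idx(i, j + 1)))
    return triangles


elements = structured_triangles(4)
colors = color_elements(elements)
for c in range(max(colors) + 1):
    print(f"Color #{c + 1}: {colors.count(c)} triangles")
```

As on the slide, the last colors of a greedy pass tend to contain few elements, which limits the parallel efficiency of the final sweeps.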
-
OTHER KERNELS
◦ efficient assembly is hard (especially for low-order FE)
◦ linear algebra
◦ exact factorizations ($LU$ or $LDL^H$)
-
DIRECT SOLVERS AND BLAS
◦ three options
  • MKL PARDISO, Intel [EXAMPLE2.EDP]
  • Dissection [Suzuki and Roux 2014]
  • MUMPS_seq
◦ MKL for dense linear algebra [EXAMPLE3.EDP]
◦ be careful with OMP_NUM_THREADS
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_4/example2.edp
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_4/example3.edp
-
DISTRIBUTED-MEMORY PARALLELISM
[LINK TO THE EXAMPLES]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_5
-
IMPACT ON THE FINITE ELEMENT METHOD [EXAMPLE1.EDP]
Distributed-memory parallelism (MPI)
◦ local address space
◦ distribute data efficiently to minimize communication
◦ orthogonal to shared-memory parallelism
[Figure: mesh partitioned into four subdomains]
Subdomain #1: 17 triangles
Subdomain #2: 17 triangles
Subdomain #3: 17 triangles
Subdomain #4: 17 triangles
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_5/example1.edp
-
MESSAGE PASSING INTERFACE [EXAMPLE2.EDP]
FreeFem++-mpi
◦ just like FreeFem++, but with MPI
◦ a friendly approach to message passing (like mpi4py)
◦ some new types like mpiComm or mpiRequest
◦ no problem with mesh, matrix, arrays
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_5/example2.edp
https://mpi4py.readthedocs.io/en/stable/
-
ASSEMBLY AND DIRECT SOLVERS
◦ naive approach to distributed computing
◦ assuming some global variables may be replicated
Legacy interfaces ⇐ use at your OWN RISK!
◦ MUMPS [Amestoy et al. 2001]
◦ PaStiX [Hénon et al. 2002]
◦ SuperLU_DIST [Li 2005]
-
PARAVIEW
◦ more powerful postprocessing tool
◦ handles distributed solutions
◦ generates movies if needed
◦ savevtk("sol.vtu",Th,sol)
-
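EXAMPLE 3: HEAT EQUATION [EXAMPLE3.EDP]
Aim
Solve the transient PDE
\[ \frac{\partial u}{\partial t} - \Delta u = 1 \text{ in } \Omega \times [0; T] \]
\[ u(x, y, z, 0) = 1 \text{ in } \Omega, \qquad u(x, y, z, t) = 1 \text{ on } \Gamma \times [0; T] \]
Implicit Euler scheme
\[ \int_\Omega \frac{u^{n+1} - u^n}{dt} w + \nabla u^{n+1} \cdot \nabla w = \int_\Omega w \]
=⇒
\[ (M + dt \cdot A) u^{n+1} = M u^n + dt \cdot b \]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_5/example3.edp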
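The time loop behind the last formula can be sketched on a 1D analogue. Everything specific here is an assumption made for illustration: the interval (0, 1) instead of a 3D domain, P1 elements on a uniform mesh, and the change of variable w = u − 1 so that the Dirichlet condition becomes homogeneous. `thomas` and `heat_1d` are hypothetical names.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def heat_1d(n_elements=20, dt=0.01, n_steps=500):
    """Implicit Euler for u_t - u_xx = 1 on (0, 1), with u = 1 initially and on the boundary."""
    h = 1.0 / n_elements
    n = n_elements - 1                       # interior nodes only
    m_diag, m_off = 2.0 * h / 3.0, h / 6.0   # P1 mass matrix M
    a_diag, a_off = 2.0 / h, -1.0 / h        # P1 stiffness matrix A
    b = [h] * n                              # load vector for a unit source
    w = [0.0] * n                            # w = u - 1
    lower = [m_off + dt * a_off] * n         # M + dt * A, assembled once
    diag = [m_diag + dt * a_diag] * n
    upper = [m_off + dt * a_off] * n
    for _ in range(n_steps):
        rhs = []
        for i in range(n):
            mw = m_diag * w[i]               # row i of M * w
            if i > 0:
                mw += m_off * w[i - 1]
            if i < n - 1:
                mw += m_off * w[i + 1]
            rhs.append(mw + dt * b[i])
        w = thomas(lower, diag, upper, rhs)  # (M + dt A) w_new = M w_old + dt b
    return [1.0 + wi for wi in w]


u = heat_1d()
```

The operator M + dt · A does not change between time steps, so in a real computation its factorization (or preconditioner) would be built once and reused, which is what makes the implicit scheme affordable.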
-
OVERLAPPING SCHWARZ METHODS
[LINK TO THE EXAMPLES]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_6
-
HISTORY
◦ initially focused on domain decomposition methods
◦ new developments around iterative methods [Jolivet and Tournier 2016]
◦ only interfaced with FreeFEM at first
◦ low-level languages (C/C++, Fortran, Python)
◦ first commit in late 2011
◦ open-sourced in late 2014
◦ https://github.com/hpddm/hpddm
◦ integrated in PETSc in 2019
-
HISTORICAL METHOD
◦ due to Schwarz (1870)
◦ how to solve the Poisson equation on “complex” geometries?
[Figure: domain Ω split into two subdomains Ω1 and Ω2]
◦ by using solvers that work on simpler subdomains
-
DISCRETIZED MODEL PROBLEM [EXAMPLE1.EDP]
[Figure: restriction operators $R_1$, $R_2$ and their transposes $R_1^T$, $R_2^T$]
◦ no need for the complete mesh
◦ neighboring numberings on the overlaps
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_6/example1.edp
-
UNPRECONDITIONED ITERATIVE METHODS
◦ partition of unity $I = \sum_{i=1}^{N} R_i^T D_i R_i$
◦ scalar product $(u, v) = \sum_{i=1}^{N} (R_i u, D_i R_i v)$
◦ matrix–vector product $R_i A u = R_i \sum_{j=1}^{N} R_j^T R_j A R_j^T D_j R_j u$
=⇒ subdomains only require “local data” [EXAMPLE2.EDP]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_6/example2.edp
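The first two identities can be checked on a small example. The sketch below assumes a vector of ten unknowns split into two overlapping index sets, and takes for $D_i$ the inverse of the number of subdomains sharing each unknown, which is one possible choice among others. The function names are hypothetical, and the final `sum` stands in for what would be a global reduction across processes.

```python
def restrict(u, dofs):
    """Apply R_i: keep only the entries of u listed in dofs."""
    return [u[g] for g in dofs]


def partition_of_unity(subdomains, n):
    """Diagonal of each D_i: one over the number of subdomains sharing the unknown."""
    multiplicity = [0] * n
    for dofs in subdomains:
        for g in dofs:
            multiplicity[g] += 1
    return [[1.0 / multiplicity[g] for g in dofs] for dofs in subdomains]


def distributed_dot(u_local, v_local, D):
    """Sum over subdomains of (R_i u, D_i R_i v), using local data only."""
    partial = [sum(d * a * b for d, a, b in zip(Di, ui, vi))
               for Di, ui, vi in zip(D, u_local, v_local)]
    return sum(partial)  # would be a reduction over all processes


def glue(u_local, subdomains, D, n):
    """Apply sum_i R_i^T D_i to local vectors, returning a global vector."""
    u = [0.0] * n
    for dofs, Di, ui in zip(subdomains, D, u_local):
        for g, d, a in zip(dofs, Di, ui):
            u[g] += d * a
    return u


n = 10
subdomains = [list(range(0, 7)), list(range(4, 10))]  # unknowns 4, 5, 6 are shared
D = partition_of_unity(subdomains, n)
u = [float(i + 1) for i in range(n)]
v = [float(2 * i - 3) for i in range(n)]
u_local = [restrict(u, dofs) for dofs in subdomains]
v_local = [restrict(v, dofs) for dofs in subdomains]
dot = distributed_dot(u_local, v_local, D)
```

Without the weights $D_i$, the unknowns in the overlap would be counted twice in the scalar product.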
-
OVERLAPPING PRECONDITIONERS [EXAMPLE3.EDP]
◦ additive Schwarz
\[ M^{-1}_{\mathrm{ASM}} = \sum_{i=1}^{N} R_i^T \left( R_i A R_i^T \right)^{-1} R_i \]
◦ restricted additive Schwarz [Cai et al. 2003]
\[ M^{-1}_{\mathrm{RAS}} = \sum_{i=1}^{N} R_i^T D_i \left( R_i A R_i^T \right)^{-1} R_i \]
◦ optimized restricted additive Schwarz [St-Cyr et al. 2007]
\[ M^{-1}_{\mathrm{ORAS}} = \sum_{i=1}^{N} R_i^T D_i B_i^{-1} R_i \]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_6/example3.edp
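The restricted additive Schwarz formula can be exercised as a plain fixed-point iteration, $u \leftarrow u + M^{-1}_{\mathrm{RAS}}(b - Au)$. The sketch below is an illustration under several assumptions that are not in the slides: a finite-difference 1D Poisson problem with homogeneous Dirichlet conditions, contiguous subdomains extended by a fixed number of unknowns, and Boolean weights $D_i$ (each unknown is kept by the single subdomain that owns it). In practice the preconditioner would be used inside a Krylov method instead. All names are hypothetical.

```python
import math


def solve_laplacian(rhs, h):
    """Solve tridiag(-1, 2, -1) / h^2 x = rhs with the Thomas algorithm."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = -0.5
    d[0] = rhs[0] * h * h / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] * h * h + d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def residual(u, b, h):
    """Return b - A u for the 1D finite-difference Laplacian."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(b[i] - (2.0 * u[i] - left - right) / (h * h))
    return r


def ras(n, nsub, overlap, tol=1e-8, max_it=10000):
    """Stationary RAS iteration for -u'' = 1 on (0, 1); returns (iterations, u)."""
    h = 1.0 / (n + 1)
    b = [1.0] * n
    u = [0.0] * n
    bounds = [k * n // nsub for k in range(nsub + 1)]  # non-overlapping ownership
    norm_b = math.sqrt(sum(x * x for x in b))
    for it in range(max_it):
        r = residual(u, b, h)
        if math.sqrt(sum(x * x for x in r)) <= tol * norm_b:
            return it, u
        for k in range(nsub):
            lo = max(bounds[k] - overlap, 0)        # R_i: overlapping index range
            hi = min(bounds[k + 1] + overlap, n)
            local = solve_laplacian(r[lo:hi], h)    # (R_i A R_i^T)^{-1} R_i r
            for g in range(bounds[k], bounds[k + 1]):
                u[g] += local[g - lo]               # R_i^T D_i: owned unknowns only
    return max_it, u


it2, u2 = ras(63, 2, 2)
it8, u8 = ras(63, 8, 2)
print(it2, it8)
```

With the same overlap, the run with eight subdomains needs more iterations than the run with two: information travels only from one subdomain to its neighbors at each iteration.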
-
SOME UTILITY ROUTINES
◦ exchange for consistency in the overlap
◦ statistics to get some metrics about the DD
◦ A(u,v) to compute weighted dot products
◦ ChangeOperator to update a local matrix
-
COARSE GRID PRECONDITIONERS
◦ convergence independently of the decomposition
[Figure: successive plots of u(x) against x for $-\frac{d^2 u}{dx^2} = 1$, $u(0) = 0$, starting from a zero initial guess. Value annotated on each frame: 0.0 (initial guess), 0.8 (iteration #1), 1.1 (#2), 1.6 (#3), 1.9 (#4), 2.4 (#5), 16.6 (#50), 27.8 (#100), 35.3 (#150), 40.2 (#200), 43.5 (#250), 45.7 (#300), 47.1 (#350), 50.0 (exact solution)]
◦ AttachCoarseOperator not always trivial to define
-
EXAMPLE 4: GENEO COARSE OPERATOR [EXAMPLE4.EDP]
Aim
◦ robust DDM [Spillane et al. 2013]
◦ ∼ incompressible elasticity [Haferssas et al. 2017]
◦ highly parametrizable [Jolivet, Hecht, Nataf, et al. 2013]
◦ scalable [Al Daas et al. 2019]
◦ -hpddm_schwarz_method [asm|ras|osm]
◦ -hpddm_geneo_nu n
◦ -hpddm_level_2_p p
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_6/example4.edp
-
SUBSTRUCTURING METHODS
[LINK TO THE EXAMPLES]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_7
-
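ALGEBRAIC DECOMPOSITION
\[ A = \begin{bmatrix} A_{11} & 0 & A_{1\Gamma} \\ 0 & A_{22} & A_{2\Gamma} \\ A_{\Gamma 1} & A_{\Gamma 2} & A_{\Gamma\Gamma} \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ b_\Gamma \end{bmatrix} \]
=⇒
\[ \left( A_{\Gamma\Gamma} - A_{\Gamma 1} A_{11}^{-1} A_{1\Gamma} - A_{\Gamma 2} A_{22}^{-1} A_{2\Gamma} \right) x_\Gamma = b_\Gamma - A_{\Gamma 1} A_{11}^{-1} b_1 - A_{\Gamma 2} A_{22}^{-1} b_2 = g_\Gamma = g_\Gamma^{(1)} + g_\Gamma^{(2)} \]
Schur complement(s)
\[ S_p = A_{\Gamma\Gamma}^{(1)} - A_{\Gamma 1} A_{11}^{-1} A_{1\Gamma} + A_{\Gamma\Gamma}^{(2)} - A_{\Gamma 2} A_{22}^{-1} A_{2\Gamma} = S_p^{(1)} + S_p^{(2)} \]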
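The condensation onto the interface can be verified on a small matrix. The sketch below assumes a 5 × 5 tridiagonal matrix (a 1D Laplacian) in which unknowns {0, 1} and {3, 4} play the role of the two subdomain interiors and unknown {2} is the interface Γ. It uses exact rational arithmetic so the comparison with a direct solve is exact. The function names are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Gaussian elimination without pivoting (enough for the SPD blocks used here)."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def sub(A, rows, cols):
    """Extract the block of A with the given row and column indices."""
    return [[A[r][c] for c in cols] for r in rows]


def condense(A, b, interiors, gamma):
    """Return the Schur complement S_p and condensed right-hand side g on gamma."""
    S = sub(A, gamma, gamma)
    g = [b[i] for i in gamma]
    for I in interiors:
        A_II, A_IG, A_GI = sub(A, I, I), sub(A, I, gamma), sub(A, gamma, I)
        y = solve(A_II, [b[i] for i in I])             # A_ii^{-1} b_i
        for r in range(len(gamma)):
            g[r] -= sum(A_GI[r][k] * y[k] for k in range(len(I)))
        for c in range(len(gamma)):
            z = solve(A_II, [row[c] for row in A_IG])  # column c of A_ii^{-1} A_iΓ
            for r in range(len(gamma)):
                S[r][c] -= sum(A_GI[r][k] * z[k] for k in range(len(I)))
    return S, g


n = 5
A = [[Fraction(2 if i == j else -1 if abs(i - j) == 1 else 0) for j in range(n)]
     for i in range(n)]
b = [Fraction(1)] * n
S, g = condense(A, b, interiors=[[0, 1], [3, 4]], gamma=[2])
x_gamma = solve(S, g)
```

Each pass of the loop over `interiors` only touches the blocks of one subdomain, which is why the contributions $S_p^{(i)}$ and $g_\Gamma^{(i)}$ can be computed independently.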
-
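CONDENSED SYSTEM
Preconditioning $S_p x_\Gamma = g_\Gamma$
\[ S_p^{(1)} = S_p^{(2)} \implies M^{-1} = \frac{1}{4} \left( {S_p^{(1)}}^{-1} + {S_p^{(2)}}^{-1} \right) \]
\[ A^{(i)} = \begin{bmatrix} A_{ii} & A_{i\Gamma} \\ A_{\Gamma i} & A_{\Gamma\Gamma}^{(i)} \end{bmatrix} = \begin{bmatrix} I & 0 \\ A_{\Gamma i} A_{ii}^{-1} & I \end{bmatrix} \begin{bmatrix} A_{ii} & 0 \\ 0 & S_p^{(i)} \end{bmatrix} \begin{bmatrix} I & A_{ii}^{-1} A_{i\Gamma} \\ 0 & I \end{bmatrix} \]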
-
GENERAL NOTATIONS
[Figure, after Gosselet and Rey 2006: three subdomains Ω1, Ω2, Ω3 shown in successive views, with local node numbers]
◦ subdomain tearing
◦ local numbering, with $A^{(i)} = \begin{bmatrix} A_{ii} & A_{i\Gamma} \\ A_{\Gamma i} & A_{\Gamma\Gamma}^{(i)} \end{bmatrix}$
◦ elimination of interior d.o.f., with $S_p^{(i)} = A_{\Gamma\Gamma}^{(i)} - A_{\Gamma i} A_{ii}^{-1} A_{i\Gamma}$
◦ jump operators $\{B^{(i)}\}_{i=1}^{3}$, primal constraints [Mandel 1993]
◦ jump operators $\{B^{(i)}\}_{i=1}^{3}$, dual constraints [Farhat and Roux 1991]
-
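CONDENSED SYSTEM
\[ \forall i \in \llbracket 1; N \rrbracket, \quad S_p^{(i)} x_\Gamma^{(i)} = g_\Gamma^{(i)} + \lambda_\Gamma^{(i)} \]
\[ {R_\Gamma^{(i)}}^T \lambda_\Gamma^{(i)} = 0 \]
\[ \sum_{i=1}^{N} B^{(i)} x_\Gamma^{(i)} = 0 \]
\[ \sum_{i=1}^{N} B^{(i)} \lambda_\Gamma^{(i)} = 0 \]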
-
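PRIMAL METHODS
Unique displacement/eliminated reactions
◦ unknown $x_\Gamma \implies x_\Gamma^{(i)} = {B^{(i)}}^T x_\Gamma$
◦ system of equations
\[ \sum_{i=1}^{N} B^{(i)} S_p^{(i)} {B^{(i)}}^T x_\Gamma = \sum_{i=1}^{N} B^{(i)} g_\Gamma^{(i)} \]
◦ preconditioner
\[ M^{-1} = \sum_{i=1}^{N} B^{(i)} D_p^{(i)} {S_p^{(i)}}^{\dagger} D_p^{(i)} {B^{(i)}}^T \]
applied to vectors in $\mathrm{Im}(S_p)$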
-
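BALANCING DOMAIN DECOMPOSITION [EXAMPLE2.EDP]
Balanced residual
\[ \sum_{i=1}^{N} {R_\Gamma^{(i)}}^T D_p^{(i)} {B^{(i)}}^T r_\Gamma = 0 \]
Projection
\[ R_\Gamma = \begin{bmatrix} B^{(1)} D_p^{(1)} R_\Gamma^{(1)} & \cdots & B^{(N)} D_p^{(N)} R_\Gamma^{(N)} \end{bmatrix}, \qquad S_p = \sum_{i=1}^{N} B^{(i)} S_p^{(i)} {B^{(i)}}^T \]
=⇒
\[ R_\Gamma^T S_p P = 0 \text{ with } P = I - R_\Gamma \left( R_\Gamma^T S_p R_\Gamma \right)^{-1} R_\Gamma^T S_p \]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_7/example2.edp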
-
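DUAL METHODS [EXAMPLE3.EDP]
Unique reaction/eliminated displacements
◦ unknown $\lambda_\Gamma \implies \lambda_\Gamma^{(i)} = {B^{(i)}}^T \lambda_\Gamma$
◦ dual Schur complements $S_d^{(i)} = {S_p^{(i)}}^{\dagger}$
◦ system of equations
\[ \forall i \in \llbracket 1; N \rrbracket, \quad x_\Gamma^{(i)} = S_d^{(i)} \left( g_\Gamma^{(i)} + {B^{(i)}}^T \lambda_\Gamma \right) + R_\Gamma^{(i)} \alpha^{(i)} \]
\[ 0 = {R_\Gamma^{(i)}}^T \left( g_\Gamma^{(i)} + {B^{(i)}}^T \lambda_\Gamma \right) \]
◦ saddle-point formulation
\[ \begin{bmatrix} S_d & R_\Gamma \\ R_\Gamma^T & 0 \end{bmatrix} \begin{bmatrix} \lambda_\Gamma \\ \alpha \end{bmatrix} = \begin{bmatrix} -b_d \\ -g_\Gamma \end{bmatrix} \]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_7/example3.edp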
-
SIMILAR UTILITY ROUTINES
◦ exchange
◦ statistics
◦ renumber
-
PETSC
[LINK TO THE EXAMPLES]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8
-
INTRODUCTION
◦ Portable, Extensible Toolkit for Scientific Computation
◦ suite of data structures
  • Vec and Mat
  • PC and KSP
  • SNES, TS, and Tao
◦ useful for compiling other libraries (MUMPS, hypre)
◦ https://www.mcs.anl.gov/petsc
◦ https://gitlab.com/petsc/petsc
◦ interfaced with HPDDM
-
DATA DISTRIBUTION
Operators follow a 1D row-wise contiguous distribution
[Figure: rows of the operator split into contiguous blocks owned by process #0, process #1, process #2, and process #3]
⟹ accessible via GlobalNumbering
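A 1D row-wise contiguous distribution can be sketched in a few lines of plain Python. This is only an illustration: it assumes an even split of the rows, whereas in practice the block sizes follow from the mesh partitioning. The helper names ownership_ranges and owner are hypothetical.

```python
def ownership_ranges(n_rows, n_procs):
    """Split n_rows rows into n_procs contiguous blocks, as evenly as possible."""
    ranges = []
    start = 0
    for rank in range(n_procs):
        # the first (n_rows % n_procs) processes own one extra row
        local = n_rows // n_procs + (1 if rank < n_rows % n_procs else 0)
        ranges.append((start, start + local))
        start += local
    return ranges

def owner(row, ranges):
    """Return the rank of the process owning a given global row."""
    for rank, (lo, hi) in enumerate(ranges):
        if lo <= row < hi:
            return rank
    raise IndexError(row)

ranges = ownership_ranges(10, 4)
print(ranges)            # [(0, 3), (3, 6), (6, 8), (8, 10)]
print(owner(6, ranges))  # 2
```

Each process only stores the rows in its own half-open range, which is why a global numbering is needed to relate local unknowns to rows of the distributed operator.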
-
PETSC MATRIX [EXAMPLE1.EDP]
Setting up a Mat
◦ same input parameters as HPDDM types
◦ simplified macro createMat(Th,A,Pk)
◦ Mat<complex> for complex-valued problems
Switching between numberings
◦ ChangeNumbering(Mat,K[int])
◦ optional parameters
  • inverse to go from PETSc to FreeFEM
  • exchange to update ghost values
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example1.edp
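The idea behind switching numberings can be sketched as follows. This is a conceptual illustration in plain Python, not the FreeFEM API: it assumes that the FreeFEM local vector holds owned and ghost (duplicated) unknowns, while the PETSc vector only holds owned ones. The flags in owned and both helper names are hypothetical.

```python
def to_petsc(local_values, owned):
    """Keep only the entries this process owns (FreeFEM -> PETSc numbering)."""
    return [v for v, o in zip(local_values, owned) if o]

def to_freefem(petsc_values, owned):
    """Inverse map: owned entries are restored, ghost entries are left at zero."""
    it = iter(petsc_values)
    return [next(it) if o else 0.0 for o in owned]

owned = [True, True, False, True, False]   # hypothetical ownership flags
u_ff = [1.0, 2.0, 3.0, 4.0, 5.0]           # local FreeFEM vector, ghosts included
u_petsc = to_petsc(u_ff, owned)
back = to_freefem(u_petsc, owned)
print(u_petsc)  # [1.0, 2.0, 4.0]
print(back)     # [1.0, 2.0, 0.0, 4.0, 0.0]
```

After the inverse map, ghost entries are not filled in this sketch; in a parallel run they would have to be received from the owning neighbors, which is the role of an exchange step.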
-
Mat LINEAR OPERATIONS [EXAMPLE2.EDP]
Just as with FreeFEM matrix
◦ matrix–vector product A * x
◦ matrix transpose–vector product A' * x
◦ linear solve A^-1 * x
◦ transposed linear solve A'^-1 * x
◦ native operations with vectors in PETSc numbering
  • KSPSolve
  • MatMatMult
  • more at Mat manual pages
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example2.edp
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/index.html
-
BASIC PRECONDITIONING [EXAMPLE3.EDP]
◦ default to BJacobi with ILU(0) as a subdomain solver
◦ attach a KSP using set and sparams
◦ -ksp_type, -pc_type
◦ -help generated dynamically
Updating a Mat
◦ Mat = matrix if same pattern or first update
◦ Mat = Mat if different pattern
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example3.edp
-
DIRECT FACTORIZATIONS
◦ -pc_type [lu|cholesky]
◦ -pc_factor_mat_solver_type [mumps|superlu]
◦ -help for fine-tuning a solver
◦ e.g., -mat_mumps_icntl_4 2
-
SCHWARZ METHODS
◦ -pc_type [asm|gasm]
◦ -pc_asm_overlap n
◦ -pc_asm_type [basic|restrict|interpolate|none]
◦ -sub_pc_type, -sub_ksp_type
◦ more at PCASM manual page
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCASM.html
-
EXAMPLE 4: LIOUVILLE–BRATU–GELFAND EQUATION
Aim
Solve the nonlinear PDE
\[ \Delta u + \lambda \exp u = 0 \text{ in } \Omega, \qquad u(x, y) = \cos \pi x \cos \pi y \text{ on } \Gamma \]
Newton method
$u^{n+1} = u^n + w$ such that $F(u^n) + d_w F(u^n) = 0$ with $d_w F(u^n) = \Delta w + \lambda \exp(u^n) \, w$
[EXAMPLE4_SEQ.EDP] + [EXAMPLE4.EDP]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example4_seq.edp
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example4.edp
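The Newton iteration can be sketched on a simplified problem. The code below is a plain Python illustration, not the content of example4.edp: it assumes a 1D analogue $u'' + \lambda \exp u = 0$ on $(0, 1)$ with homogeneous Dirichlet conditions, a centered finite difference Laplacian, $\lambda = 1$, and a tridiagonal (Thomas) solve for each correction $w$.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def residual(u, lam, h):
    """F(u)_i = (u_{i-1} - 2 u_i + u_{i+1}) / h^2 + lam * exp(u_i), u = 0 at both ends."""
    n = len(u)
    F = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        F.append((left - 2.0 * u[i] + right) / h**2 + lam * math.exp(u[i]))
    return F

def newton_bratu(n=99, lam=1.0, tol=1e-10, max_it=20):
    h = 1.0 / (n + 1)
    u = [0.0] * n
    for it in range(max_it):
        F = residual(u, lam, h)
        norm = max(abs(f) for f in F)
        if norm < tol:
            return u, it, norm
        # Jacobian of F at u: discrete Laplacian + lam * exp(u) on the diagonal
        off = [1.0 / h**2] * n
        diag = [-2.0 / h**2 + lam * math.exp(ui) for ui in u]
        w = thomas(off, diag, off, [-f for f in F])   # solve d_w F(u) = -F(u)
        u = [ui + wi for ui, wi in zip(u, w)]
    return u, max_it, max(abs(f) for f in residual(u, lam, h))

u, its, res = newton_bratu()
print(its, res, max(u))
```

In the distributed setting the structure is the same: only the residual assembly and the linear solve change.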
-
BLOCK MATRICES
Limitations of monolithic formulations
◦ single varf
◦ single fespace
Alternative
◦ “decoupled” fespace and multiple varf
◦ PETSc MatNest to match FreeFEM block syntax
Solving linear systems with MatNest
◦ no explicit representation ⇒ no LU, BJacobi
◦ automatic conversion or MatConvert
-
EXAMPLE 5: POISSON EQ. WITH NEUMANN BC [EXAMPLE5.EDP]
Aim
◦ indefinite system with no essential BC
◦ additional constraint $\int_\Omega u = 0$
\[ \begin{bmatrix} A & c \\ c^T & 0 \end{bmatrix} \begin{bmatrix} u \\ \lambda \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix} \]
◦ written as Mat N = [[A,c],[c',0]];
◦ A is a Mat and c follows PETSc numbering
◦ use either ^-1 or MatConvert + KSPSolve
◦ λ stored only on the process with the lowest rank
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example5.edp
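The bordered system can be tried on a toy problem. The sketch below is plain Python, not the content of example5.edp: it assumes a 1D P1 stiffness matrix on four nodes with unit mesh size and pure Neumann conditions (so A is singular), takes c as the integrals of the basis functions, and uses a small dense Gaussian elimination in place of a PETSc solver.

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(rhs)
    a = [row[:] + [r] for row, r in zip(M, rhs)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n + 1):
                a[i][j] -= factor * a[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - s) / a[i][i]
    return x

# Singular Neumann stiffness matrix (constants are in its kernel)
A = [[1.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 1.0]]
c = [0.5, 1.0, 1.0, 0.5]      # c_i = integral of the i-th basis function
b = [1.0, 0.0, 0.0, -1.0]     # compatible right-hand side (entries sum to zero)

# Bordered matrix N = [[A, c], [c', 0]]
N = [row + [ci] for row, ci in zip(A, c)] + [c + [0.0]]
sol = solve(N, b + [0.0])
u, lam = sol[:4], sol[4]
mean = sum(ci * ui for ci, ui in zip(c, u))   # discrete integral of u
print(u, lam, mean)
```

The multiplier vanishes here because the right-hand side is compatible; the extra row only selects the zero-mean solution.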
-
RECTANGULAR MATRICES [EXAMPLE6.EDP]
◦ some FreeFEM matrices are not square
◦ matrix Loc = varf(Ph, Vh); // Loc.n = Vh.ndof
◦ coupling 2D and 3D problems
fespaces defined with the same partitioning
◦ one square Mat A for distributing Vh
◦ one square Mat B for distributing Ph
◦ one Mat C(A,B,Loc);
⟹ buildDmesh(Th) + createMat(A|B)
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example6.edp
-
EXAMPLE 7: STOKES EQUATION [EXAMPLE7.EDP]
Aim
Solve with a Poiseuille inflow the system
\[ -\Delta u + \nabla p = 0 \text{ in } \Omega, \qquad \nabla \cdot u = 0 \]
◦ two varfs
◦ two fespace distributions
◦ two assembled Mats A and B
◦ coupled system Mat N = [[A,B],[B',0]];
◦ transposed operators not formed explicitly
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example7.edp
-
MULTIGRID METHODS
◦ point-wise aggregation
  • hypre (--download-hypre) [Falgout and Yang 2002]
  • AMS [Hiptmair and Xu 2007]
◦ smoothed aggregation
  • GAMG [Adams et al. 2004]
  • ML (--download-ml) [Gee et al. 2006]
Systems of equations
◦ matrix block size
◦ MatNullSpace, e.g., rigid body modes
-
EXAMPLE 8: SYSTEM OF LINEAR ELASTICITY [EXAMPLE8.EDP]
Aim
Solve the system
\[ -\operatorname{div} \sigma = f \text{ in } \Omega, \qquad u = 0 \text{ on } \Gamma_D \]
Tools
◦ vectorial fespace
◦ block size of three
◦ MatNullSpace with the rigid body modes
  • translations (1, 0, 0) (0, 1, 0) (0, 0, 1)
  • rotations (y, −x, 0) (−z, 0, x) (0, z, −y)
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example8.edp
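The rigid body modes form the near-null space because they produce no strain. A quick check in plain Python, written for illustration only: each rotation is a linear field $u(x) = G x$, and its symmetric gradient $(G + G^T)/2$ should vanish. The translations are constant fields, for which $G = 0$.

```python
# Each rotation mode is a linear field u(x) = G x; G is stored row by row.
rotations = {
    "(y, -x, 0)": [[0, 1, 0], [-1, 0, 0], [0, 0, 0]],
    "(-z, 0, x)": [[0, 0, -1], [0, 0, 0], [1, 0, 0]],
    "(0, z, -y)": [[0, 0, 0], [0, 0, 1], [0, -1, 0]],
}

def strain(G):
    """Symmetric gradient eps(u) = (G + G^T) / 2 of the linear field u(x) = G x."""
    return [[(G[i][j] + G[j][i]) / 2 for j in range(3)] for i in range(3)]

zero = [[0.0] * 3 for _ in range(3)]
all_strain_free = all(strain(G) == zero for G in rotations.values())
print(all_strain_free)
```

Since the stress depends on the displacement only through the strain, these six fields are in the kernel of the elasticity operator when no Dirichlet condition is imposed.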
-
EXAMPLE 9: GEOMETRIC MULTIGRID [EXAMPLE9.EDP]
Aim
Solve the complex-valued system
\[ \operatorname{curl} \operatorname{curl} E - k^2 E = f \text{ in } \Omega, \qquad (\operatorname{curl} E) \times n - i k (E \times n) \times n = 0 \text{ on } \Gamma_R \]
Tools
◦ Nédélec edge elements
◦ Schwarz smoothers
◦ buildMatEdgeRecursive
◦ more at PCMG manual page
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example9.edp
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMG.html
-
FIELDSPLIT PRECONDITIONERS [EXAMPLE10.EDP]
◦ separate preconditioners for each field
◦ unknowns are interleaved
◦ underlying IS
◦ fespace Vh(Pk,Pq,Pt) ⇒ Vh [u,v,w] = [1,2,3]
◦ e.g., fespace Vh(Pk,Pk,Pq) ⇒ Vh [u,v,p] = [1,1,2]
◦ new option prefixes -fieldsplit_%d_
Optional parameters
◦ prefixes customizable with a string[int]
◦ approximate Schur complement
Examples for Stokes equations [Knepley 2013]
\[ A = \begin{bmatrix} C & B \\ B^T & 0 \end{bmatrix} \approx \mathrm{pc}_A \]
◦ $\mathrm{pc}_A = \begin{bmatrix} C^{-1} & 0 \\ 0 & I \end{bmatrix}$
  -ksp_type gmres -pc_fieldsplit_type additive
  -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi
◦ $\mathrm{pc}_A = \begin{bmatrix} \mathrm{ksp}_C & 0 \\ 0 & I \end{bmatrix}$
  -ksp_type fgmres -pc_fieldsplit_type additive
  -fieldsplit_velocity_pc_type gamg -fieldsplit_pressure_pc_type jacobi
◦ $\mathrm{pc}_A = \begin{bmatrix} \mathrm{ksp}_C & B \\ 0 & I \end{bmatrix}$
  -ksp_type fgmres -pc_fieldsplit_type multiplicative
  -fieldsplit_velocity_pc_type gamg -fieldsplit_pressure_pc_type jacobi
◦ $\mathrm{pc}_A = \begin{bmatrix} \mathrm{ksp}_C & 0 \\ 0 & -\mathrm{ksp}_S \end{bmatrix}$
  -ksp_type fgmres -pc_fieldsplit_type schur
  -fieldsplit_velocity_pc_type gamg -fieldsplit_pressure_pc_type jacobi
  -pc_fieldsplit_schur_factorization_type diag
◦ $\mathrm{pc}_A = \begin{bmatrix} \mathrm{ksp}_C & 0 \\ B^T & \mathrm{ksp}_S \end{bmatrix}$
  -ksp_type fgmres -pc_fieldsplit_type schur
  -fieldsplit_velocity_pc_type gamg -fieldsplit_pressure_pc_type jacobi
  -pc_fieldsplit_schur_factorization_type lower
◦ $\mathrm{pc}_A = \begin{bmatrix} C^{-1} & B \\ 0 & \mathrm{ksp}_S \end{bmatrix}$
  -ksp_type gmres -fieldsplit_pressure_ksp_max_its 1
  -pc_fieldsplit_type schur -fieldsplit_velocity_pc_type lu
  -fieldsplit_pressure_ksp_type richardson
  -pc_fieldsplit_schur_factorization_type upper
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example10.edp
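The Schur variants (diag, lower, upper) keep different factors of the block factorization $A = L D U$, with $S = -B^T C^{-1} B$. The sketch below verifies that factorization in plain Python with scalar stand-ins for the blocks; the values of C and B are hypothetical and chosen only so that the arithmetic is exact.

```python
C, B = 2.0, 1.0            # scalar stand-ins for the velocity block and the coupling block
S = -B * B / C             # Schur complement S = -B^T C^{-1} B

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[C, B], [B, 0.0]]
L = [[1.0, 0.0], [B / C, 1.0]]   # unit lower triangular factor
D = [[C, 0.0], [0.0, S]]         # block diagonal factor
U = [[1.0, B / C], [0.0, 1.0]]   # unit upper triangular factor
LDU = matmul(matmul(L, D), U)
print(LDU == A)
```

Roughly speaking, a diagonal variant keeps only $D$, a lower one $L D$, and an upper one $D U$, with the inverses of $C$ and $S$ replaced by inner solvers.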
-
MATRIX-FREE OPERATORS
MatShell
◦ user-provided routines (MatMult, MatMultTranspose)
◦ (user-provided) preconditioning less intuitive
◦ func passed to PETSc
◦ must use PETSc numbering
◦ KSP is oblivious to a Mat type
◦ underlying PCSHELL
-
EXAMPLE 11: 1D FINITE DIFFERENCES [EXAMPLE11.EDP]
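Aim
Second-order centered scheme
\[ n^2 \begin{bmatrix} 2 & -1 & & \\ -1 & 2 & -1 & \\ & \ddots & \ddots & \ddots \\ & & -1 & 2 \end{bmatrix} x = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \]
◦ useful monitoring, e.g., -ksp_view_singularvalues
◦ variable precon to supply a PCSHELL
◦ 1D meshL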
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example11.edp
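The matrix-free idea can be sketched in plain Python: the Krylov method only needs a routine that applies the stencil, never the matrix itself. This is an illustration, not the content of example11.edp. It assumes a grid with $h = 1/n$ and unknowns at the $n - 1$ interior nodes, and uses a hand-written, unpreconditioned conjugate gradient in place of a PETSc KSP.

```python
def apply_stencil(x, n):
    """y = n^2 * tridiag(-1, 2, -1) x, computed without storing the matrix."""
    m = len(x)
    y = [0.0] * m
    for i in range(m):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < m - 1 else 0.0
        y[i] = n * n * (2.0 * x[i] - left - right)
    return y

def conjugate_gradient(matvec, b, tol=1e-10, max_it=1000):
    """Unpreconditioned CG; matvec is the only access to the operator."""
    x = [0.0] * len(b)
    r = b[:]
    p = r[:]
    rr = sum(v * v for v in r)
    for _ in range(max_it):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rr / sum(a * c for a, c in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rr_new = sum(v * v for v in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

n = 20                                   # assumed grid: h = 1/n, n - 1 interior unknowns
b = [1.0] * (n - 1)
x = conjugate_gradient(lambda v: apply_stencil(v, n), b)
# -u'' = 1 with u(0) = u(1) = 0 has the solution u(t) = t (1 - t) / 2,
# which the centered scheme reproduces exactly at the nodes
exact = [(i / n) * (1 - i / n) / 2 for i in range(1, n)]
print(max(abs(a - e) for a, e in zip(x, exact)))
```

A user-provided preconditioner would enter the same way, as a second routine applied to the residual.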
-
NONLINEAR SOLVERS SNES
◦ easy-to-use interface to (quasi-)Newton methods
◦ solve F(u) = b
◦ handle variational inequalities on u
◦ must provide two funcs
  • to evaluate residuals stored as K[int]
  • to update a Jacobian stored as Mat
◦ ChangeNumbering to go from PETSc to FreeFEM vectors
⇒ solver for the linearized systems configured via set
-
EXAMPLE 12: NEWTON METHOD [EXAMPLE12.EDP]
Aim
Solve
\[ \nabla J(u) = 0 \text{ in } \Omega \quad \text{with } J(u) = \int_\Omega \frac{1}{2} f\left( \|\nabla u\|^2 \right) - c u \]
subject to $u_{\text{lower}} \leqslant u \leqslant u_{\text{upper}}$ in $C$ and $f : x \mapsto (1 + a) x - \log(1 + x)$
◦ compute constant term from F(u) = b
◦ assemble residuals and update the Jacobian
◦ additional parameters in SNESSolve for the bounds
◦ no automatic differentiation
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example12.edp
-
EXAMPLE 13: SOLUTION RECONSTRUCTION [EXAMPLE13.EDP]
Aim
Distributed local functions to a global function
◦ mesh adaptation
◦ centralized postprocessing
◦ buildDmesh additional macro N2O
◦ + restrict to go from local to global
◦ global reduction needed to sum contributions
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example13.edp
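The reconstruction step can be pictured as a weighted scatter-add. The plain Python sketch below is conceptual and does not reproduce example13.edp: the local-to-global index lists play the role of the numbering obtained with restrict, the weights are a hypothetical partition of unity that avoids counting shared unknowns twice, and the loop over subdomains stands in for the global sum reduction across processes.

```python
def reconstruct(n_global, subdomains):
    """Sum partition-of-unity weighted local values into a global array."""
    glob = [0.0] * n_global
    for local_values, local_to_global, weights in subdomains:
        for v, g, w in zip(local_values, local_to_global, weights):
            glob[g] += w * v   # stands in for the sum reduction over processes
    return glob

# Two overlapping subdomains of a 5-unknown global vector u_g[i] = 10 * i;
# the shared unknown 2 has weight 1/2 on each side (hypothetical partition of unity).
sub0 = ([0.0, 10.0, 20.0], [0, 1, 2], [1.0, 1.0, 0.5])
sub1 = ([20.0, 30.0, 40.0], [2, 3, 4], [0.5, 1.0, 1.0])
u_global = reconstruct(5, [sub0, sub1])
print(u_global)  # [0.0, 10.0, 20.0, 30.0, 40.0]
```

Without the weights, the shared unknown would receive both contributions and end up doubled.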
-
EXAMPLE 14: SYSTEMS WITH MULTIPLE RHS [EXAMPLE14.EDP]
Aim
Solve the complex-valued system
\[ \Delta u + k^2 u = f_i \text{ in } \Omega, \qquad u \cdot n = 0 \text{ on } \Gamma \]
with multiple point sources $\{f_i\}_{i=1,2,\ldots}$
◦ KSPSolve + complex[int,int]
◦ use PETSc numbering
◦ single solve with an exact factorization
◦ block GMRES from HPDDM
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_8/example14.edp
-
END-USER PERSPECTIVES
◦ timesteppers TS manual pages
◦ optimizers Tao manual pages
◦ easy composability between PETSc objects
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/index.html
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Tao/index.html
-
SLEPC
[LINK TO THE EXAMPLES]
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_9
-
INTRODUCTION
◦ Scalable Library for Eigenvalue Problem Computations
◦ built on top of PETSc
◦ suite of additional data structures
  • EPS
  • ST
◦ useful for compiling other libraries (ARPACK)
◦ http://slepc.upv.es
◦ https://gitlab.com/slepc/slepc
-
INTERFACE
Parameters
◦ one Mat or two for generalized eigenproblems
◦ arrays for eigenvalues/eigenvectors
  • FreeFEM numbering
  • PETSc numbering for MatNest
◦ careful with --with-scalar-type
Spectral transformations
◦ improve convergence for some eigenproblems
◦ underlying KSP configured via set
-
MISCELLANEOUS
Matrix-free operators
◦ just as with KSPSolve with a MatShell
◦ PETSc numbering
◦ limited number of ST
Periodic boundary conditions
Custom partitioning that may be imbalanced
HPDDM Krylov methods
◦ -ksp_type hpddm
◦ -st_ksp_type hpddm
-
EXAMPLE 1: STEKLOV–POINCARÉ OPERATOR [EXAMPLE1.EDP]
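Aim
Find the eigenvectors of the operator
\[ \mathrm{DtN} : \Gamma_N \to \mathbb{R}, \qquad g \mapsto \partial_n v \]
where $v$ satisfies
\[ -\Delta v = 0 \text{ in } \Omega, \qquad v = 0 \text{ on } \Gamma_D, \qquad v = g \text{ on } \Gamma_N \]
◦ equivalent to finding (λ,u) s.t. Au = λBu
◦ shift-and-invert spectral transformation
◦ same Mat distribution for A and B using Mat B(A,Loc);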
http://jolivet.perso.enseeiht.fr/FreeFem-tutorial/section_9/example1.edp
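Shift-and-invert replaces $Au = \lambda Bu$ by the eigenproblem of $(A - \sigma B)^{-1} B$, whose eigenvalues are $\theta = 1 / (\lambda - \sigma)$, so the eigenvalues closest to the shift $\sigma$ become dominant. The plain Python sketch below only illustrates this mapping; the spectrum and the shift are hypothetical values.

```python
def shift_and_invert(eigenvalues, sigma):
    """Map each eigenvalue lambda of A u = lambda B u to theta = 1 / (lambda - sigma)."""
    return [1.0 / (lam - sigma) for lam in eigenvalues]

spectrum = [0.5, 2.0, 7.0, 40.0]   # hypothetical eigenvalues
sigma = 1.8                        # hypothetical target shift
theta = shift_and_invert(spectrum, sigma)

# The transformed eigenvalue of largest magnitude belongs to the lambda nearest sigma
dominant = max(range(len(theta)), key=lambda i: abs(theta[i]))
recovered = sigma + 1.0 / theta[dominant]   # back-transformation to lambda
print(dominant, recovered)
```

Each application of the transformed operator requires a linear solve with $A - \sigma B$, which is why an inner KSP has to be configured.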
-
APPLICATIONS
-
AUGMENTED LAGRANGIAN FOR STABILITY ANALYSIS
[Moulin et al. 2019]
◦ https://github.com/prj-/moulin2019al
◦ Nonlinear-solver.edp
  • -gamma (0.1)
  • -mesh (FlatPlate3D.mesh)
  • -Re (50)
◦ Eigensolver.edp
  • -shift_real ($10^{-6}$)
  • -shift_imag (0.6)
  • -nev (5)
  • -recycle (0)
-
BASEFLOW
-
LEADING UNSTABLE EIGENVECTOR
-
THANK YOU!
QUESTIONS?
-
OPTIMAL COMPILATION PROCESS I GENERAL VARIABLES
◦ remove any previous instance of FreeFEM
◦ install gcc/clang and gfortran
◦ make sure you have a working MPI implementation
> export FF_DIR=${PWD}/FreeFem-sources
> export PETSC_DIR=${PWD}/petsc
> export PETSC_ARCH=arch-FreeFem
> export PETSC_VAR=${PETSC_DIR}/${PETSC_ARCH}
-
OPTIMAL COMPILATION PROCESS II CLONING THE REPOSITORIES
> git clone https://github.com/FreeFem/FreeFem-sources
> git clone https://gitlab.com/petsc/petsc
-
OPTIMAL COMPILATION PROCESS III COMPILATION FOR REAL SCALARS
> cd ${PETSC_DIR} && ./configure \
    --download-mumps --download-parmetis --download-metis \
    --download-hypre --download-superlu --download-slepc \
    --download-hpddm --download-ptscotch --download-suitesparse \
    --download-scalapack --download-tetgen \
    --with-fortran-bindings=no --with-scalar-type=real \
    --with-debugging=no
> make
-
OPTIMAL COMPILATION PROCESS IV COMPILATION FOR COMPLEX SCALARS, OPTIONAL
> export PETSC_ARCH=arch-FreeFem-complex
> ./configure \
    --with-mumps-dir=arch-FreeFem --with-parmetis-dir=arch-FreeFem \
    --with-metis-dir=arch-FreeFem --with-superlu-dir=arch-FreeFem \
    --download-slepc --download-hpddm \
    --with-ptscotch-dir=arch-FreeFem --with-suitesparse-dir=arch-FreeFem \
    --with-scalapack-dir=arch-FreeFem --with-tetgen-dir=arch-FreeFem \
    --with-fortran-bindings=no --with-scalar-type=complex \
    --with-debugging=no
> make
-
OPTIMAL COMPILATION PROCESS V FINAL COMPILATION
> cd ${FF_DIR}
> git checkout develop
> autoreconf -i
> ./configure --without-hdf5 --disable-iohdf5 \
    --with-petsc=${PETSC_VAR}/lib \
    --with-petsc_complex=${PETSC_VAR}-complex/lib
◦ if PETSc is not detected, overwrite MPIRUN
> make -j4
> export PATH=${PATH}:${FF_DIR}/src/mpi
> export PATH=${PATH}:${FF_DIR}/src/nw
◦ setup ~/.freefem++.pref or define FF_INCLUDEPATH and FF_LOADPATH
-
REFERENCES I
Adams, Mark, Harun H. Bayraktar, Tony M. Keaveny, and Panayiotis Papadopoulos (2004). “Ultrascalable Implicit Finite Element Analyses in Solid Mechanics with over a Half a Billion Degrees of Freedom”. In: Proceedings of the 2004 ACM/IEEE Conference on Supercomputing. SC04. IEEE Computer Society, 34:1–34:15.
Al Daas, Hussam, Laura Grigori, Pierre Jolivet, and Pierre-Henri Tournier (2019). “A Multilevel Schwarz Preconditioner Based on a Hierarchy of Robust Coarse Spaces”. In: Journal of Scientific Computing, submitted for publication. URL: https://github.com/prj-/aldaas2019multi.
Amestoy, Patrick, Iain Duff, Jean-Yves L’Excellent, and Jacko Koster (2001). “A fully asynchronous multifrontal solver using distributed dynamic scheduling”. In: SIAM Journal on Matrix Analysis and Applications 23.1, pp. 15–41. URL: http://mumps.enseeiht.fr.
Balay, Satish, William D. Gropp, Lois Curfman McInnes, and Barry F. Smith (1997). “Efficient management of parallelism in object-oriented numerical software libraries”. In: Modern Software Tools in Scientific Computing, pp. 163–202.
-
REFERENCES II
Cai, Xiao-Chuan, Maksymilian Dryja, and Marcus Sarkis (2003). “Restricted additive Schwarz preconditioners with harmonic overlap for symmetric positive definite linear systems”. In: SIAM Journal on Numerical Analysis 41.4, pp. 1209–1231.
St-Cyr, Amik, Martin Gander, and Stephen Thomas (2007). “Optimized multiplicative, additive, and restricted additive Schwarz preconditioning”. In: SIAM Journal on Scientific Computing 29.6, pp. 2402–2425.
Falgout, Robert and Ulrike Yang (2002). “hypre: A library of high performance preconditioners”. In: Computational Science—ICCS 2002, pp. 632–641.
Farhat, Charbel and François-Xavier Roux (1991). “A method of finite element tearing and interconnecting and its parallel solution algorithm”. In: International Journal for Numerical Methods in Engineering 32.6, pp. 1205–1227.
Gee, Michael W., Christopher M. Siefert, Jonathan J. Hu, Ray S. Tuminaro, and Marzio G. Sala (2006). ML 5.0 Smoothed Aggregation User’s Guide. Tech. rep. SAND2006-2649. Sandia National Laboratories. URL: https://trilinos.github.io/ml.html.
-
REFERENCES III
Geuzaine, Christophe and Jean-François Remacle (2009). “Gmsh: A 3-D finite element mesh generator with built-in pre- and post-processing facilities”. In: International Journal for Numerical Methods in Engineering 79.11, pp. 1309–1331. URL: http://geuz.org/gmsh.
Gosselet, Pierre and Christian Rey (2006). “Non-overlapping domain decomposition methods in structural mechanics”. In: Archives of Computational Methods in Engineering 13.4, pp. 515–572.
Haferssas, Ryadh, Pierre Jolivet, and Frédéric Nataf (2017). “An additive Schwarz method type theory for Lions’s algorithm and a Symmetrized Optimized Restricted Additive Schwarz Method”. In: SIAM Journal on Scientific Computing 39.4, A1345–A1365.
Hecht, Frédéric (2012). “New development in FreeFem++”. In: Journal of Numerical Mathematics 20.3-4, pp. 251–266.
Hénon, Pascal, Pierre Ramet, and Jean Roman (2002). “PaStiX: a high-performance parallel direct solver for sparse symmetric positive definite systems”. In: Parallel Computing 28.2, pp. 301–321. URL: http://pastix.gforge.inria.fr.
-
REFERENCES IV
Hernandez, Vicente, Jose E. Roman, and Vicente Vidal (2005). “SLEPc: A scalable and flexible toolkit for the solution of eigenvalue problems”. In: ACM Transactions on Mathematical Software 31.3, pp. 351–362. URL: https://slepc.upv.es.
Hiptmair, Ralf and Jinchao Xu (2007). “Nodal auxiliary space preconditioning in H(curl) and H(div) spaces”. In: SIAM Journal on Numerical Analysis 45.6, pp. 2483–2509.
Jolivet, Pierre, Frédéric Hecht, Frédéric Nataf, and Christophe Prud’homme (2013). “Scalable Domain Decomposition Preconditioners for Heterogeneous Elliptic Problems”. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC13. ACM.
Jolivet, Pierre and Pierre-Henri Tournier (2016). “Block Iterative Methods and Recycling for Improved Scalability of Linear Solvers”. In: Proceedings of the 2016 International Conference for High Performance Computing, Networking, Storage and Analysis. SC16. IEEE.
Knepley, Matthew (2013). Nested and Hierarchical Solvers in PETSc. URL: https://www.caam.rice.edu/~mk51/presentations/SIAMCSE13.pdf.
-
REFERENCES V
Li, Xiaoye (2005). “An Overview of SuperLU: Algorithms, Implementation, and User Interface”. In: ACM Transactions on Mathematical Software 31.3, pp. 302–325. URL: http://crd-legacy.lbl.gov/~xiaoye/SuperLU.
Mandel, Jan (1993). “Balancing domain decomposition”. In: Communications in Numerical Methods in Engineering 9.3, pp. 233–241.
Moulin, Johann, Pierre Jolivet, and Olivier Marquet (2019). “Augmented Lagrangian Preconditioner for Large-Scale Hydrodynamic Stability Analysis”. In: Computer Methods in Applied Mechanics and Engineering 351, pp. 718–743. URL: https://github.com/prj-/moulin2019al.
Schwarz, Hermann (1870). “Über einen Grenzübergang durch alternierendes Verfahren”. In: Vierteljahrsschrift der Naturforschenden Gesellschaft in Zürich 15, pp. 272–286.
Si, Hang (2013). TetGen: A Quality Tetrahedral Mesh Generator and 3D Delaunay Triangulator. Tech. rep. 13. URL: http://wias-berlin.de/software/tetgen.
-
REFERENCES VI
Spillane, Nicole, Victorita Dolean, Patrice Hauret, Frédéric Nataf, Clemens Pechstein, and Robert Scheichl (2013). “Abstract robust coarse spaces for systems of PDEs via generalized eigenproblems in the overlaps”. In: Numerische Mathematik 126.4, pp. 741–770.
Suzuki, Atsushi and François-Xavier Roux (2014). “A dissection solver with kernel detection for symmetric finite element matrices on shared memory computers”. In: International Journal for Numerical Methods in Engineering 100.2, pp. 136–164.
-
INDEX I
-D: option to predefine a macro
-help: PETSc help
-ksp_type: PETSc KSP type
-ns: no script
-nw: no window
-pc_type: PETSc PC type
-v: verbosity
-wg: with graphics
.m: number of columns of arrays
.n: leading dimensions of arrays
.ndof: number of degrees of freedom of a fespace
.resize: method to resize arrays
AttachCoarseOperator: setup a multilevel domain decomposition preconditioner
buildDmesh: distribute a global mesh
buildMatEdgeRecursive: create a hierarchy of Mats with respective prolongation operators
-
INDEX II
buildmesh: unstructured 2D mesh
ChangeNumbering: switch between FreeFEM and PETSc numberings
ChangeOperator: change the numerical values of a Mat
complex[int,int]: complex-valued 2D array
complex[int]: complex-valued 1D array
createMat: create a distributed Mat
cube: structured 3D cube
EigenValue: eigenvalue problem solver using a reverse communication interface
EPS: SLEPc eigenvalue problem solver
exchange: communicate values associated to duplicated unknowns
fespace: finite element space
func: function
-
INDEX III
GlobalNumbering: generate a global numbering of the unknowns
IFMACRO: conditional statement using a macro
IS: PETSc index set
KSP: PETSc Krylov subspace method
KSPSolve: PETSc linear solve
LinearCG: conjugate gradient using a reverse communication interface
load: instruction to load additional plugins
macro: definition of macro
Mat: PETSc matrix
Mat: distributed real-valued matrix wrapping a PETSc matrix
Mat<complex>: distributed complex-valued matrix wrapping a PETSc matrix
-
INDEX IV
MatConvert: convert a MatNest into a more standard format
MatMatMult: PETSc matrix–matrix product
MatMult: PETSc matrix–vector product
MatMultTranspose: PETSc matrix transpose–vector product
MatNest: PETSc format for storing nested submatrices stored independently
MatNullSpace: Mat that removes a null space from a vector
matrix: real-valued sparse matrix
matrix[int]: array of sparse matrices
MatShell: PETSc format for user-defined matrix types
medit: 3D visualization
mesh: 2D mesh
mesh3: 3D mesh
meshL: 1D mesh
mpiComm: MPI communicator
mpiRequest: MPI request
-
INDEX V
new2old: numbering to go from a new to an old mesh when using trunc
OMP_NUM_THREADS: number of OpenMP threads
on: impose Dirichlet boundary conditions on given labels
PC: PETSc preconditioner
PCASM: PETSc Schwarz methods
PCMG: PETSc multigrid methods
PCSHELL: PETSc abstract preconditioner
plot: basic visualization
qforder: order of numerical integration
real[int,int]: real-valued 2D array
real[int]: real-valued 1D array
real[int][int]: real-valued array of arrays
-
INDEX VI
renumber: permute a matrix so that its interior unknowns are numbered first
restrict: numbering to go from a new to an old fespace when using new2old
savevtk: export data to ParaView
set: supply options to a matrix or Mat
SNES: PETSc nonlinear solver
SNESSolve: solve a nonlinear system of equations
square: structured 2D square
ST: SLEPc spectral transformation
statistics: print domain decomposition statistics
string[int]: array of strings
sym: Boolean flag for assembling a symmetric matrix
Tao: PETSc optimizer
tgv: value for imposing essential boundary conditions
-
INDEX VII
trunc: filter a mesh using a Boolean expression
TS: PETSc timestepper
varf: variational formulation
Vec: PETSc vector