Parallel preconditioners for problems arising from whole-microwave system modeling for brain imaging
Pierre-Henri Tournier (1), Pierre Jolivet (2), Marcella Bonazzoli (3),
Victorita Dolean (3,4), Frederic Hecht (1), Frederic Nataf (1)
(1) LJLL, Universite Pierre et Marie Curie, INRIA equipe ALPINES, Paris
(2) IRIT-CNRS, Toulouse
(3) LJAD, Universite de Nice Sophia Antipolis, Nice
(4) Dept. of Mathematics and Statistics, University of Strathclyde, Glasgow, UK
Numerical methods for wave propagation and applications
August 31, 2017
Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 1/ 26
Motivation
Two types of cerebrovascular accidents (strokes):
- ischemic (85 %)
- hemorrhagic (15 %)
The correct treatment depends on the type of stroke:
- ischemic ⇒ restore blood flow
- hemorrhagic ⇒ lower blood pressure

Motivation
In order to differentiate between ischemic and hemorrhagic stroke, a CT scan or MRI is typically used.
Microwave tomography is a novel and promising imaging technique, especially for medical and brain imaging.

             CT scan        MRI              microwave tomography
resolution   excellent      excellent        good
fast         ✗              ✗                ✓
mobile       ~              ✗                ✓
cost         ~ 300 000 €    ~ 1 000 000 €    < 100 000 €
safe         ✗              ✓                ✓
monitoring   ✗              ✗                ✓

Diagnosing a stroke at the earliest possible stage is crucial for all following therapeutic decisions.
Monitoring: clinicians wish to have an image every fifteen minutes.

Motivation
EMTensor GmbH, Vienna, Austria.
First-generation prototype: cylindrical chamber composed of 5 rings of 32 antennas (ceramic-loaded waveguides).

The direct problem
We consider in Ω a linear, isotropic, non-magnetic, dispersive, dissipative dielectric material.
The direct problem consists in finding the electromagnetic field distribution in the whole chamber, given a known material and transmitted signal.

The direct problem
For each of the 5 × 32 antennas, the associated electric field $E_j$ is the solution of Maxwell’s equations:
\[
\begin{cases}
\nabla \times (\nabla \times E_j) - \mu_0 (\omega^2 \varepsilon + i \omega \sigma) E_j = 0 & \text{in } \Omega, \\
E_j \times n = 0 & \text{on } \Gamma_{\mathrm{metal}}, \\
(\nabla \times E_j) \times n + i \beta (E_j \times n) \times n = g_j & \text{on } \Gamma_j, \\
(\nabla \times E_j) \times n + i \beta (E_j \times n) \times n = 0 & \text{on } \Gamma_i, \ i \neq j,
\end{cases}
\tag{1}
\]
where $\mu_0$ is the permeability of free space, $\omega$ is the incident angular frequency, $\beta$ is the wavenumber of the waveguide, $\varepsilon > 0$ is the dielectric permittivity, $\sigma > 0$ is the conductivity and $g_j$ corresponds to the excitation.

The direct problem
Spatial discretization using Nedelec edge finite elements yields a large sparse linear system $A u = f_j$ for each transmitting antenna j.
We need a robust and efficient solver for second-order time-harmonic Maxwell’s equations with heterogeneous coefficients.
⇒ Use domain decomposition methods to produce parallel preconditioners for the GMRES algorithm.

Overlapping domain decomposition methods
[Figure: the domain Ω, split into two subdomains Ω1 and Ω2.]
Consider the linear system: $A u = f \in \mathbb{C}^n$.
Given a decomposition $(N_1, N_2)$ of $\llbracket 1; n \rrbracket$, define:
- the restriction operator $R_i$ from $\llbracket 1; n \rrbracket$ into $N_i$,
- $R_i^T$ as the extension by 0 from $N_i$ into $\llbracket 1; n \rrbracket$.
Then solve concurrently:
\[
u_1^{m+1} = u_1^m + A_{11}^{-1} R_1 (f - A u^m), \qquad
u_2^{m+1} = u_2^m + A_{22}^{-1} R_2 (f - A u^m),
\]
where $u_i = R_i u$ and $A_{ij} := R_i A R_j^T$.
[Schwarz 1870]

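The operators above can be sketched with small dense matrices. This is only an illustration: the 1D Laplacian used for A, the index sets and the starting iterate are assumptions chosen for readability, and the helper name `restriction` is hypothetical (it is not part of HPDDM or FreeFem++).

```python
import numpy as np

def restriction(indices, n):
    """Boolean restriction matrix R (len(indices) x n): R @ u picks u[indices]."""
    R = np.zeros((len(indices), n))
    R[np.arange(len(indices)), indices] = 1.0
    return R

n = 10
# Toy matrix (assumption): 1D Laplacian, tridiag(-1, 2, -1).
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = np.ones(n)

# Overlapping index sets N1 and N2 (they share unknowns 4 and 5).
N1, N2 = np.arange(0, 6), np.arange(4, 10)
R1, R2 = restriction(N1, n), restriction(N2, n)
A11, A22 = R1 @ A @ R1.T, R2 @ A @ R2.T   # A_ii = R_i A R_i^T

um = np.linspace(0.0, 1.0, n)             # current global iterate u^m
r = f - A @ um                            # global residual f - A u^m

# The two local updates only read u^m, so they can run concurrently.
u1_new = R1 @ um + np.linalg.solve(A11, R1 @ r)
u2_new = R2 @ um + np.linalg.solve(A22, R2 @ r)
```

Each local update solves the equations restricted to its index set, with the values of u^m outside the set acting as boundary data. The unknowns in the overlap end up with two candidate values, which is why a rule for recombining them is needed.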
Overlapping domain decomposition methods
Duplicated unknowns coupled via a partition of unity:
\[
I = \sum_{i=1}^{N} R_i^T D_i R_i.
\]
[Figure: example partition-of-unity weights (1/2 and 1) on the unknowns.]
To solve $A u = f$, Schwarz methods can be viewed as preconditioners for a fixed-point algorithm:
\[
u^{n+1} = u^n + M^{-1} (f - A u^n).
\]
- $M^{-1}_{RAS} := \sum_{i=1}^{N} R_i^T D_i A_i^{-1} R_i$ with $A_i = R_i A R_i^T$
- $M^{-1}_{ORAS} := \sum_{i=1}^{N} R_i^T D_i B_i^{-1} R_i$, with optimized transmission conditions [B. Despres 1991] for Helmholtz

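A minimal sketch of the RAS preconditioner used inside the fixed-point iteration follows. The 1D Laplacian, the two index sets and the Boolean choice of the weights D_i (each unknown owned by exactly one subdomain) are assumptions made for illustration, and `ras_apply` is a hypothetical helper name. In the talk the preconditioner is used with GMRES rather than as a stationary iteration.

```python
import numpy as np

def ras_apply(A, subdomains, weights, r):
    """Apply M^{-1}_RAS r = sum_i R_i^T D_i A_i^{-1} R_i r."""
    z = np.zeros_like(r)
    for idx, d in zip(subdomains, weights):
        Ai = A[np.ix_(idx, idx)]                    # A_i = R_i A R_i^T
        z[idx] += d * np.linalg.solve(Ai, r[idx])   # R_i^T D_i A_i^{-1} R_i r
    return z

n = 20
# Toy matrix (assumption): 1D Laplacian, tridiag(-1, 2, -1).
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = np.ones(n)

subdomains = [np.arange(0, 12), np.arange(8, 20)]   # overlap: unknowns 8..11
# Boolean partition of unity (assumption): unknowns 0..9 belong to the
# first subdomain, unknowns 10..19 to the second one.
weights = [np.where(subdomains[0] < 10, 1.0, 0.0),
           np.where(subdomains[1] >= 10, 1.0, 0.0)]

# Diagonal of sum_i R_i^T D_i R_i, which should be the identity.
pou = np.zeros(n)
for idx, d in zip(subdomains, weights):
    pou[idx] += d

# Fixed-point iteration u^{n+1} = u^n + M^{-1} (f - A u^n).
u = np.zeros(n)
for _ in range(80):
    u = u + ras_apply(A, subdomains, weights, f - A @ u)
residual = np.linalg.norm(f - A @ u)
```

On this toy problem the iteration converges because of the overlap. For the indefinite Maxwell systems of the talk, the same operator would instead be passed to a Krylov method as a preconditioner.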
HPDDM
HPDDM is an efficient parallel implementation of domain decomposition methods by Pierre Jolivet and Frederic Nataf:
- header-only library written in C++11 with MPI and OpenMP
- interfaced with the open source finite element software FreeFem++ (Frederic Hecht)
[Figure: strong scalability test for Maxwell 3D with edge elements of degree 2, 119M d.o.f., on Curie (TGCC, CEA). Time to solution (in seconds), split into setup and solve and compared with ideal scaling, versus number of subdomains (512, 1,024, 2,048, 4,096); annotated values: (54), (61), (73), (94).]

Comparison with experiments
The experimental measurements obtained from the antennas are the reflection and transmission coefficients. Their numerical counterparts are computed as
\[
S_{ij} = \frac{\int_{\Gamma_i} E_j \cdot E_i^0 \, d\gamma}{\int_{\Gamma_i} |E_i^0|^2 \, d\gamma}, \quad \text{for } i, j = 1, \dots, 160,
\]
where $E_i^0$ is the fundamental mode of the waveguide i.
[Figure: two panels titled "ring 3 - empty", showing magnitude (dB) and phase (degree) versus angle (degree), simulation compared with experiment.]

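The ratio of integrals defining S_ij can be approximated by a quadrature rule over sample points on the port Γ_i. The sketch below is an assumption-laden illustration: the function name, the sampled segment and the half-sine mode profile are hypothetical, and the product is taken exactly as written above (no complex conjugation), which makes no difference here because the mode is chosen real.

```python
import numpy as np

def scattering_coefficient(E_j, E0_i, w):
    """Quadrature version of S_ij = (int E_j . E0_i) / (int |E0_i|^2) on a port.

    E_j, E0_i: arrays of shape (n_points, 3), field samples on the port.
    w: quadrature weights of shape (n_points,).
    """
    num = np.sum(w * np.sum(E_j * E0_i, axis=1))
    den = np.sum(w * np.sum(np.abs(E0_i) ** 2, axis=1))
    return num / den

# Hypothetical port: a unit segment sampled at 50 points, with a real
# half-sine mode profile polarized along y (illustrative values only).
x = np.linspace(0.0, 1.0, 50)
w = np.full(x.size, 1.0 / x.size)
E0 = np.zeros((x.size, 3))
E0[:, 1] = np.sin(np.pi * x)

c = 0.3 - 0.4j
S = scattering_coefficient(c * E0, E0, w)   # field proportional to the mode
```

When the field on the port is exactly c times the fundamental mode, the coefficient reduces to c, which is a convenient sanity check for an implementation.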
The inverse problem
The inverse problem consists in recovering ε and σ such that for each transmitting antenna j, the solution $E_j$ to the associated Maxwell’s problem matches the measurements:
\[
\frac{\int_{\Gamma_i} E_j \cdot E_i^0 \, d\gamma}{\int_{\Gamma_i} |E_i^0|^2 \, d\gamma} = S_{ij}^{mes} \quad \text{for each receiving antenna } i.
\]
Difficulties:
- inverse problems are ill-posed
- noise in the experimental data
- solving the inverse problem means solving the direct problem multiple times ⇒ time-consuming

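The inverse problem
Let $\kappa := \mu_0 (\omega^2 \varepsilon + i \omega \sigma)$ be the unknown parameter of our inverse problem. Solving the inverse problem corresponds to minimizing the following cost functional:
\[
J(\kappa) = \frac{1}{2} \sum_{j=1}^{160} \sum_{i=1}^{160} \left| S_{ij}(\kappa) - S_{ij}^{mes} \right|^2
= \frac{1}{2} \sum_{j=1}^{160} \sum_{i=1}^{160} \left| \frac{\int_{\Gamma_i} E_j(\kappa) \cdot E_i^0 \, d\gamma}{\int_{\Gamma_i} |E_i^0|^2 \, d\gamma} - S_{ij}^{mes} \right|^2.
\]
$S_{ij}(\kappa)$ depends on the solution $E_j(\kappa)$ to
\[
\begin{cases}
\nabla \times (\nabla \times E_j) - \kappa E_j = 0 & \text{in } \Omega, \\
E_j \times n = 0 & \text{on } \Gamma_{\mathrm{metal}}, \\
(\nabla \times E_j) \times n + i \beta (E_j \times n) \times n = g_j & \text{on } \Gamma_j, \\
(\nabla \times E_j) \times n + i \beta (E_j \times n) \times n = 0 & \text{on } \Gamma_i, \ i \neq j.
\end{cases}
\]
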
The inverse problem
For j = 1, ..., 160, we introduce the adjoint problem
\[
\begin{cases}
\nabla \times (\nabla \times F_j) - \kappa F_j = 0 & \text{in } \Omega, \\
F_j \times n = 0 & \text{on } \Gamma_{\mathrm{metal}}, \\
(\nabla \times F_j) \times n + i \beta (F_j \times n) \times n = \dfrac{S_{ij}(\kappa) - S_{ij}^{mes}}{\int_{\Gamma_i} |E_i^0|^2 \, d\gamma} \, E_i^0 & \text{on } \Gamma_i, \ i = 1, \dots, 160.
\end{cases}
\]
We have
\[
DJ(\kappa, \delta\kappa) = \sum_{j=1}^{160} \Re \left[ \int_{\Omega} \delta\kappa \, E_j \cdot F_j \, dx \right].
\]
We can then compute the gradient to use in a gradient-based optimization algorithm.
Here we use a limited-memory BFGS algorithm.

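The adjoint-based derivative can be checked on a small discrete analog. Everything in this sketch is an assumption made for illustration: a single scalar κ, random complex symmetric matrices K and M standing in for the finite element operators, one excitation g, one measurement functional c, and conjugation conventions chosen so that the finite-difference check is consistent. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
# Hypothetical discrete analog of the state problem: (K - kappa * M) E = g.
Kr = rng.standard_normal((n, n))
K = Kr + Kr.T + 1j * np.eye(n)            # complex symmetric "stiffness"
M = np.diag(rng.uniform(0.5, 1.5, n))     # diagonal "mass"
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)   # excitation
c = rng.standard_normal(n)                # measurement functional: S = c^T E
s_mes = 0.5 - 0.25j                       # synthetic "measurement"

def cost(kappa):
    """J(kappa) = 1/2 |S(kappa) - s_mes|^2 with S = c^T E(kappa)."""
    E = np.linalg.solve(K - kappa * M, g)
    return 0.5 * abs(c @ E - s_mes) ** 2

def directional_derivative(kappa, dkappa):
    """DJ(kappa, dkappa) = Re[dkappa * F^T M E], using one adjoint solve."""
    A = K - kappa * M
    E = np.linalg.solve(A, g)                    # state solve
    r = c @ E - s_mes                            # data misfit
    F = np.linalg.solve(A.T, np.conj(r) * c)     # adjoint solve
    return np.real(dkappa * (F @ (M @ E)))

kappa = 0.7 + 0.2j
h = 1e-6
checks = []
for dk in (1.0, 1j):                      # real and imaginary perturbations
    fd = (cost(kappa + h * dk) - cost(kappa - h * dk)) / (2 * h)
    checks.append((fd, directional_derivative(kappa, dk)))
```

One state solve and one adjoint solve give the derivative in every direction δκ, whereas finite differences need extra state solves per direction. That is what makes a gradient-based method such as L-BFGS affordable when κ has many degrees of freedom.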
Numerical experiment - hemorrhagic stroke
- Brain model from X-ray and MRI data (362 × 434 × 362)
- simulated hemorrhagic stroke of ellipsoidal shape
- f = 1 GHz
- waveguides (ceramic): εr = 59
- matching liquid: εr = 44 + 20i
- 10% multiplicative white Gaussian noise on synthetic data

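As a worked example, the parameter κ = μ0(ω²ε + iωσ) can be evaluated for the matching liquid at f = 1 GHz. The sketch assumes that the complex relative permittivity encodes the conductivity through its imaginary part, i.e. ε = ε0 Re(εr) and σ = ω ε0 Im(εr). This reading is consistent with the sign convention of the equations above but is not stated on the slides, and the function name is hypothetical.

```python
import math

MU0 = 4e-7 * math.pi          # vacuum permeability (H/m), classical value
EPS0 = 8.8541878128e-12       # vacuum permittivity (F/m)

def kappa_from_eps_r(eps_r, freq_hz):
    """Return (kappa, sigma) with kappa = mu0 (omega^2 eps + i omega sigma).

    Assumption: eps = EPS0 * Re(eps_r) and sigma = omega * EPS0 * Im(eps_r).
    """
    omega = 2.0 * math.pi * freq_hz
    eps = EPS0 * eps_r.real
    sigma = omega * EPS0 * eps_r.imag
    kappa = MU0 * (omega ** 2 * eps + 1j * omega * sigma)
    return kappa, sigma

# Matching liquid from the list above: eps_r = 44 + 20i at f = 1 GHz.
kappa, sigma = kappa_from_eps_r(44 + 20j, 1e9)
```

Under this assumption the matching liquid has a conductivity of roughly 1.1 S/m, and κ equals the squared free-space wavenumber multiplied by εr.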
Numerical experiment - hemorrhagic stroke
Idea: reconstruct the permittivity slice by slice, by taking into account the transmitting antennas corresponding to only one ring and truncating the computational domain.

Numerical experiment - hemorrhagic stroke
⇒ With the slice-by-slice approach, reconstructed images corresponding to one ring are obtained in less than 2 minutes.
[Figure: time in minutes versus number of MPI processes (64 to 4,096), compared with linear speedup.]

Block iterative methods and recycling
Subspace recycling
Keep information between restarts or when solving sequences of linear systems:
\[
A_i x_i = b_i \quad \forall i = 1, 2, \dots
\]
Block methods
Treat multiple right-hand sides simultaneously for faster convergence:
\[
A X = B, \quad B \in \mathbb{C}^{n \times p}
\]
More about HPDDM
- open-source, https://github.com/hpddm/hpddm
- usable in C++, C, Python, or Fortran
- implementation of (pseudo-)Block GMRES/GCRO-DR
- support for left/right/variable preconditioning
- also has (pseudo-)Block CG and Breakdown-Free BCG

GCRO-DR
Generalized Conjugate Residual method with inner Orthogonalization and Deflated Restarting
Proposed by [Parks et al. 2007]
Closely related to GMRES-DR by [Morgan 2002]
Main idea
1. end of GMRES cycle: compute Ritz eigenpairs
2. next restart: use 1. to generate k vectors for Arnoldi basis
3. perform extra orthogonalizations with k vectors
Overhead:
- persistent storage between cycles/solves
- one additional synchronization per cycle
- small dense (generalized) eigenvalue problem

Why use block methods?
Numerical aspects
enlarged Krylov subspace ⇒ faster convergence
Performance
- higher arithmetic intensity
- fewer synchronizations with more data

Implementation details
Block Arnoldi
- block orthogonalization
- tall-and-skinny QR (defaults to CholQR, $V^H V = L L^H$)
About CholQR:
- $V = QR$ with $R = L^H$, $Q = V L^{-H}$
- BLAS 3
- one reduction
- can rank-reveal
Pseudo-block methods
p subspaces, computation and communication steps fused

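The CholQR factorization described above can be sketched in a few lines of NumPy. This is a serial illustration under the assumption that V has full column rank, and `cholqr` is a hypothetical helper name, not HPDDM's routine. In a distributed setting, the Gram matrix would be the only quantity requiring a global reduction.

```python
import numpy as np

def cholqr(V):
    """QR of a tall-and-skinny V via a Cholesky factorization of V^H V."""
    G = V.conj().T @ V                 # p x p Gram matrix (the one reduction)
    L = np.linalg.cholesky(G)          # G = L L^H, with L lower triangular
    R = L.conj().T                     # R = L^H
    # Q = V L^{-H}, computed as (L^{-1} V^H)^H to reuse a standard solve.
    Q = np.linalg.solve(L, V.conj().T).conj().T
    return Q, R

rng = np.random.default_rng(1)
V = rng.standard_normal((200, 4)) + 1j * rng.standard_normal((200, 4))
Q, R = cholqr(V)
```

The method forms a Gram matrix, so it squares the condition number of V. A failed or nearly singular Cholesky factorization therefore signals (near) rank deficiency of the block.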
Block methods for Medimax
Dissipative nature of brain tissues ⇒ the one-level preconditioner with classical (pseudo-block) GMRES already works quite well.
⇒ Try block methods for a harder test case:
- non-dissipative plastic cylinder (diameter 12 cm) immersed in the imaging chamber and surrounded by matching liquid.
- fine discretization with degree 2 edge elements: 89 million unknowns.
- we solve the direct problem for 32 transmitting antennas (second ring) ⇒ 32 RHSs

Block methods for Medimax

#   alternative        p    solve      # of it.   per RHS   eff.
1   GMRES              1    3,078.4    20,068     627       -
2   GCRO-DR            1    1,836.9    10,701     334       1.7
3   pseudo-BGMRES      32   1,577.9    653        -         2.0
4   BGMRES             32   724.8      158        -         4.2
5   pseudo-BGCRO-DR    8    1,357.8    1,508      377       2.3
6   pseudo-BGCRO-DR    32   1,376.1    469        -         2.2
7   BGCRO-DR           8    677.6      524        131       4.5
8   BGCRO-DR           32   992.3      127        -         3.1

- (m, k) = (50, 10) for solving 32 RHSs
- 2,048 subdomains and 2 threads per subdomain
- alternative #1 to #8 ⇒ 158× fewer iterations
- GCRO-DR always performs fewer iterations than GMRES
- working on all 32 RHSs is costly (#5/#7 vs. #6/#8)
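The derived columns of the table can be reproduced from the raw ones with a few lines of arithmetic. The interpretation used here is an inference from the numbers, not something stated on the slide: "eff." matches the ratio of the GMRES solve value to each alternative's solve value, and "per RHS" matches the iteration count divided by the number of successive solves (32 / p).

```python
# Rows copied from the table: (alternative, p, solve, iterations, eff. as printed)
rows = [
    ("GMRES", 1, 3078.4, 20068, None),
    ("GCRO-DR", 1, 1836.9, 10701, 1.7),
    ("pseudo-BGMRES", 32, 1577.9, 653, 2.0),
    ("BGMRES", 32, 724.8, 158, 4.2),
    ("pseudo-BGCRO-DR", 8, 1357.8, 1508, 2.3),
    ("pseudo-BGCRO-DR", 32, 1376.1, 469, 2.2),
    ("BGCRO-DR", 8, 677.6, 524, 4.5),
    ("BGCRO-DR", 32, 992.3, 127, 3.1),
]
baseline_solve = rows[0][2]
baseline_its = rows[0][3]

# Ratio of the GMRES solve value to each alternative's solve value.
speedups = [round(baseline_solve / solve, 1) for _, _, solve, _, _ in rows[1:]]
# Iteration counts of alternative #1 versus alternative #8.
iteration_ratio = baseline_its / rows[-1][3]
# Iterations divided by the number of successive solves (32 / p).
its_per_solve = [round(its / (32 // p)) for _, p, _, its, _ in rows]
```

With these definitions, the computed ratios agree with the printed "eff." column to one decimal place, and the iteration ratio between alternatives #1 and #8 rounds to the 158 quoted in the bullet list.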

Conclusion and perspectives
Conclusion: This work shows the feasibility of a microwave imaging technique for stroke diagnosis and monitoring, using parallel computing.
Current work and perspectives:
- experiment with subspace recycling techniques between iterations during the optimization process when solving the inverse problem
- choose a good coarse space for a two-level scalable preconditioner for Maxwell’s equations (joint work with Ivan Graham and Euan Spence)
Thank you for your attention !