On the Design of Nonconforming High-Resolution Finite Element Schemes for Transport Problems Matthias Möller Institute of Applied Mathematics (LS3) TU Dortmund, Germany Modeling and Simulation of Transport Phenomena Moselle Valley, Germany July 30 - August 1, 2012

Oct 19, 2020

Transcript
  • On the Design of Nonconforming High-Resolution Finite Element Schemes for Transport Problems

    Matthias Möller

    Institute of Applied Mathematics (LS3), TU Dortmund, Germany

    Modeling and Simulation of Transport Phenomena, Moselle Valley, Germany, July 30 - August 1, 2012

  • Motivation

    Objective: apply AFC schemes to nonconforming finite elements

    ■ Do the algebraic design criteria apply to nonconforming elements?
    ■ Is the accuracy of solutions comparable to P1/Q1 approximations?
    ■ How to implement essential boundary conditions?
    ■ Is there any benefit from using nonconforming elements?

    Algebraic Flux Correction (AFC): a family of high-resolution schemes for convection-dominated transport and anisotropic diffusion problems.

    ■ universal stabilization approach based on algebraic design criteria
    ■ well established for conforming (multi-)linear finite element schemes
    ■ difficult to extend to higher-order finite elements

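  • Model problem

    ■ Convection-diffusion equation

    $\int_\Omega w\,[\dot u + \nabla\cdot f(u)]\,dx = 0, \qquad f(u) = v u - d\nabla u$

    ■ Fletcher's group representation

    $u(x,t) \approx \sum_j \varphi_j(x)\,u_j(t), \qquad f(u) \approx \sum_j \varphi_j(x)\,f(u_j)$

    ■ Semi-discrete high-order scheme

    $\sum_j m_{ij}\,\dot u_j = \sum_j k_{ij}\,u_j + \sum_j s_{ij}\,u_j$

    $k_{ij} = -v_j\cdot\int_\Omega \varphi_i\nabla\varphi_j\,dx, \qquad s_{ij} = -d\int_\Omega \nabla\varphi_i\cdot\nabla\varphi_j\,dx$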

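  • Model problem

    ■ Convection-diffusion equation

    $\int_\Omega w\,[\dot u + \nabla\cdot f(u)]\,dx = 0, \qquad f(u) = v u - d\nabla u$

    ■ Fletcher's group representation

    $u(x,t) \approx \sum_j \varphi_j(x)\,u_j(t), \qquad f(u) \approx \sum_j \varphi_j(x)\,f(u_j)$

    ■ Semi-discrete high-order scheme

    $M_C\,\dot u = K u$

    $\sum_j m_{ij}\,\dot u_j = \sum_{j\ne i} k_{ij}(u_j - u_i) + \delta_i u_i, \qquad \delta_i = \sum_j k_{ij}$

    $k_{ij} = -v_j\cdot\int_\Omega \varphi_i\nabla\varphi_j\,dx$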
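
    The rewriting of the right-hand side in terms of solution differences plus a row-sum term can be checked numerically. The following is a minimal sketch, not code from the talk: the function name `apply_row_sum_form` and the random test matrix are illustrative assumptions.

```python
import numpy as np

def apply_row_sum_form(K, u):
    """Evaluate sum_{j != i} k_ij (u_j - u_i) + delta_i u_i with delta_i = sum_j k_ij."""
    delta = K.sum(axis=1)                 # row sums delta_i
    diff = u[None, :] - u[:, None]        # diff[i, j] = u_j - u_i (zero on the diagonal)
    return (K * diff).sum(axis=1) + delta * u

rng = np.random.default_rng(0)
K = rng.standard_normal((5, 5))           # hypothetical discrete transport operator
u = rng.standard_normal(5)

# Both forms give the same vector, so the rewriting changes nothing but the bookkeeping.
assert np.allclose(apply_row_sum_form(K, u), K @ u)
```

    The difference form is convenient because the sign of each off-diagonal coefficient can then be inspected edge by edge.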

  • Family of AFC schemes (Kuzmin et al.)

    $M_L\,\dot u = [K + D]\,u + F(\dot u, u)$

    ■ $M_L$: lumped mass matrix
    ■ $D$: artificial diffusion operator
    ■ $F(\dot u, u)$: antidiffusive correction

  • Family of AFC schemes (Kuzmin et al.)

    $M_L\,\dot u = [K + D]\,u + F(\dot u, u)$

    ■ Low-order: $F \equiv 0$

    ■ Nonlinear flux correction
      ■ NL-GP, NL-FCT: $F(\dot u, u) = [M_L - M_C]\,\dot u - D u$
      ■ NL-TVD: $F(u) = -D u$

    ■ Linearized flux correction
      ■ Lin-FCT: $M_L u = M_L u^L + \bar F(\dot u^L, u^L)$

    ■ AFC-LPT, ...
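
    The slides name the artificial diffusion operator without giving a formula for it. The sketch below assumes the construction commonly used in the AFC literature, $d_{ij} = \max(-k_{ij}, 0, -k_{ji})$ with zero row sums. The function name and the small test matrix are illustrative assumptions.

```python
import numpy as np

def artificial_diffusion(K):
    """Symmetric D with zero row sums such that K + D has no negative off-diagonals.

    Assumption: d_ij = max(-k_ij, 0, -k_ji), the usual AFC choice (not stated on the slide).
    """
    n = K.shape[0]
    D = np.zeros_like(K, dtype=float)
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = max(-K[i, j], 0.0, -K[j, i])
            D[i, j] = D[j, i] = d_ij
            D[i, i] -= d_ij
            D[j, j] -= d_ij
    return D

# Hypothetical skew-symmetric convection stencil with negative off-diagonal entries.
K = np.array([[ 0.0,  0.5, 0.0],
              [-0.5,  0.0, 0.5],
              [ 0.0, -0.5, 0.0]])
D = artificial_diffusion(K)
L = K + D                                  # low-order operator
print(L)
```

    With $F \equiv 0$ this yields the low-order scheme; the flux-corrected variants add back a limited part of the removed antidiffusion.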

  • Review of algebraic design principles

    ■ Jameson's Local Extremum Diminishing criterion

    IF $m_i\,\dot u_i = \sum_{j\ne i} \sigma_{ij}(u_j - u_i)$ with $m_i$ positive and $\sigma_{ij}$ not negative,

    THEN local solution maxima/minima do not increase/decrease.

    ■ Semi-discrete high-resolution AFC scheme

    $m_i\,\dot u_i = \sum_{j\ne i} (k_{ij} + d_{ij})(u_j - u_i) + \delta_i u_i + \sum_{j\ne i} \alpha_{ij} f_{ij}, \qquad m_i = \sum_j m_{ij}$

    ■ $k_{ij} + d_{ij}$: not negative by construction
    ■ $\alpha_{ij} f_{ij}$: controlled by flux limiter

    Need to check the positivity of mass matrix coefficients for each finite element by hand!
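
    The effect of the local extremum diminishing criterion can be illustrated with a single explicit time step. This is a sketch under stated assumptions: the coefficients and masses are random placeholders, and the forward Euler step with its time step bound is added here for illustration (the slide only states the semi-discrete criterion).

```python
import numpy as np

def led_euler_step(m, sigma, u, dt):
    """One forward Euler step of m_i du_i/dt = sum_{j != i} sigma_ij (u_j - u_i)."""
    diff = u[None, :] - u[:, None]            # diff[i, j] = u_j - u_i
    return u + dt * (sigma * diff).sum(axis=1) / m

rng = np.random.default_rng(1)
n = 6
sigma = rng.uniform(0.0, 1.0, size=(n, n))    # hypothetical non-negative coefficients
np.fill_diagonal(sigma, 0.0)
m = rng.uniform(0.5, 1.5, size=n)             # hypothetical positive lumped masses
u = rng.uniform(-1.0, 1.0, size=n)

# With dt <= m_i / sum_j sigma_ij every new value is a convex combination of old values.
dt = np.min(m / sigma.sum(axis=1))
u_new = led_euler_step(m, sigma, u, dt)
assert u.min() - 1e-12 <= u_new.min() and u_new.max() <= u.max() + 1e-12
```

    If a lumped mass $m_i$ were negative or zero, the convex-combination argument would break down, which is why the mass matrix coefficients have to be checked for each element.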

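  • Finite element spaces

    ■ Parametric quadrilateral finite elements

    $Q(T) = \{\, q = \hat q \circ \psi_T^{-1},\ \hat q \in \hat Q(\hat T) \,\}$

    ■ Bilinear one-to-one mapping

    $\psi_T : \hat T := [-1,1]\times[-1,1] \to T \in \mathcal{T}_h$

    ■ $Q_1(T)$: $\hat Q_1(\hat T) = \mathrm{span}\langle 1, \hat x, \hat y, \hat x\hat y \rangle$

    ■ $Q_1^{rot}(T)$ (Rannacher & Turek '92): $\hat Q_1^{rot}(\hat T) = \mathrm{span}\langle 1, \hat x, \hat y, \hat x^2 - \hat y^2 \rangle$

    ■ Nonparametric variant: $Q_{1,npar}^{rot}(T) = \mathrm{span}\langle 1, \xi, \eta, \xi^2 - \eta^2 \rangle$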

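  • Nonconforming shape functions

    $\hat\varphi_1(\hat{\mathbf x}) = \tfrac14 - \tfrac12\hat y - \beta(\hat x^2 - \hat y^2), \qquad \hat\varphi_3(\hat{\mathbf x}) = \tfrac14 + \tfrac12\hat y - \beta(\hat x^2 - \hat y^2)$

    $\hat\varphi_2(\hat{\mathbf x}) = \tfrac14 + \tfrac12\hat x + \beta(\hat x^2 - \hat y^2), \qquad \hat\varphi_4(\hat{\mathbf x}) = \tfrac14 - \tfrac12\hat x + \beta(\hat x^2 - \hat y^2)$

    ■ Midpoint based variant: $\beta = \tfrac14$: $\ \hat\varphi_i(\hat m_j) = \delta_{ij}$

    ■ Mean value based variant: $\beta = \tfrac38$: $\ |\hat\Gamma_j|^{-1}\int_{\hat\Gamma_j} \hat\varphi_i(\hat{\mathbf x})\,d\sigma = \delta_{ij}$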
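
    The two Kronecker-delta properties can be verified directly on the reference element. In this minimal sketch the free parameter in front of $(\hat x^2 - \hat y^2)$ is called `beta` (a name chosen here), and the edge numbering (bottom, right, top, left) is the one implied by the Kronecker property of the four functions.

```python
import numpy as np

def shape_functions(x, y, beta):
    """Rotated bilinear reference shape functions; beta is the free parameter."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    q = x**2 - y**2
    return np.array([0.25 - 0.5 * y - beta * q,
                     0.25 + 0.5 * x + beta * q,
                     0.25 + 0.5 * y - beta * q,
                     0.25 - 0.5 * x + beta * q])

# Edge midpoints in the order bottom, right, top, left.
midpoints = [(0.0, -1.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]

def midpoint_values(beta):
    """Entry [i, j] is phi_i evaluated at the midpoint of edge j."""
    return np.column_stack([shape_functions(x, y, beta) for x, y in midpoints])

def edge_mean_values(beta):
    """Entry [i, j] is the mean value of phi_i over edge j (Gauss quadrature)."""
    s, w = np.polynomial.legendre.leggauss(3)    # exact for the quadratic integrands
    one = np.ones_like(s)
    edges = [(s, -one), (one, s), (s, one), (-one, s)]
    # Each reference edge has length 2, hence the division by 2.
    return np.column_stack([shape_functions(x, y, beta) @ w / 2.0 for x, y in edges])

assert np.allclose(midpoint_values(0.25), np.eye(4))     # midpoint based variant
assert np.allclose(edge_mean_values(0.375), np.eye(4))   # mean value based variant
```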

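  • Matrix analysis - part 1

    ■ Consistent mass matrix on reference element

    $$\hat M = \begin{pmatrix}
    \tfrac{16}{45}\beta^2 + \tfrac{7}{24} & -\tfrac{16}{45}\beta^2 + \tfrac18 & \tfrac{16}{45}\beta^2 - \tfrac1{24} & -\tfrac{16}{45}\beta^2 + \tfrac18 \\
    -\tfrac{16}{45}\beta^2 + \tfrac18 & \tfrac{16}{45}\beta^2 + \tfrac{7}{24} & -\tfrac{16}{45}\beta^2 + \tfrac18 & \tfrac{16}{45}\beta^2 - \tfrac1{24} \\
    \tfrac{16}{45}\beta^2 - \tfrac1{24} & -\tfrac{16}{45}\beta^2 + \tfrac18 & \tfrac{16}{45}\beta^2 + \tfrac{7}{24} & -\tfrac{16}{45}\beta^2 + \tfrac18 \\
    -\tfrac{16}{45}\beta^2 + \tfrac18 & \tfrac{16}{45}\beta^2 - \tfrac1{24} & -\tfrac{16}{45}\beta^2 + \tfrac18 & \tfrac{16}{45}\beta^2 + \tfrac{7}{24}
    \end{pmatrix}$$

    ■ Positivity criterion

    $0.3423 \approx \tfrac{1}{16}\sqrt{30} < |\beta| < \tfrac{3}{16}\sqrt{10} \approx 0.5929$

    ■ midpoint based variant ($\beta = \tfrac14$) has negative matrix coefficients
    ■ mean value based variant ($\beta = \tfrac38$) has positive matrix coefficients
    ■ lower bound is invariant to shape of quadrilateral element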
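
    The positivity criterion can be reproduced by integrating the products of the reference shape functions numerically. This is a sketch with an assumed parameter name `beta`; the entries are plain integrals over $[-1,1]^2$, and since only their signs matter for the criterion, any constant scaling of the matrix is irrelevant here.

```python
import numpy as np

def reference_mass_matrix(beta, n_gauss=3):
    """Integrate phi_i * phi_j over [-1, 1]^2 with a tensor-product Gauss rule."""
    s, w = np.polynomial.legendre.leggauss(n_gauss)   # exact up to degree 5 per variable
    X, Y = np.meshgrid(s, s, indexing="ij")
    W = np.outer(w, w)
    q = X**2 - Y**2
    phi = np.array([0.25 - 0.5 * Y - beta * q,
                    0.25 + 0.5 * X + beta * q,
                    0.25 + 0.5 * Y - beta * q,
                    0.25 - 0.5 * X + beta * q])
    return np.einsum("ikl,jkl,kl->ij", phi, phi, W)

lower, upper = np.sqrt(30.0) / 16.0, 3.0 * np.sqrt(10.0) / 16.0
for beta in (0.25, 0.375, 0.6):
    M = reference_mass_matrix(beta)
    inside = lower < beta < upper
    print(f"beta = {beta}: all entries positive: {bool((M > 0).all())}, inside bounds: {inside}")
    assert bool((M > 0).all()) == inside
```

    The value 0.6 is included only as an example beyond the upper bound, where the coupling between neighbouring edges turns negative.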

  • Matrix analysis - part 2

    [Figure: sparsity patterns of the system matrix. Q1 FE: row and column indices 0 to 50, nz = 369. Q1rot FE: row and column indices 0 to 80, nz = 530.]

    [Figure: number of non-zero entries per matrix row (axis 3 to 12) versus matrix row number (0 to 80) for Q1 and Q1rot.]

    Storage format:
    ■ CRS, CCS

    Storage format:
    ■ ELLPACK
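
    To make the two storage formats concrete, the sketch below converts a small matrix from CRS (compressed row storage) arrays to ELLPACK, which pads every row to the same width and stores the result in two dense arrays. The function names and the 3 x 3 example matrix are illustrative assumptions.

```python
import numpy as np

def csr_to_ellpack(indptr, indices, data):
    """Convert CRS arrays to ELLPACK: two dense (n_rows x width) arrays."""
    n_rows = len(indptr) - 1
    width = int(np.diff(indptr).max())
    cols = np.zeros((n_rows, width), dtype=int)   # padding points at column 0 ...
    vals = np.zeros((n_rows, width))              # ... with value 0, so it adds nothing
    for i in range(n_rows):
        start, stop = indptr[i], indptr[i + 1]
        cols[i, : stop - start] = indices[start:stop]
        vals[i, : stop - start] = data[start:stop]
    return cols, vals

def ellpack_matvec(cols, vals, x):
    """Matrix-vector product; every row does the same amount of work."""
    return (vals * x[cols]).sum(axis=1)

# CRS representation of [[4, 0, 1], [0, 3, 0], [2, 5, 6]].
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 1, 0, 1, 2])
data = np.array([4.0, 1.0, 3.0, 2.0, 5.0, 6.0])

cols, vals = csr_to_ellpack(indptr, indices, data)
x = np.array([1.0, 2.0, 3.0])
y = ellpack_matvec(cols, vals, x)
padding = 1.0 - len(data) / vals.size             # fraction of stored zeros
print(y, padding)
```

    The padding fraction is small when all rows have nearly the same number of non-zeros, which is the situation the regular Q1rot pattern provides.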

  • Solid body rotation

    $\dot u + \nabla\cdot(v u) = 0$ in $(0,1)^2$, $\quad u = 0$ on $\Gamma_{inflow}$

    ■ Velocity field: $v(x,y) = (0.5 - y,\ x - 0.5)$

    ■ Grid size: $h = 1/2^l,\ l = 5, 6, \ldots$

    ■ Stochastic grid disturbance: 0%, 1%, 5%

    ■ Time step in Crank-Nicolson: $\Delta t = 1.28\cdot h$

    ■ Initial = exact solution at $t = 2\pi k,\ k \in \mathbb{N}$
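
    The test setup can be written down in a few lines. This is a sketch only: the slide does not say how the stochastic grid disturbance is generated, so the code assumes that interior nodes of a uniform grid are shifted by a uniformly distributed random amount of at most the given percentage of h. The function names are hypothetical.

```python
import numpy as np

def velocity(x, y):
    """Rotation about the centre (0.5, 0.5) of the unit square."""
    return 0.5 - y, x - 0.5

def sbr_setup(level, percent, seed=0):
    """Return grid size, time step and (possibly perturbed) node coordinates."""
    n = 2**level
    h = 1.0 / n
    dt = 1.28 * h
    nodes = np.linspace(0.0, 1.0, n + 1)
    X, Y = np.meshgrid(nodes, nodes, indexing="ij")
    rng = np.random.default_rng(seed)
    shift = percent / 100.0 * h               # assumed meaning of "x% disturbance"
    X[1:-1, 1:-1] += rng.uniform(-shift, shift, size=(n - 1, n - 1))
    Y[1:-1, 1:-1] += rng.uniform(-shift, shift, size=(n - 1, n - 1))
    return h, dt, X, Y

h, dt, X, Y = sbr_setup(level=5, percent=5.0)
steps_per_revolution = 2.0 * np.pi / dt       # one revolution takes t = 2*pi
print(h, dt, steps_per_revolution)
```

    Since $2\pi/\Delta t$ is not an integer, the last time step of a revolution has to be shortened or the final time approximated.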

  • SBR: low-order solutions (0% mesh perturbation)

    [Figure: four solution panels for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$, $Q_1^{rot,MPt}$; panel annotations: 54%, 63%, 64%, 64%.]

  • SBR: low-order solutions (5% mesh perturbation)

    [Figure: four solution panels for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$, $Q_1^{rot,MPt}$; panel annotations: 54%, 64%, 64%, 64%.]

  • SBR: NL-FCT solutions (0% mesh perturbation)

    [Figure: four solution panels for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$, $Q_1^{rot,MPt}$.]

  • SBR: NL-FCT solutions (5% mesh perturbation)

    [Figure: four solution panels for $Q_1$, $Q_{1,npar}^{rot,MVal}$, $Q_1^{rot,MVal}$, $Q_1^{rot,MPt}$.]

  • SBR: L2-error (5% mesh perturbation)

    [Figure: two panels, conforming FE and nonconforming FE. L2-error (logarithmic axis, 0.01 to 1) versus refinement level (5 to 8) for Low-order, Lin-FCT, NL-FCT, NL-GP, NL-TVD.]

  • Rotation of a Gaussian hill

    $\tilde Q_1$ FE, NL-FCT, time $t = 5/2\,\pi$

    $\dot u + \nabla\cdot(v u - d\nabla u) = 0$ in $(-1,1)^2$, $\quad v(x,y) = (-y,\ x), \quad d = 0.001$

  • RGH: L2-error (5% mesh perturbation)

    [Figure: two panels, conforming FE and nonconforming FE. L2-error (logarithmic axis, 0.001 to 10) versus refinement level (5 to 9) for Low-order, Lin-FCT, NL-FCT, NL-GP, NL-TVD.]

  • RGH: dispersion-error (5% mesh perturbation)

    [Figure: two panels, conforming FE and nonconforming FE. Dispersion error (axis -0.5 to 5.5) versus refinement level (5 to 9) for Low-order, Lin-FCT, NL-FCT, NL-GP, NL-TVD.]

  • Taxonomy of finite elements

                           conforming FE    nonconforming FE
    accuracy               good             good
    numerical diffusion    small            small(er)
    #DOFs, #edges          smaller          larger
    sparsity pattern       irregular        regular
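
    The difference in sparsity pattern and in the number of degrees of freedom can be seen on a tiny unstructured mesh. The sketch below counts the non-zeros per matrix row for vertex-based (Q1) and edge-based (Q1rot) degrees of freedom. The mesh, a hexagon split into three quadrilaterals, and all function names are hypothetical.

```python
def row_counts(cells, dofs_of_cell):
    """Non-zeros per matrix row when all DOFs of a cell couple with each other."""
    coupled = {}
    for cell in cells:
        dofs = dofs_of_cell(cell)
        for i in dofs:
            coupled.setdefault(i, set()).update(dofs)
    return {i: len(js) for i, js in coupled.items()}

def vertex_dofs(cell):
    """Q1: one degree of freedom per vertex."""
    return list(cell)

def edge_dofs(cell):
    """Q1rot: one degree of freedom per edge."""
    return [frozenset((cell[k], cell[(k + 1) % 4])) for k in range(4)]

# Three quadrilaterals meeting at the interior vertex 0.
cells = [(0, 1, 2, 3), (0, 3, 4, 5), (0, 5, 6, 1)]

q1 = row_counts(cells, vertex_dofs)
q1rot = row_counts(cells, edge_dofs)
print(len(q1), sorted(q1.values()))        # → 7 [4, 4, 4, 6, 6, 6, 7]
print(len(q1rot), sorted(q1rot.values()))  # → 9 [4, 4, 4, 4, 4, 4, 7, 7, 7]
```

    On a conforming 2D quadrilateral mesh an edge belongs to at most two cells, so an edge-based row holds 7 entries in the interior and 4 on the boundary regardless of the mesh topology, while the vertex-based row length depends on how many cells meet at a vertex.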

  • Boundary conditions

    Convection-diffusion equation

    $\nabla\cdot(v u - d\nabla u) = f$ in $\Omega$, $\quad u = u_D$ on $\Gamma_D$, $\quad (d\nabla u)\cdot n = g$ on $\Gamma_N$

    Hyperbolic limit ($d \to 0$)

    $\nabla\cdot(v u) = f$ in $\Omega$, $\quad (v u)\cdot n = h$ on $\Gamma_{in}$

    ■ Y. Bazilevs, T. Hughes, Weak imposition of Dirichlet boundary conditions in fluid mechanics, Computers & Fluids 36 (1) 2007, 12-26
      ■ consistent, adjoint-consistent: $\gamma = 1$
      ■ consistent, adjoint-inconsistent: $\gamma = -1$
      ■ $\tau_b = C d / h_b$

    ■ E. Burman, A penalty free non-symmetric Nitsche type method for the weak imposition of boundary conditions, eprint arXiv:1106.5612v2 (Nov 2011)
      ■ $\gamma = -1, \quad \tau_b \equiv 0$

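  • Weak imposition of boundary conditions

    $$\begin{aligned}
    &\int_\Omega -\nabla w_h\cdot(v u_h - d\nabla u_h)\,dx + \int_\Gamma w_h\,(v u_h)\cdot n\,ds \\
    &\quad - \int_{\Gamma_D} w_h\,(d\nabla u_h)\cdot n\,ds - \int_{\Gamma_D} (\gamma\, d\nabla w_h)\cdot n\,u_h\,ds \\
    &\quad - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_h\,ds + \sum_{b=1}^{N_{eb}} \int_{\Gamma_D\cap\Gamma_b} \tau_b\,w_h u_h\,ds \\
    &= \int_\Omega w_h f\,dx + \int_{\Gamma_N} w_h g\,ds - \int_{\Gamma_D} \gamma\,(d\nabla w_h)\cdot n\,u_D\,ds \\
    &\quad - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_D\,ds + \sum_{b=1}^{N_{eb}} \int_{\Gamma_D\cap\Gamma_b} \tau_b\,w_h u_D\,ds
    \end{aligned}$$

    $\gamma = \pm 1, \qquad \tau_b = \dfrac{C d}{h_b}$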

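  • Weak imposition of boundary conditions

    $$\begin{aligned}
    &\int_\Omega -\nabla w_h\cdot(v u_h - d\nabla u_h)\,dx + \int_\Gamma w_h\,(v u_h)\cdot n\,ds - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_h\,ds \\
    &= \int_\Omega w_h f\,dx - \int_{\Gamma_D\cap\Gamma_{in}} (v w_h)\cdot n\,u_D\,ds
    \end{aligned}$$

    Annotation: $\int_{\Gamma_{out}} w_h\,v\cdot n\,u_h\,ds$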

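  • [Figure: FEM-TVD solution with $Q_1^{rot}$ elements, $\gamma = 1$, $\tau_b = 4\cdot 0.001 / h_b$.]

  • [Figure: FEM-TVD solution with $Q_1^{rot}$ elements, $\gamma = -1$, $\tau_b = 4\cdot 0.001 / h_b$.]

  • [Figure: FEM-TVD solution with $Q_1^{rot}$ elements, $\gamma = -1$, $\tau_b \equiv 0$.]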

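  • 1D convection-diffusion equation

    $1\cdot u_x - 0.001\,u_{xx} = 0, \qquad u(0) = 1, \quad u(1) = 0$

    [Figure: FEM-TVD solutions with $Q_1$ elements for three settings: $\gamma = 1$ with $\tau_b = 4\cdot 0.001/h$; $\gamma = -1$ with $\tau_b = 4\cdot 0.001/h$; $\gamma = -1$ with $\tau_b \equiv 0$.]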
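
    The weak imposition of Dirichlet values can be tried out on this 1D problem. The sketch below assembles the weak form from the preceding slides for piecewise linear elements on a uniform mesh, with the sign parameter called `gamma` and the penalty `tau = C*d/h` (names chosen here). It is a plain Galerkin discretization: the TVD limiting used on the slides is not included, so the mesh should resolve the boundary layer. The function name and the consistency checks are illustrative assumptions.

```python
import numpy as np

def solve_weak_dirichlet_1d(n_el, v, d, u_left, u_right, gamma=1.0, C=4.0):
    """P1 Galerkin for (v u - d u')' = 0 on (0, 1), Dirichlet data imposed weakly."""
    h = 1.0 / n_el
    n = n_el + 1
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    grad = np.array([-1.0, 1.0]) / h          # derivatives of the two local hat functions

    # Volume term: int -w' (v u - d u') dx, assembled element by element.
    for e in range(n_el):
        for a in range(2):
            for b in range(2):
                A[e + a, e + b] += -grad[a] * v * h / 2.0 + d * grad[a] * grad[b] * h

    tau = C * d / h                           # penalty parameter
    # (boundary node, nodes of the adjacent element, outward normal, Dirichlet value)
    boundaries = [(0, (0, 1), -1.0, u_left), (n - 1, (n - 2, n - 1), 1.0, u_right)]
    for p, elem, normal, u_D in boundaries:
        A[p, p] += v * normal                 # + w (v u) . n on the whole boundary
        for k, node in enumerate(elem):
            A[p, node] -= d * grad[k] * normal              # - w (d u') . n
            A[node, p] -= gamma * d * grad[k] * normal      # - gamma (d w') . n u
            rhs[node] -= gamma * d * grad[k] * normal * u_D
        if v * normal < 0.0:                  # inflow part of the Dirichlet boundary
            A[p, p] -= v * normal
            rhs[p] -= v * normal * u_D
        A[p, p] += tau                        # penalty term
        rhs[p] += tau * u_D
    return np.linspace(0.0, 1.0, n), np.linalg.solve(A, rhs)

# Consistency checks: exact solutions that lie in the discrete space are reproduced.
x, u = solve_weak_dirichlet_1d(10, v=0.0, d=1.0, u_left=1.0, u_right=0.0, gamma=1.0)
assert np.allclose(u, 1.0 - x)
x, u = solve_weak_dirichlet_1d(20, v=1.0, d=0.001, u_left=1.0, u_right=1.0, gamma=1.0)
assert np.allclose(u, 1.0)
```

    The problem from the slide corresponds to a call such as `solve_weak_dirichlet_1d(1000, v=1.0, d=0.001, u_left=1.0, u_right=0.0, gamma=1.0, C=4.0)`; switching to `gamma=-1.0` gives the adjoint-inconsistent variant.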

  • Summary

    ■ Do the algebraic design criteria apply to nonconforming elements? Yes, for the integral mean value based variant of nonconforming rotated bilinear finite elements.

    ■ Is the accuracy of solutions comparable to P1/Q1 approximations? Yes, accuracy and numerical dissipation are comparable.

    ■ How to implement essential boundary conditions? Apply consistent, adjoint-consistent Nitsche-type method.

    ■ Is there any benefit from using nonconforming elements? Regular sparsity structure beneficial for parallel HPC.

  • Outlook

    ■ Extend AFC schemes to other nonconforming finite elements
      ■ NC-Quad (Z. Cai, J. Douglas, Jr., X. Ye)
      ■ Composite Crouzeix-Raviart (F. Schieweck)

    ■ Generalize consistent, adjoint-consistent approach towards the implementation of Dirichlet boundary conditions to periodic ones

    ■ Improve (parallel) efficiency by exploiting the regular sparsity structure
      ■ use variant of ELLPACK matrix storage format
      ■ reduce communication costs due to weaker coupling

  • Acknowledgment

    AFC schemes

    ■ D. Kuzmin (University of Erlangen-Nuremberg)

    Featflow2

    ■ M. Köster, P. Zajac (TU Dortmund)

    Source code freely available at:

    http://www.featflow.de/en/software/featflow2.html

  • Edge-based solvers for the compressible Euler equations on multicores and GPUs

  • Example: edge-based flux assembly with Q1 FE

    [Figure: bandwidth (GB/s, axis 0 to 50) versus #edges per color group (logarithmic axis, 1K to 1,000K) for 1, 2, 4, 8, 12 OMP threads on Xeon X5680 (2x6) and for CUDA on C2070.]

    ■ best GPU performance for large problem sizes
    ■ best CPU performance if data fits into cache

  • Example: edge-based flux assembly with Q1 FE

    $F_i = \sum_{k=1}^{N_{colors}} \sum_{ij \in CG_k} \sigma_{ij}\,(U_j - U_i)$

    [Figure: accumulated time (ms, axis 0 to 150) with bars for CUDA and OMP 6; annotations: 132 ms, 23 ms, 5.7x, and "55 ms" w/ transf.]

    [Figure: bandwidth (GB/s, axis 0 to 50) for color groups 1-14 together with #edges per color group (0K to 500K); OMP 6 on Core i7 EP, CUDA on C2070; annotation: 6-7 GB/s.]
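
    The double sum over color groups can be sketched as follows. Edges are split into groups in which no node occurs twice, so all edges of one group can be processed at once without write conflicts. The greedy coloring, the function names, the symmetric edge coefficient called `sigma`, and the small example graph are assumptions made for illustration; they are not taken from Featflow2.

```python
import numpy as np

def color_edges(edges):
    """Greedy split of edges into color groups in which no node appears twice."""
    groups, touched = [], []
    for e, (i, j) in enumerate(edges.tolist()):
        for group, nodes in zip(groups, touched):
            if i not in nodes and j not in nodes:
                group.append(e)
                nodes.update((i, j))
                break
        else:
            groups.append([e])
            touched.append({i, j})
    return [np.array(g) for g in groups]

def assemble_fluxes(edges, sigma, U, groups):
    """Accumulate sigma_ij (U_j - U_i) into both end nodes, one color group at a time."""
    F = np.zeros_like(U)
    for g in groups:
        i, j = edges[g, 0], edges[g, 1]
        flux = sigma[g] * (U[j] - U[i])
        F[i] += flux          # safe: every node occurs at most once per group
        F[j] -= flux          # assumes sigma_ji = sigma_ij
    return F

# Hypothetical example: a square with one diagonal.
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0], [0, 2]])
sigma = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
U = np.array([0.0, 1.0, 3.0, 6.0])

groups = color_edges(edges)
F = assemble_fluxes(edges, sigma, U, groups)
print([g.tolist() for g in groups], F)
```

    On a GPU or with OpenMP, the loop over groups stays sequential while the work inside each group runs in parallel, which is why the number of edges per color group appears on the plot axes.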

  • Example: edge-based flux assembly with Q1rot FE

    [Figure: accumulated time (ms, axis 0 to 200) with bars for CUDA and OMP 6; annotations: 191 ms, 32 ms, 5.9x, and "95 ms" w/ transf.]

    [Figure: bandwidth (GB/s, axis 0 to 60) for color groups 1-9 together with #edges per color group (0K to 1,500K); OMP 6 on Core i7 EP, CUDA on C2070; annotation: 6-7 GB/s.]