Chap 15 Slides
  • Slide 1/23

    The Conjugate Gradient Method

    Tom Lyche

    University of Oslo

    Norway


    http://www.ifi.uio.no/
  • Slide 2/23

    Plan for the day

    The method

    Algorithm

    Implementation of test problems

    Complexity

    Derivation of the method

    Convergence

  • Slide 3/23

    The Conjugate gradient method

    Restricted to positive definite systems: $Ax = b$, $A \in \mathbb{R}^{n \times n}$ positive definite.

    Generate $\{x_k\}$ by $x_{k+1} = x_k + \alpha_k p_k$, where $p_k$ is a vector, the search direction,

    and $\alpha_k$ is a scalar determining the step length.

    In general we find the exact solution in at most $n$ iterations.

    For many problems the error becomes small after a few iterations.

    Both a direct method and an iterative method.

    Rate of convergence depends on the square root of the condition number.

  • Slide 4/23

    The name of the game

    Conjugate means orthogonal; orthogonal gradients. But why gradients?

    Consider minimizing the quadratic function $Q : \mathbb{R}^n \to \mathbb{R}$ given by $Q(x) := \frac{1}{2} x^T A x - x^T b$.

    The minimum is obtained by setting the gradient equal to zero.

    $\nabla Q(x) = Ax - b = 0 \iff$ the linear system $Ax = b$. Find the solution by solving $r = b - Ax = 0$.

    The sequence $\{x_k\}$ is such that $\{r_k\} := \{b - A x_k\}$ is orthogonal with respect to the usual inner product in $\mathbb{R}^n$.
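    To see why the gradient has this form, a short verification (not on the slide; it only uses the symmetry of $A$):

    \[
    \frac{\partial Q}{\partial x_i}
      = \frac{\partial}{\partial x_i}\Bigl(\tfrac12 \sum_{j}\sum_{k} x_j a_{jk} x_k - \sum_{j} x_j b_j\Bigr)
      = \tfrac12\Bigl(\sum_{k} a_{ik} x_k + \sum_{j} a_{ji} x_j\Bigr) - b_i
      = (Ax)_i - b_i,
    \]

    since $A = A^T$, so $\nabla Q(x) = Ax - b$.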

    The search directions are also orthogonal, but with

    respect to a different inner product.

  • Slide 5/23

    The algorithm

    Start with some $x_0$. Set $p_0 = r_0 = b - A x_0$. For $k = 0, 1, 2, \ldots$

    $x_{k+1} = x_k + \alpha_k p_k, \qquad \alpha_k = \frac{r_k^T r_k}{p_k^T A p_k}$

    $r_{k+1} = b - A x_{k+1} = r_k - \alpha_k A p_k$

    $p_{k+1} = r_{k+1} + \beta_k p_k, \qquad \beta_k = \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}$

  • Slide 6/23

    Example

    $\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$

    Start with $x_0 = 0$.

    $p_0 = r_0 = b = [1, 0]^T$

    $\alpha_0 = \frac{r_0^T r_0}{p_0^T A p_0} = \frac{1}{2}, \qquad x_1 = x_0 + \alpha_0 p_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1/2 \\ 0 \end{pmatrix}$

    $r_1 = r_0 - \alpha_0 A p_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} - \frac{1}{2}\begin{pmatrix} 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1/2 \end{pmatrix}, \qquad r_1^T r_0 = 0$

    $\beta_0 = \frac{r_1^T r_1}{r_0^T r_0} = \frac{1}{4}, \qquad p_1 = r_1 + \beta_0 p_0 = \begin{pmatrix} 0 \\ 1/2 \end{pmatrix} + \frac{1}{4}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1/4 \\ 1/2 \end{pmatrix}$

    $\alpha_1 = \frac{r_1^T r_1}{p_1^T A p_1} = \frac{2}{3}, \qquad x_2 = x_1 + \alpha_1 p_1 = \begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + \frac{2}{3}\begin{pmatrix} 1/4 \\ 1/2 \end{pmatrix} = \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix}$

    $r_2 = 0$, exact solution.
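    For checking, these two steps can be reproduced with a few lines of MATLAB (a minimal sketch, not part of the slides):

    A=[2 -1; -1 2]; b=[1; 0]; x=[0; 0];
    r=b-A*x; p=r; rho=r'*r;
    for k=1:2
      t=A*p; alpha=rho/(p'*t);          % alpha_k = r_k'r_k / p_k'Ap_k
      x=x+alpha*p; r=r-alpha*t;         % new iterate and residual
      rhos=rho; rho=r'*r;
      p=r+(rho/rhos)*p;                 % beta_k = rho/rhos
    end
    x, r                                % x = [2/3; 1/3], r = [0; 0] after two steps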

  • Slide 7/23

    Exact method and iterative method

    Orthogonality of the residuals implies that $x_m$ is equal to the solution $x$ of $Ax = b$ for some $m \le n$.

    For if $x_k \ne x$ for all $k = 0, 1, \ldots, n-1$ then $r_k \ne 0$ for $k = 0, 1, \ldots, n-1$, so $\{r_0, \ldots, r_{n-1}\}$ is an orthogonal basis for $\mathbb{R}^n$. But then $r_n \in \mathbb{R}^n$ is orthogonal to all vectors in $\mathbb{R}^n$, so $r_n = 0$ and hence $x_n = x$.

    So the conjugate gradient method finds the exact solution in at most

    n iterations.

    The convergence analysis shows that $\|x - x_k\|_A$ typically becomes small quite rapidly and we can stop the iteration with $k$ much smaller than $n$.

    It is this rapid convergence which makes the method interesting and

    in practice an iterative method.

  • Slide 8/23

    Conjugate Gradient Algorithm

    [Conjugate Gradient Iteration] The positive definite linear system $Ax = b$ is solved by the conjugate gradient method. $x$ is a starting vector for the iteration. The iteration is stopped when $\|r_k\|_2/\|r_0\|_2 \le$ tol or $k >$ itmax. itm is the number of iterations used.

    function [x,itm]=cg(A,b,x,tol,itmax)
      r=b-A*x; p=r; rho=r'*r; rho0=rho;
      for k=0:itmax
        if sqrt(rho/rho0)<=tol, itm=k; return; end   % ||r_k||_2/||r_0||_2 <= tol
        t=A*p; a=rho/(p'*t); x=x+a*p; r=r-a*t;       % step along the search direction p
        rhos=rho; rho=r'*r; p=r+(rho/rhos)*p;        % new search direction
      end
      itm=itmax+1;
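    A usage sketch (not in the slides; the system here is an arbitrary random positive definite matrix, for illustration only):

    n=100; B=randn(n); A=B'*B+n*eye(n); b=randn(n,1);
    [x,itm]=cg(A,b,zeros(n,1),1e-8,200);
    norm(A*x-b)/norm(b)                  % small relative residual after itm iterations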

  • Slide 9/23

    A family of test problems

    We can test the methods on the Kronecker sum matrix

    $A = C_1 \otimes I + I \otimes C_2 =
    \begin{pmatrix} C_1 & & & \\ & C_1 & & \\ & & \ddots & \\ & & & C_1 \end{pmatrix}
    + \begin{pmatrix} cI & bI & & \\ bI & cI & \ddots & \\ & \ddots & \ddots & bI \\ & & bI & cI \end{pmatrix},$

    where $C_1 = \mathrm{tridiag}_m(a,c,a)$ and $C_2 = \mathrm{tridiag}_m(b,c,b)$.

    Positive definite if $c > 0$ and $c \ge |a| + |b|$.
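    As an illustration (a sketch, not from the slides; it assembles the matrix with MATLAB's spdiags and kron, using the Poisson parameters from the next slide):

    m=3; a=-1; b=-1; c=2; e=ones(m,1); I=speye(m);
    C1=spdiags([a*e c*e a*e],-1:1,m,m);      % tridiag_m(a,c,a)
    C2=spdiags([b*e c*e b*e],-1:1,m,m);      % tridiag_m(b,c,b)
    A=kron(C1,I)+kron(I,C2);                 % Kronecker sum, n = m^2 = 9
    full(A)                                  % compare with the 9 x 9 matrix on the next slide
    min(eig(full(A))) > 0                    % true: c > 0 and c >= |a| + |b|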

  • Slide 10/23

    m = 3, n = 9

    A =

    2c a 0 b 0 0 0 0 0

    a 2c a 0 b 0 0 0 0

    0 a 2c 0 0 b 0 0 0

    b 0 0 2c a 0 b 0 0

    0 b 0 a 2c a 0 b 0

    0 0 b 0 a 2c 0 0 b

    0 0 0 b 0 0 2c a 0

    0 0 0 0 b 0 a 2c a

    0 0 0 0 0 b 0 a 2c

    $b = a = -1$, $c = 2$: Poisson matrix

    $b = a = 1/9$, $c = 5/18$: Averaging matrix

  • Slide 11/23

    Averaging problem

    $\lambda_{jk} = 2c + 2a\cos(j\pi h) + 2b\cos(k\pi h), \quad j, k = 1, 2, \ldots, m; \qquad a = b = 1/9,\ c = 5/18$

    $\lambda_{\max} = \frac{5}{9} + \frac{4}{9}\cos(\pi h), \qquad \lambda_{\min} = \frac{5}{9} - \frac{4}{9}\cos(\pi h)$

    $\mathrm{cond}_2(A) = \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{5 + 4\cos(\pi h)}{5 - 4\cos(\pi h)} \le 9.$
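    A quick check of this bound (a sketch, not from the slides), evaluating the formula for a moderate grid size:

    m=50; h=1/(m+1);
    kappa=(5+4*cos(pi*h))/(5-4*cos(pi*h))    % about 8.9, approaching the limit 9 as h -> 0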

  • Slide 12/23

    2D formulation for test problems

    $x = \mathrm{vec}(V)$, $r = \mathrm{vec}(R)$, $p = \mathrm{vec}(P)$, with $V, R, P \in \mathbb{R}^{m \times m}$.

    $Ax = b \iff DV + VE = h^2 F$, where $D = \mathrm{tridiag}(a,c,a) \in \mathbb{R}^{m \times m}$ and $E = \mathrm{tridiag}(b,c,b) \in \mathbb{R}^{m \times m}$.

    $Ap = \mathrm{vec}(DP + PE)$, i.e. the grid form of $Ap$ is $DP + PE$.
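    A sketch (not from the slides) checking this identity numerically, assuming the Poisson parameters $a = b = -1$, $c = 2$:

    m=4; a=-1; b=-1; c=2; e=ones(m,1);
    D=spdiags([a*e c*e a*e],-1:1,m,m); E=spdiags([b*e c*e b*e],-1:1,m,m);
    A=kron(D,speye(m))+kron(speye(m),E);     % n = m^2
    P=rand(m); p=P(:);                       % p = vec(P), stacking the columns
    norm(A*p-reshape(D*P+P*E,[],1))          % of the order of machine precision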

  • Slide 13/23

    Testing

    [Testing Conjugate Gradient] $A = \mathrm{trid}(a,c,a,m) \otimes I_m + I_m \otimes \mathrm{trid}(b,c,b,m) \in \mathbb{R}^{m^2 \times m^2}$

    function [V,it]=cgtest(m,a,b,c,tol,itmax)
      h=1/(m+1); R=h*h*ones(m);                        % right-hand side h^2*F with F = ones
      D=sparse(tridiagonal(a,c,a,m)); E=sparse(tridiagonal(b,c,b,m));  % tridiagonal(.) is the m x m helper from the notes
      V=zeros(m,m); P=R; rho=sum(sum(R.*R)); rho0=rho;
      for k=1:itmax
        if sqrt(rho/rho0)<=tol, it=k-1; return; end
        T=D*P+P*E;                                     % grid form of A*p
        al=rho/sum(sum(P.*T)); V=V+al*P; R=R-al*T;
        rhos=rho; rho=sum(sum(R.*R)); P=R+(rho/rhos)*P;
      end
      it=itmax+1;
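    A usage sketch (not from the slides; the iteration counts in the comments are the values reported in the tables on the following slides, and a = b = -1 is assumed for the Poisson case):

    [V,it]=cgtest(50,1/9,1/9,5/18,1e-8,1000);    % averaging problem, n = 2500, about 22 iterations
    [V,it]=cgtest(100,-1,-1,2,1e-8,1000);        % Poisson problem, n = 10000, about 294 iterations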

  • Slide 14/23

    The Averaging Problem

    n   2 500   10 000   40 000   1 000 000   4 000 000
    K      22       22       21          21          20

    Table 1: The number of iterations K for the averaging problem on a $\sqrt{n} \times \sqrt{n}$ grid. $x_0 = 0$, tol $= 10^{-8}$.

    Both the condition number and the required number of iterations are

    independent of the size of the problem

    The convergence is quite rapid.

  • Slide 15/23

    Poisson Problem

    $\lambda_{jk} = 2c + 2a\cos(j\pi h) + 2b\cos(k\pi h), \quad j, k = 1, 2, \ldots, m; \qquad a = b = -1,\ c = 2$

    $\lambda_{\max} = 4 + 4\cos(\pi h), \qquad \lambda_{\min} = 4 - 4\cos(\pi h)$

    $\mathrm{cond}_2(A) = \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{1 + \cos(\pi h)}{1 - \cos(\pi h)} = \mathrm{cond}_2(T).$

    $\mathrm{cond}_2(A) = O(n)$.

  • Slide 16/23

    The Poisson problem

    n           2 500   10 000   40 000   160 000
    K             140      294      587      1168
    K/sqrt(n)    1.86     1.87     1.86      1.85

    Using CG in the form of Algorithm 8 with tol $= 10^{-8}$ and $x_0 = 0$ we list K, the required number of iterations, and $K/\sqrt{n}$.

    The results show that K is much smaller than $n$ and appears to be proportional to $\sqrt{n}$.

    This is the same speed as for SOR and we don't have to estimate any acceleration parameter!

    $\sqrt{n}$ is essentially the square root of the condition number of $A$.

  • Slide 17/23

    Complexity

    The work involved in each iteration is

    1. one matrix times vector ($t = Ap$),

    2. two inner products ($p^T t$ and $r^T r$),

    3. three vector-plus-scalar-times-vector ($x = x + \alpha p$, $r = r - \alpha t$ and $p = r + (\rho/\rho_s) p$).

    The dominating part of the computation is statement 1. Note that for our test problems $A$ only has $O(5n)$ nonzero elements. Therefore, taking advantage of the sparseness of $A$ we can compute $t$ in $O(n)$ flops. With such an implementation the total number of flops in one iteration is $O(n)$.
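    A sketch (not from the slides) illustrating the sparsity for the Poisson test matrix: it is pentadiagonal, so it has just under 5 nonzeros per row, and a sparse matrix-vector product costs $O(n)$ flops.

    m=100; e=ones(m,1); C=spdiags([-e 2*e -e],-1:1,m,m);
    A=kron(C,speye(m))+kron(speye(m),C);     % Poisson matrix, n = m^2 = 10000
    nnz(A)/size(A,1)                         % about 4.96 nonzeros per row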

  • Slide 18/23

    More Complexity

    How many flops do we need to solve the test problems by the conjugate gradient method to within a given tolerance?

    Averaging problem: $O(n)$ flops. Optimal for a problem with $n$ unknowns. Same as SOR and better than the fast method based on FFT.

    Discrete Poisson problem: $O(n^{3/2})$ flops, same as SOR and the fast method.

    Cholesky Algorithm: $O(n^2)$ flops both for averaging and Poisson.

  • Slide 19/23

    Analysis and Derivation of the Method

    Theorem 3 (Orthogonal Projection). Let $\mathcal{S}$ be a subspace of a finite dimensional real or complex inner product space $(\mathcal{V}, \mathbb{F}, \langle\cdot,\cdot\rangle)$. To each $x \in \mathcal{V}$ there is a unique vector $p \in \mathcal{S}$ such that

    $\langle x - p, s\rangle = 0, \quad \text{for all } s \in \mathcal{S}. \qquad (1)$

    [Figure: the orthogonal projection $p$ of $x$ onto the subspace $\mathcal{S}$; the error $x - p$ is orthogonal to $\mathcal{S}$.]

  • Slide 20/23

    Best Approximation

    Theorem 4 (Best Approximation). Let $\mathcal{S}$ be a subspace of a finite dimensional real or complex inner product space $(\mathcal{V}, \mathbb{F}, \langle\cdot,\cdot\rangle)$. Let $x \in \mathcal{V}$ and $p \in \mathcal{S}$. The following statements are equivalent:

    1. $\langle x - p, s\rangle = 0$ for all $s \in \mathcal{S}$.

    2. $\|x - s\| > \|x - p\|$ for all $s \in \mathcal{S}$ with $s \ne p$.

    If $(v_1, \ldots, v_k)$ is an orthogonal basis for $\mathcal{S}$ then

    $p = \sum_{i=1}^{k} \frac{\langle x, v_i\rangle}{\langle v_i, v_i\rangle}\, v_i. \qquad (2)$

  • Slide 21/23

    Derivation of CG

    $Ax = b$, $A \in \mathbb{R}^{n \times n}$ is pos. def., $x, b \in \mathbb{R}^n$

    $(x, y) := x^T y$, $x, y \in \mathbb{R}^n$ (the usual inner product)

    $\langle x, y\rangle := x^T A y = (x, Ay) = (Ax, y)$ (the $A$-inner product)

    $\|x\|_A := \sqrt{x^T A x}$

    $W_0 = \{0\}$, $W_1 = \mathrm{span}\{b\}$, $W_2 = \mathrm{span}\{b, Ab\}$, $W_k = \mathrm{span}\{b, Ab, A^2 b, \ldots, A^{k-1} b\}$

    $W_0 \subset W_1 \subset W_2 \subset \cdots \subset W_k$, $\dim(W_k) \le k$, $w \in W_k \Rightarrow Aw \in W_{k+1}$

    $x_k \in W_k$, $\langle x - x_k, w\rangle = 0$ for all $w \in W_k$

    $p_0 = r_0 := b, \qquad p_j = r_j - \sum_{i=0}^{j-1} \frac{\langle r_j, p_i\rangle}{\langle p_i, p_i\rangle}\, p_i, \quad j = 1, \ldots, k.$
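    A sketch (not from the slides): running a few CG steps as on the algorithm slide for a small random positive definite system and checking that the search directions are A-conjugate, i.e. $p_i^T A p_j = 0$ for $i \ne j$:

    n=6; B=randn(n); A=B'*B+n*eye(n); b=randn(n,1);  % an arbitrary positive definite system
    x=zeros(n,1); r=b; p=r; rho=r'*r; P=p;
    for k=1:4
      t=A*p; a=rho/(p'*t); x=x+a*p; r=r-a*t;
      rhos=rho; rho=r'*r; p=r+(rho/rhos)*p; P=[P p]; % collect the search directions
    end
    C=P'*A*P; d=sqrt(diag(C));
    max(max(abs(C-diag(diag(C)))./(d*d')))           % relative off-diagonal entries are tiny: A-conjugate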

  • Slide 22/23

    Convergence

    Theorem 5. Suppose we apply the conjugate gradient method to a positive definite system $Ax = b$. Then the $A$-norms of the errors satisfy

    $\frac{\|x - x_k\|_A}{\|x - x_0\|_A} \le 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^{k}, \quad \text{for } k \ge 0,$

    where $\kappa = \mathrm{cond}_2(A) = \lambda_{\max}/\lambda_{\min}$ is the 2-norm condition number of $A$.

    This theorem explains what we observed in the previous section, namely that the number of iterations is linked to $\sqrt{\kappa}$, the square root of the condition number of $A$. Indeed, the following corollary gives an upper bound for the number of iterations in terms of $\sqrt{\kappa}$.

  • Slide 23/23

    Corollary 6. If for some $\epsilon > 0$ we have $k \ge \frac{1}{2}\sqrt{\kappa}\,\ln\left(\frac{2}{\epsilon}\right)$ then $\|x - x_k\|_A / \|x - x_0\|_A \le \epsilon$.
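    A sketch (not from the slides), evaluating this bound for the Poisson problem with $\epsilon = 10^{-8}$, using the condition number formula $\kappa = (1 + \cos(\pi h))/(1 - \cos(\pi h))$ from the Poisson Problem slide:

    epsilon=1e-8;
    for mp1=[51 101 201 401]                        % m+1 for n = 2500, 10000, 40000, 160000
      h=1/mp1; kappa=(1+cos(pi*h))/(1-cos(pi*h));   % cond_2(A) for the Poisson matrix
      kbound=ceil(0.5*sqrt(kappa)*log(2/epsilon));
      fprintf('n = %6d   iteration bound = %d\n',(mp1-1)^2,kbound);
    end
    % The bounds grow like sqrt(n), as do the observed iteration counts K in the table,
    % though as upper bounds they are larger than the observed K.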
