Process Model Formulation and Solution, 3E4
Section C: Nonlinear Algebraic Equations
Instructor: Kevin Dunn [email protected]
Department of Chemical Engineering
Course notes: © Dr. Benoît Chachuat, 17 October 2010
Sep 24, 2020
Why solve nonlinear algebraic equations?
Consider the modelling of a jacketed CSTR, fed with a single inlet stream.
Assumptions:
- A1: Perfect mixing
- A2: Equal inflow and outflow (volumetric)
- A3: Constant physical properties
- A4: Single second-order reaction
- A5: Neglected shaft work & non-idealities
Modelling goals:
- Predict the concentration of reactant A and the reactor temperature at steady state
$$\frac{F^{\mathrm{in}}}{V}\left[C_A^{\mathrm{in}} - C_A\right] - 2 k_0 C_A^2 \exp\left(-\frac{E_a}{RT}\right) = 0$$
$$\frac{F^{\mathrm{in}}}{V}\left[T^{\mathrm{in}} - T\right] - \frac{U A_s}{\rho c_p V}\left[T - T_j\right] + \frac{k_0(-\Delta H_r)}{\rho c_p} C_A^2 \exp\left(-\frac{E_a}{RT}\right) = 0$$
The signs take care of the energy direction (in or out).
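The two steady-state balances can be coded as a residual function that a root-finder drives to zero. The sketch below is only an illustration: the function name `cstr_residuals`, the dictionary keys, and every numerical value in `params` are hypothetical placeholders, not data from these notes.

```python
import math

def cstr_residuals(c_a, temp, p):
    """Return the left-hand sides of the mass and energy balances.

    Both values are zero at a steady state (c_a, temp).
    """
    # Second-order Arrhenius rate term: k0 * C_A^2 * exp(-Ea / (R T))
    rate = p["k0"] * c_a**2 * math.exp(-p["Ea"] / (p["R"] * temp))
    mass = p["F_in"] / p["V"] * (p["C_A_in"] - c_a) - 2.0 * rate
    energy = (
        p["F_in"] / p["V"] * (p["T_in"] - temp)
        - p["UA_s"] / (p["rho_cp"] * p["V"]) * (temp - p["T_j"])
        + (-p["dH_r"]) / p["rho_cp"] * rate
    )
    return mass, energy

# Hypothetical parameter values, for illustration only (not from the notes).
params = {
    "F_in": 0.1, "V": 1.0, "C_A_in": 1000.0,
    "T_in": 300.0, "T_j": 290.0,
    "k0": 1.0e6, "Ea": 5.0e4, "R": 8.314,
    "UA_s": 5.0e4, "rho_cp": 4.0e6, "dH_r": -5.0e4,
}

# Residuals at an arbitrary guess; a solver would adjust (C_A, T) until both vanish.
print(cstr_residuals(500.0, 320.0, params))
```

Packing the two unknowns and the residuals this way matches the general form f(x) = 0 used in the rest of the section.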
Linear vs. nonlinear algebraic equations
Linear equations:
$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ &\;\;\vdots \\ a_{n1}x_1 + \cdots + a_{nn}x_n &= b_n \end{aligned}$$
- Same number of variables and equations
- Constant $a_{ij}$'s and $b_j$'s
- Elimination or iterative methods
Nonlinear equations:
$$\begin{aligned} f_1(x_1, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ f_n(x_1, \ldots, x_n) &= 0 \end{aligned}$$
- Same number of variables and equations
- General functions $f_1, \ldots, f_n$
- Iterative methods only
Find $(x_1, x_2)$ such that:
$$\begin{cases} \sin(x_1) = x_2 \\ \exp(x_1) + x_2 = 0 \end{cases}$$
Do any solutions exist? If so, which are they?
Linear vs. nonlinear algebraic equations (cont’d)
Case of scalar nonlinear equations:
Given a function $f: \mathbb{R} \to \mathbb{R}$, find $x$ such that $f(x) = 0$
Examples:
- Find $x$ such that: $\exp(x) + \sin(x) - 2 = 0$
- Find $z$ such that: $z \exp(z) = 3$
Why are nonlinear equations more difficult to handle?
- Possibility of multiple isolated solutions
  - E.g., the scalar equation $\sin(x) = 1$
  - All solutions are equally good!
- Possibility of no solutions at all
  - Even for scalar equations!
  - E.g., the scalar equation $\sin(x) = 2$
- Lack of criteria for existence and uniqueness of the solutions
  - Try to find one solution, even though none may exist!
Outline and recommended readings
Motivations
Scalar nonlinear algebraic equations
  - The bisection method
  - The fixed-point iteration method
  - The Newton-Raphson method and its variants
Systems of nonlinear algebraic equations
  - The multivariable Newton-Raphson method
Recommended readings:
- Chapters 5.1-5.2, and 6 in: S. C. Chapra and R. P. Canale, “Numerical Methods for Engineers”, McGraw Hill, 5th/6th Edition
Bracketing methods
Idea:
Use the fact that the function $f$ changes its sign in the vicinity of an (isolated) solution $x^*$ of $f(x) = 0$
Procedure:
- Start with initial points such that: $x_\ell < x^* < x_u$ and $f(x_\ell)\, f(x_u) < 0$
- Assume $f(x)$ is continuous inside the chosen interval
- Systematically reduce the width of the bracket.
The bisection method
How to choose the intermediate point xm?
- Simply consider the mid-point of the current interval!
$$x_m = \frac{x_\ell + x_u}{2}$$
When to stop this iterative refinement procedure?
- A possible stopping criterion is
$$\varepsilon_{\mathrm{rel}}^{(k)} \triangleq \left|\frac{x_m^{(k)} - x_m^{(k-1)}}{x_m^{(k)}}\right| = \left|\frac{x_u^{(k)} - x_\ell^{(k)}}{x_u^{(k)} + x_\ell^{(k)}}\right| \le \varepsilon_{\mathrm{tol}}$$
  with $\varepsilon_{\mathrm{tol}}$ the user-defined tolerance; other criteria?
- Yields an upper bound on the actual relative error, $\left|\frac{x_{\mathrm{true}} - x_m^{(k)}}{x_{\mathrm{true}}}\right|$
- Bracket width after $k$ iterations:
$$x_u^{(k)} - x_\ell^{(k)} = \frac{x_u^{(0)} - x_\ell^{(0)}}{2^k}$$
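The bracket-width formula also tells us in advance how many iterations a given absolute accuracy costs. As a short worked example (the target width $\delta$ is a symbol introduced here for illustration, not one used in the notes):

```latex
\frac{x_u^{(0)} - x_\ell^{(0)}}{2^k} \le \delta
\quad\Longleftrightarrow\quad
k \ge \log_2\!\left(\frac{x_u^{(0)} - x_\ell^{(0)}}{\delta}\right)
% Example: initial bracket [0, 1] and \delta = 10^{-6}
% k \ge \log_2(10^{6}) \approx 19.93, so 20 iterations suffice (2^{20} = 1\,048\,576).
```

Each extra decimal digit of accuracy therefore costs about $\log_2 10 \approx 3.3$ further iterations, whatever the function.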
The bisection method: algorithm
- Step 0: Initialization
  - Choose points $x_\ell^{(0)}$, $x_u^{(0)}$ bracketing $x^*$ and stopping tolerance $\varepsilon_{\mathrm{tol}} > 0$
  - Set counter $k \leftarrow 0$
  - Evaluate $f(x_\ell^{(0)})$ and $f(x_u^{(0)})$; if $f(x_\ell^{(0)})\, f(x_u^{(0)}) \ge 0$, STOP
- Step 1: Calculate mid-point $x_m^{(k)}$
  - Set $x_m^{(k)} = \frac{1}{2}\left[x_u^{(k)} + x_\ell^{(k)}\right]$ and evaluate $f(x_m^{(k)})$
- Step 2a: Case $f(x_\ell^{(k)})\, f(x_m^{(k)}) < 0$ (root to the left of $x_m^{(k)}$)
  - Narrow the search by eliminating the rightmost part:
    $x_\ell^{(k+1)} = x_\ell^{(k)}$, $x_u^{(k+1)} = x_m^{(k)}$
- Step 2b: Case $f(x_m^{(k)})\, f(x_u^{(k)}) < 0$ (root to the right of $x_m^{(k)}$)
  - Narrow the search by eliminating the leftmost part:
    $x_\ell^{(k+1)} = x_m^{(k)}$, $x_u^{(k+1)} = x_u^{(k)}$
- Step 3: Stopping
  - If $\varepsilon_{\mathrm{rel}}^{(k)} = \left|\frac{x_u^{(k)} - x_\ell^{(k)}}{x_u^{(k)} + x_\ell^{(k)}}\right| = \left|\frac{x_m^{(k)} - x_m^{(k-1)}}{x_m^{(k)}}\right| \le \varepsilon_{\mathrm{tol}}$, STOP; report $x_m$ as the approximate root
  - Increment counter $k \leftarrow k + 1$, and return to step 1
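The steps above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function name `bisection`, the `max_iter` safeguard, the early exit on an exact zero, and the guard for a zero denominator in the relative criterion are all assumptions added here.

```python
import math

def bisection(f, x_lo, x_hi, tol=1e-6, max_iter=200):
    """Bisection with the relative bracket-width stopping criterion."""
    f_lo, f_hi = f(x_lo), f(x_hi)
    if f_lo * f_hi >= 0.0:
        raise ValueError("f(x_lo) and f(x_hi) must have opposite signs")
    for k in range(max_iter + 1):
        x_mid = 0.5 * (x_lo + x_hi)          # Step 1: mid-point
        f_mid = f(x_mid)
        denom = abs(x_hi + x_lo)
        # Relative width of the current bracket (guard added for denom == 0)
        eps_rel = abs(x_hi - x_lo) / denom if denom > 0.0 else float("inf")
        if f_mid == 0.0 or eps_rel <= tol:   # Step 3: stopping
            return x_mid, k
        if f_lo * f_mid < 0.0:               # Step 2a: root in left half
            x_hi, f_hi = x_mid, f_mid
        else:                                # Step 2b: root in right half
            x_lo, f_lo = x_mid, f_mid
    raise RuntimeError("bisection did not reach the tolerance")

def f(x):
    return math.exp(x) + x - 2.0

root, iterations = bisection(f, 0.0, 1.0)
print(root, iterations)
```

Only one new function evaluation is needed per iteration, because the end-point values `f_lo` and `f_hi` are carried along with the bracket.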
The bisection method: practical considerations
Use the bisection method to solve the nonlinear equation $f(x) = \exp(x) + x - 2 = 0$, starting from $[x_\ell^{(0)}, x_u^{(0)}] = [0, 1]$

it.  xℓ     xu     xm     f(xℓ)   f(xu)  f(xm)   εrel
0    0.000  1.000  0.500  -1.000  1.718  0.149   -
1    0.000  0.500  0.250  -1.000  0.149  -0.466  1.000
2    0.250  0.500  0.375  -0.466  0.149  -0.170  0.333
3    0.375  0.500  0.437  -0.170  0.149  -0.014  0.143
4    0.437  0.500  0.469  -0.014  0.149  0.067   0.067

Pros:
- The function $f$ must be continuous, but it may be nonsmooth
- Calculations are straightforward (only function evaluations)
- Code is easy to implement
- The method always converges; used as a “starter” for other methods
Cons:
- Assumes an a priori enclosure for the root, $x^* \in [x_\ell^{(0)}, x_u^{(0)}]$
- Bisection search is very slow! Considerable computation may be needed to get an accurate solution
- Many (sometimes expensive) function evaluations are required.
Open methods
Principle:
- Start from an initial guess of the solution, and iteratively refine the solution estimate
- Convergence typically (much) faster than with bracketing methods
- Can be extended to systems of nonlinear equations
- But, divergence possible (depending on the initial guess)
Fixed-point iteration methods
Assumption:
Many nonlinear equations are naturally formulated as fixed-point problems
$$x = g(x)$$
where $g$ may be a nonlinear function
Definition: A fixed point of a function $g$ is any real number, $x$, for which $g(x) = x$; i.e. a number whose location is fixed by the function $g(x)$.
- In general, $f(x) = 0$ can always be rewritten as $g(x) = x \pm f(x)$
- Example: $f(x) = 3x^2 - x - 2 = 0$: written as $g(x) = 3x^2 - 2 = x$
- Example: $f(x) = 3x^2 - 4x - 2 = 0$: written as $g(x) = (3x^2 - 2)/4 = x$
- Example: $f(x) = \exp(x) - 2 = 0$: written as $g(x) = \exp(x) - 2 + x = x$
Fixed-point algorithm:
$$x^{(k+1)} = g(x^{(k)}); \quad x^{(0)} \text{ given}; \quad \text{until } \left|\frac{x^{(k+1)} - x^{(k)}}{x^{(k+1)}}\right| < \varepsilon_{\mathrm{tol}}$$
Solution to $f(x) = 0$ is at $x^{(k+1)}$
- a.k.a. the method of successive substitution
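A minimal Python sketch of successive substitution follows. The function name `fixed_point`, the `max_iter` cap, and the returned convergence flag are assumptions made for this illustration. The two rearrangements used, g(x) = 1 - exp(x)/2 and g(x) = 2 - exp(x), are one possible way of writing exp(x)/2 + x - 1 = 0 and exp(x) + x - 2 = 0 in the form x = g(x).

```python
import math

def fixed_point(g, x0, tol=1e-6, max_iter=100):
    """Iterate x <- g(x) until the relative change drops below tol.

    Returns (x, iterations, converged).
    """
    x = x0
    for k in range(1, max_iter + 1):
        x_new = g(x)
        # Relative change between successive iterates (skip if x_new == 0)
        if x_new != 0.0 and abs((x_new - x) / x_new) < tol:
            return x_new, k, True
        x = x_new
    return x, max_iter, False

def g1(x):
    # Rearrangement of exp(x)/2 + x - 1 = 0
    return 1.0 - math.exp(x) / 2.0

def g2(x):
    # Rearrangement of exp(x) + x - 2 = 0
    return 2.0 - math.exp(x)

x1, k1, ok1 = fixed_point(g1, 0.2)
x2, k2, ok2 = fixed_point(g2, 0.2)
print(x1, k1, ok1)
print(x2, k2, ok2)
```

Returning a flag instead of raising makes it easy to compare a rearrangement that settles down with one that keeps oscillating.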
Fixed-point iteration methods: application
Apply the fixed-point iteration method, starting from $x^{(0)} = 0.2$, to solve:
$$f(x) = \frac{\exp(x)}{2} + x - 1 = 0, \quad \text{and} \quad f(x) = \exp(x) + x - 2 = 0$$

First equation (the tabulated values are consistent with $g(x) = 1 - \exp(x)/2$):

it.  x      g(x)   εrel
0    0.200  0.389  --
1    0.389  0.262  0.486
2    0.262  0.350  0.486
3    0.350  0.290  0.252
4    0.290  0.332  0.206
5    0.332  0.303  0.124
6    0.303  0.323  0.093
7    0.323  0.310  0.060
8    0.310  0.319  0.043
9    0.319  0.312  0.018
10   0.312  0.317  0.020

Second equation (the tabulated values are consistent with $g(x) = 2 - \exp(x)$):

it.  x       g(x)    εrel
0    0.200   0.779   --
1    0.779   -0.178  0.743
2    -0.178  1.163   5.364
3    1.163   -1.121  1.153
4    -1.121  1.699   1.969
5    1.699   -3.469  1.707
6    -3.469  1.969   1.490
7    1.969   -5.162  2.762
8    -5.162  1.994   1.381
9    1.994   -5.347  3.589
10   -5.347  1.995   1.373
Fixed-point iteration methods: analysis
Geometrical interpretation:
Convergence analysis:
- Sufficient condition: $|g'(x)| < 1$, for all $x$
- Necessary condition: $|g'(x^*)| < 1$
- Linear rate of convergence:
$$\lim_{k \to \infty} \frac{|x^{(k+1)} - x^*|}{|x^{(k)} - x^*|} = |g'(x^*)|$$
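As a worked check of these conditions, take the rearrangements $g(x) = 1 - \exp(x)/2$ for $\exp(x)/2 + x - 1 = 0$ and $g(x) = 2 - \exp(x)$ for $\exp(x) + x - 2 = 0$ (the choice of $g$ is an assumption here; other rearrangements are possible). The roots are approximately $x^* \approx 0.315$ and $x^* \approx 0.443$ respectively:

```latex
g(x) = 1 - \tfrac{1}{2}\exp(x): \quad
|g'(x^*)| = \tfrac{1}{2}\exp(0.315) \approx 0.685 < 1
\quad\Rightarrow\quad \text{locally convergent, error shrinks by about } 0.685 \text{ per iteration}

g(x) = 2 - \exp(x): \quad
|g'(x^*)| = \exp(0.443) \approx 1.557 > 1
\quad\Rightarrow\quad \text{the necessary condition fails, the iteration cannot converge to } x^*
```

With a contraction factor of about 0.685, gaining one decimal digit takes roughly $1/\log_{10}(1/0.685) \approx 6$ iterations.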
The Newton-Raphson method
History and facts:
- Discovered independently by Joseph Raphson and Isaac Newton at about the same time
- First published by Raphson in his book Analysis Aequationum Universalis (1690)
- Perhaps the most widely used of all root-locating formulas
Idea:
- Why? Improve the efficiency of nonlinear equation solving
- How? Incorporate information on the function's first derivative
Problem:
Suppose we have reached a point $x^{(k)}$. How can we choose the new point, $x^{(k+1)} = x^{(k)} + \Delta x$, by using information on $f(x^{(k)})$ and $f'(x^{(k)})$ only?
The Newton-Raphson method (cont’d)
- Consider the problem of finding $x$ such that $f(x) = 0$
  - $f: \mathbb{R} \to \mathbb{R}$ is continuous, and so are its 1st and 2nd derivatives $f'$, $f''$
- At a given iterate $x^{(k)}$, by Taylor's theorem:
$$f(x^{(k)} + \Delta x) = f(x^{(k)}) + f'(x^{(k)})\Delta x + O([\Delta x]^2)$$
- For a linear function $f$, we have $O([\Delta x]^2) = 0$, so $f(x^{(k)} + \Delta x) = f(x^{(k)}) + f'(x^{(k)})\Delta x$.
- We would like to find the linear function's zero in 1 step, i.e. $f(x^{(k+1)}) = f(x^{(k)} + \Delta x) = 0$. What step size should we take?
  - $\Delta x = -\dfrac{f(x^{(k)})}{f'(x^{(k)})} \;\Rightarrow\; f(x^{(k+1)}) = f(x^{(k)} + \Delta x) = 0$
- In the nonlinear case, $f(x^{(k)} + \Delta x) \approx f(x^{(k)}) + f'(x^{(k)})\Delta x$ provided $\Delta x$ is small, and the $O([\Delta x]^2)$ terms are small
- Choose $\Delta x = -\dfrac{f(x^{(k)})}{f'(x^{(k)})}$ in the hope that $f(x^{(k)} + \Delta x) \approx 0$ or, at least, $|f(x^{(k)} + \Delta x)| < |f(x^{(k)})|$
The Newton-Raphson method (cont’d)
Graphical depiction of the Newton-Raphson method:
Newton-Raphson formula:
$$x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{f'(x^{(k)})}$$
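To make the formula concrete, here are the first two steps worked by hand for $f(x) = \exp(x) + x - 2$, so that $f'(x) = \exp(x) + 1$, starting from $x^{(0)} = 0$ (values rounded):

```latex
x^{(0)} = 0: \quad f = -1, \quad f' = 2, \quad
\Delta x = -\frac{-1}{2} = 0.5, \quad x^{(1)} = 0.5

x^{(1)} = 0.5: \quad f = e^{0.5} + 0.5 - 2 \approx 0.1487, \quad f' \approx 2.6487, \quad
\Delta x \approx -0.0561, \quad x^{(2)} \approx 0.4439
```

After only two steps the iterate already agrees with the root $x^* \approx 0.4429$ to about three decimal places.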
The Newton-Raphson method: algorithm
- Step 0: Initialization
  - Choose initial point $x^{(0)}$, stopping tolerance $\varepsilon_{\mathrm{tol}} > 0$, and maximum number of iterations $N$
  - Set counter $k \leftarrow 0$
- Step 1: Stopping
  - If $|f(x^{(k)})| \le \varepsilon_{\mathrm{tol}}$ and/or $\left|\frac{x^{(k)} - x^{(k-1)}}{x^{(k)}}\right| \le \varepsilon_{\mathrm{tol}}$ or $k > N$, STOP; report $x^{(k)}$ as an approximate solution
- Step 2: Function and derivative evaluation
  - Compute function value $f(x^{(k)})$ and first-order derivative $f'(x^{(k)})$
- Step 3: Newton-Raphson step
  - Compute the Newton-Raphson step: $\Delta x = -\dfrac{f(x^{(k)})}{f'(x^{(k)})}$
  - Update iterate: $x^{(k+1)} = x^{(k)} + \Delta x$
  - Increment counter $k \leftarrow k + 1$, and return to step 1
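One way to code this algorithm is sketched below. For brevity the sketch stops on the function-value criterion only (the relative-step test is omitted), and the names `newton_raphson`, `dfdx` and `max_iter` are choices made for this illustration.

```python
import math

def newton_raphson(f, dfdx, x0, tol=1e-8, max_iter=50):
    """Newton-Raphson iteration; stops when |f(x)| <= tol.

    Returns (x, number_of_steps_taken).
    """
    x = x0
    for k in range(max_iter + 1):
        fx = f(x)
        if abs(fx) <= tol:              # stopping test on the residual
            return x, k
        slope = dfdx(x)
        if slope == 0.0:
            raise ZeroDivisionError("zero derivative: Newton step undefined")
        x = x - fx / slope              # x_new = x + dx with dx = -f/f'
    raise RuntimeError("no convergence within max_iter iterations")

def f(x):
    return math.exp(x) + x - 2.0

def dfdx(x):
    return math.exp(x) + 1.0

for start in (0.0, 5.0, 10.0):
    print(start, newton_raphson(f, dfdx, start))
```

Starting further from the root mainly adds early iterations in which the step is close to -1; once the iterate is near the root, only a few more steps are needed.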
The Newton-Raphson method: application
Apply the Newton-Raphson method to solve the nonlinear equation $f(x) = \exp(x) + x - 2 = 0$, starting from $x^{(0)} = 0$, $x^{(0)} = 5$ and $x^{(0)} = 10$
it. x f (x) ∆x
0 0.000 -1.00e+00 5.00e-01
1 0.500 1.49e-01 -5.61e-02
2 0.444 2.55e-03 -9.97e-04
3 0.443 7.74e-07 -3.03e-07
it. x f (x) ∆x
0 5.000 1.51e+02 -1.01e+00
1 3.987 5.59e+01 -1.02e+00
2 2.969 2.04e+01 -9.98e-01
3 1.970 7.14e+00 -8.74e-01
4 1.096 2.09e+00 -5.23e-01
5 0.573 3.47e-01 -1.25e-01
6 0.448 1.33e-02 -5.18e-03
7 0.443 2.10e-05 -8.20e-06
it. x f (x) ∆x
0 10.00 2.20e+04 -1.00e+00
1 9.000 8.11e+03 -1.00e+00
2 7.999 2.98e+03 -1.00e+00
3 6.997 1.10e+03 -1.00e+00
4 5.994 4.05e+02 -1.01e+00
5 4.986 1.49e+02 -1.01e+00
6 3.973 5.51e+01 -1.02e+00
7 2.955 2.02e+01 -9.98e-01
8 1.957 7.03e+00 -8.71e-01
9 1.086 2.05e+00 -5.17e-01
10 0.569 3.36e-01 -1.21e-01
11 0.448 1.25e-02 -4.87e-03
12 0.443 1.85e-05 -7.25e-06
The Newton-Raphson method: analysis
Local convergence analysis:
- In the neighbourhood of a root $x^*$ of the equation $f(x) = 0$,
$$\delta x^{(k+1)} = \frac{1}{2}\frac{f''(x^*)}{f'(x^*)}[\delta x^{(k)}]^2 + O([\delta x^{(k)}]^3), \quad \text{with: } \delta x^{(k)} \triangleq x^{(k)} - x^*$$
- The convergence of the Newton-Raphson method is guaranteed when initialized “sufficiently close” to the root $x^*$
- The rate of convergence is quadratic (if $f'(x^*) \ne 0$):
$$\delta x^{(k)} \propto 10^{-1} \;\to\; \delta x^{(k+1)} \propto 10^{-2} \;\to\; \delta x^{(k+2)} \propto 10^{-4} \;\to\; \cdots$$
- But, divergence may occur when initialized “too far” from the root $x^*$
  - The Newton-Raphson method diverges for the equation $\frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)} = 0$ when starting from roughly $x^{(0)} < -1$ or $x^{(0)} > 1$ (the exact threshold is $|x^{(0)}| \approx 1.09$)
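The quadratic error relation can be checked by hand on $f(x) = \exp(x) + x - 2$, whose root is $x^* \approx 0.44285$. The numbers below are rounded and are meant as a rough check, not an exact result:

```latex
\frac{1}{2}\frac{f''(x^*)}{f'(x^*)}
= \frac{\exp(x^*)}{2\,[\exp(x^*) + 1]}
\approx \frac{1.5571}{2 \times 2.5571} \approx 0.304

% Starting from x^{(0)} = 0, the first iterate is x^{(1)} = 0.5:
\delta x^{(1)} \approx 0.5 - 0.44285 = 5.71 \times 10^{-2}
\quad\Rightarrow\quad
\delta x^{(2)} \approx 0.304 \times (5.71 \times 10^{-2})^2 \approx 9.9 \times 10^{-4}
\quad\Rightarrow\quad
\delta x^{(3)} \approx 0.304 \times (9.9 \times 10^{-4})^2 \approx 3.0 \times 10^{-7}
```

The number of correct digits roughly doubles at every iteration once the iterate is close to the root.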
The Newton-Raphson method: globalization
Problem: How can we increase the likelihood that the Newton-Raphson method converges to a root $x^*$?
- Principle: Make the Newton-Raphson step less aggressive:
$$x^{(k+1)} = x^{(k)} - \alpha \frac{f(x^{(k)})}{f'(x^{(k)})}$$
  with relaxation coefficient $0 < \alpha \le 1$
- In practice, adaptive strategies can be devised for selecting good values of the $\alpha$'s:
  1. Set $\alpha = 1$ (full Newton-Raphson step)
  2. If improvement $|f(x^{(k)} + \alpha \Delta x)| < |f(x^{(k)})|$, accept the step
  3. Else, reduce $\alpha \leftarrow \frac{1}{2}\alpha$; return to 2
  ⇒ To be inserted in Step 3 of the Newton-Raphson algorithm (before the update)
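The step-halving strategy can be sketched as follows, applied to tanh(x) = 0 (the hyperbolic-tangent equation for which the plain method diverges from a distant start). The name `damped_newton` and the `min_alpha` safeguard against endless halving are assumptions of this sketch.

```python
import math

def damped_newton(f, dfdx, x0, tol=1e-10, max_iter=50, min_alpha=1e-8):
    """Newton-Raphson with step halving until |f| decreases."""
    x = x0
    for k in range(max_iter + 1):
        fx = f(x)
        if abs(fx) <= tol:
            return x, k
        dx = -fx / dfdx(x)               # full Newton-Raphson step
        alpha = 1.0                      # 1. start with the full step
        # 2./3. halve alpha until the residual magnitude improves
        while abs(f(x + alpha * dx)) >= abs(fx):
            alpha *= 0.5
            if alpha < min_alpha:
                raise RuntimeError("step halving failed to reduce |f|")
        x = x + alpha * dx
    raise RuntimeError("no convergence within max_iter iterations")

def f(x):
    return math.tanh(x)

def dfdx(x):
    return 1.0 - math.tanh(x) ** 2

# The full Newton step from x = 2 overshoots far past the root at 0 ...
full_step = 2.0 - f(2.0) / dfdx(2.0)
# ... whereas the damped iteration is pulled back towards it.
root, iterations = damped_newton(f, dfdx, 2.0)
print(full_step, root, iterations)
```

The extra function evaluations inside the `while` loop are the price paid for robustness; close to the root the full step is accepted immediately and the quadratic rate is retained.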
The Secant method: a Newton-Raphson variant
Potential problem:
- The derivatives of certain functions may be extremely difficult or inconvenient to evaluate
- An algebraic expression for $f$ may not be available, e.g. a function evaluated via a computer program!
Idea: Approximate $f'$ by a backward finite divided difference,
$$f'(x^{(k)}) \approx \frac{f(x^{(k)}) - f(x^{(k-1)})}{x^{(k)} - x^{(k-1)}}$$
The Secant formula (with relaxation coefficient $\alpha$):
$$x^{(k+1)} = x^{(k)} - \alpha \frac{f(x^{(k)})}{f(x^{(k)}) - f(x^{(k-1)})}\left[x^{(k)} - x^{(k-1)}\right]$$
- A new iterate $x^{(k+1)}$ depends on the two previous iterates $x^{(k)}$, $x^{(k-1)}$
- Two initial estimates are now needed, $x^{(0)}$ and $x^{(-1)}$
- Only 1 function evaluation per iteration!
The Secant method (cont’d)
Graphical depiction of the Secant method:
- Pro: Increased stability compared to Newton-Raphson
- Con: Slower (super-linear) convergence rate, $\delta x^{(k+1)} \propto [\delta x^{(k)}]^{1.6}$
The Secant method: algorithm
- Step 0: Initialization
  - Choose initial points $x^{(0)}$ and $x^{(-1)}$, relaxation coefficient $0 < \alpha \le 1$, stopping tolerance $\varepsilon_{\mathrm{tol}} > 0$, and maximum number of iterations $N$
  - Set counter $k \leftarrow 0$
- Step 1: Function evaluation
  - Compute function value $f(x^{(k)})$
- Step 2: Stopping
  - If $|f(x^{(k)})| \le \varepsilon_{\mathrm{tol}}$ AND/OR $\left|\frac{x^{(k)} - x^{(k-1)}}{x^{(k)}}\right| \le \varepsilon_{\mathrm{tol}}$ OR $k > N$, STOP; report $x^{(k)}$ as an approximate solution
- Step 3: Secant step
  - Compute Secant step: $\Delta x = -\dfrac{f(x^{(k)})}{f(x^{(k)}) - f(x^{(k-1)})}\left[x^{(k)} - x^{(k-1)}\right]$
  - Update iterate: $x^{(k+1)} = x^{(k)} + \alpha\, \Delta x$
  - Increment counter $k \leftarrow k + 1$, and RETURN TO STEP 1
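The Secant algorithm can be sketched in Python as below. As with the earlier sketches, only the function-value stopping test is implemented, and the names `secant`, `x_prev`, `x_curr` and `max_iter` are illustrative choices rather than anything prescribed by the notes.

```python
import math

def secant(f, x_prev, x_curr, alpha=1.0, tol=1e-8, max_iter=100):
    """Secant iteration with relaxation coefficient alpha.

    x_prev plays the role of x^(-1) and x_curr that of x^(0).
    Returns (x, number_of_steps_taken).
    """
    f_prev = f(x_prev)
    for k in range(max_iter + 1):
        f_curr = f(x_curr)              # the only new evaluation per iteration
        if abs(f_curr) <= tol:
            return x_curr, k
        denom = f_curr - f_prev
        if denom == 0.0:
            raise ZeroDivisionError("flat secant line: step undefined")
        dx = -f_curr * (x_curr - x_prev) / denom
        # Shift the pair of iterates, then take the (relaxed) step
        x_prev, f_prev = x_curr, f_curr
        x_curr = x_curr + alpha * dx
    raise RuntimeError("no convergence within max_iter iterations")

def f(x):
    return math.exp(x) + x - 2.0

root, iterations = secant(f, 10.0, 5.0)
print(root, iterations)
```

Storing `f_prev` alongside `x_prev` is what keeps the cost at one function evaluation per iteration.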
The Secant method: application
Apply the secant method to solve the nonlinear equation $f(x) = \exp(x) + x - 2 = 0$, starting from $x^{(0)} = 5$ and $x^{(-1)} = 10$
it. x f (x) ∆x
-1 10.00 2.20e+04 -
0 5.000 1.51e+02 -3.46e-02
1 4.965 1.46e+02 -9.96e-01
2 3.969 5.49e+01 -5.98e-01
3 3.371 3.05e+01 -7.46e-01
4 2.625 1.44e+01 -6.71e-01
5 1.954 7.01e+00 -6.34e-01
6 1.320 3.06e+00 -4.92e-01
7 0.828 1.11e+00 -2.82e-01
8 0.546 2.71e-01 -9.07e-02
9 0.455 3.12e-02 -1.18e-02
10 0.443 9.76e-04 -3.80e-04
11 0.443 3.61e-06 -1.41e-06
12 0.443 4.20e-10 -1.64e-10
The multivariable Newton-Raphson method
A system of nonlinear equations:
$$\begin{aligned} f_1(x_1, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ f_n(x_1, \ldots, x_n) &= 0 \end{aligned}$$
The good news!
No conceptual difference:
1. Consider an affine approximation of the nonlinear function at a given iterate $\mathbf{x}^{(k)}$
2. Solve this approximation to get a new iterate $\mathbf{x}^{(k+1)}$
Illustration:
The multivariable Newton-Raphson method (cont’d)
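- Consider the first equation, $f_1(x_1, \ldots, x_n) = 0$
  - Assume $f_1$ and its 1st and 2nd partial derivatives are continuous w.r.t. $x_1, \ldots, x_n$
- At a given iterate $\mathbf{x}^{(k)}$, by Taylor's theorem:
$$\begin{aligned} f_1(x_1^{(k)} + \Delta x_1, x_2^{(k)}, \ldots, x_n^{(k)}) &= f_1(x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}) + \left.\frac{\partial f_1}{\partial x_1}\right|_{\mathbf{x}^{(k)}} \Delta x_1 + O([\Delta x_1]^2) \\ &\approx f_1(x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}) + \left.\frac{\partial f_1}{\partial x_1}\right|_{\mathbf{x}^{(k)}} \Delta x_1 \end{aligned}$$
- By perturbing all the variables simultaneously:
$$\begin{aligned} f_1(x_1^{(k)} + \Delta x_1, x_2^{(k)} + \Delta x_2, \ldots, x_n^{(k)} + \Delta x_n) \approx{} & f_1(x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}) \\ & + \left.\frac{\partial f_1}{\partial x_1}\right|_{\mathbf{x}^{(k)}} \Delta x_1 + \left.\frac{\partial f_1}{\partial x_2}\right|_{\mathbf{x}^{(k)}} \Delta x_2 + \cdots + \left.\frac{\partial f_1}{\partial x_n}\right|_{\mathbf{x}^{(k)}} \Delta x_n \end{aligned}$$
Idea: Choose the steps $\Delta x_1, \ldots, \Delta x_n$ such that
$$f_1(x_1^{(k)}, \ldots, x_n^{(k)}) + \left.\frac{\partial f_1}{\partial x_1}\right|_{\mathbf{x}^{(k)}} \Delta x_1 + \cdots + \left.\frac{\partial f_1}{\partial x_n}\right|_{\mathbf{x}^{(k)}} \Delta x_n = 0$$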
The multivariable Newton-Raphson method (cont’d)
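- Apply the same procedure to all nonlinear equations:
$$\begin{aligned} f_1(x_1^{(k)}, \ldots, x_n^{(k)}) + \left.\frac{\partial f_1}{\partial x_1}\right|_{\mathbf{x}^{(k)}} \Delta x_1 + \cdots + \left.\frac{\partial f_1}{\partial x_n}\right|_{\mathbf{x}^{(k)}} \Delta x_n &= 0 \\ &\;\;\vdots \\ f_n(x_1^{(k)}, \ldots, x_n^{(k)}) + \left.\frac{\partial f_n}{\partial x_1}\right|_{\mathbf{x}^{(k)}} \Delta x_1 + \cdots + \left.\frac{\partial f_n}{\partial x_n}\right|_{\mathbf{x}^{(k)}} \Delta x_n &= 0 \end{aligned}$$
- This can be rewritten in matrix form:
$$\underbrace{\begin{pmatrix} \left.\frac{\partial f_1}{\partial x_1}\right|_{\mathbf{x}^{(k)}} & \cdots & \left.\frac{\partial f_1}{\partial x_n}\right|_{\mathbf{x}^{(k)}} \\ \vdots & \ddots & \vdots \\ \left.\frac{\partial f_n}{\partial x_1}\right|_{\mathbf{x}^{(k)}} & \cdots & \left.\frac{\partial f_n}{\partial x_n}\right|_{\mathbf{x}^{(k)}} \end{pmatrix}}_{\mathbf{J}(\mathbf{x}^{(k)})} \underbrace{\begin{pmatrix} \Delta x_1 \\ \vdots \\ \Delta x_n \end{pmatrix}}_{\Delta\mathbf{x}} = - \underbrace{\begin{pmatrix} f_1(x_1^{(k)}, \ldots, x_n^{(k)}) \\ \vdots \\ f_n(x_1^{(k)}, \ldots, x_n^{(k)}) \end{pmatrix}}_{\mathbf{f}(\mathbf{x}^{(k)})}$$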
The multivariable Newton-Raphson method (cont’d)
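- The Newton step $\Delta\mathbf{x}$ is obtained as the solution to the linear algebraic equations
$$\mathbf{J}(\mathbf{x}^{(k)})\, \Delta\mathbf{x} = -\mathbf{f}(\mathbf{x}^{(k)})$$
- Requires the Jacobian matrix $\mathbf{J}(\mathbf{x}^{(k)})$ to be nonsingular!
Newton-Raphson formula for multivariable nonlinear equations:
$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \left[\mathbf{J}(\mathbf{x}^{(k)})\right]^{-1} \mathbf{f}(\mathbf{x}^{(k)})$$
When/how to stop this iterative procedure?
- Change in successive iterates is small enough, $\dfrac{\|\mathbf{x}^{(k)} - \mathbf{x}^{(k-1)}\|}{\|\mathbf{x}^{(k)}\|} < \varepsilon_{\mathrm{tol}}$
  - 2-Norm: $\sqrt{\dfrac{\sum_{i=1}^{n}[x_i^{(k)} - x_i^{(k-1)}]^2}{\sum_{i=1}^{n}[x_i^{(k)}]^2}} < \varepsilon_{\mathrm{tol}}$
  - ∞-Norm: $\dfrac{\max_{1 \le i \le n} |x_i^{(k)} - x_i^{(k-1)}|}{\max_{1 \le i \le n} |x_i^{(k)}|} < \varepsilon_{\mathrm{tol}}$
- Function values are close enough to zero, $\|\mathbf{f}(\mathbf{x}^{(k)})\| < \varepsilon_{\mathrm{tol}}$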
The multivariable Newton-Raphson method: algorithm
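- Step 0: Initialization
  - Choose initial point $\mathbf{x}^{(0)}$, stopping tolerance $\varepsilon_{\mathrm{tol}} > 0$, and maximum number of iterations $N$
  - Set counter $k \leftarrow 0$
- Step 1: Function and Jacobian matrix evaluation
  - Compute function values $\mathbf{f}(\mathbf{x}^{(k)})$ and Jacobian matrix $\mathbf{J}(\mathbf{x}^{(k)})$
- Step 2: Stopping
  - If $\|\mathbf{f}(\mathbf{x}^{(k)})\| \le \varepsilon_{\mathrm{tol}}$ AND/OR $\dfrac{\|\mathbf{x}^{(k)} - \mathbf{x}^{(k-1)}\|}{\|\mathbf{x}^{(k)}\|} \le \varepsilon_{\mathrm{tol}}$ OR $k > N$, STOP; report $\mathbf{x}^{(k)}$ as an approximate solution
- Step 3: Newton-Raphson step
  - Perform LU decomposition: $\mathbf{J}(\mathbf{x}^{(k)}) = \mathbf{L}\mathbf{U}$
  - Compute Newton-Raphson step: $\mathbf{L}\,\mathbf{y} = -\mathbf{f}(\mathbf{x}^{(k)})$, $\mathbf{U}\,\Delta\mathbf{x} = \mathbf{y}$
  - Update iterate: $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \Delta\mathbf{x}$
  - Increment counter $k \leftarrow k + 1$, and RETURN TO STEP 1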
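A dependency-free Python sketch of the multivariable algorithm is given below. Two simplifications are worth flagging: the linear system is solved by Gaussian elimination with partial pivoting rather than an explicit LU factorization (equivalent for a single right-hand side), and only the infinity-norm residual test is used for stopping. The helper names `solve_linear` and `newton_system` are illustrative. The test problem is the system exp(-x1) + x2 = 0, x1 + x2^2 + 3 x2 = 0, started from (0, 0).

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular Jacobian matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def newton_system(f, jac, x0, tol=1e-10, max_iter=50):
    """Multivariable Newton-Raphson; stops when max|f_i(x)| <= tol."""
    x = list(x0)
    for k in range(max_iter + 1):
        fx = f(x)
        if max(abs(v) for v in fx) <= tol:
            return x, k
        dx = solve_linear(jac(x), [-v for v in fx])    # J dx = -f
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("no convergence within max_iter iterations")

def f(x):
    return [math.exp(-x[0]) + x[1], x[0] + x[1] ** 2 + 3.0 * x[1]]

def jac(x):
    # Analytical Jacobian of f, row i = gradient of f_i
    return [[-math.exp(-x[0]), 1.0], [1.0, 2.0 * x[1] + 3.0]]

root, iterations = newton_system(f, jac, [0.0, 0.0])
print(root, iterations)
```

For larger systems one would normally rely on a tested linear-algebra library instead of a hand-written elimination; the version here just keeps the sketch self-contained.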
The multivariable Newton-Raphson method: application
Apply the Newton-Raphson method to solve the following system:
$$\begin{cases} f_1(x_1, x_2) = \exp(-x_1) + x_2 = 0 \\ f_2(x_1, x_2) = x_1 + (x_2)^2 + 3x_2 = 0 \end{cases} \quad \text{starting from:} \quad \begin{cases} x_1^{(0)} = 0 \\ x_2^{(0)} = 0 \end{cases}$$

it.  x1          x2           f1(x)       f2(x)       ‖Δx‖∞/‖x‖∞
0    0.0000e+00  0.0000e+00   1.0000e+00  0.0000e+00  -
1    7.5000e-01  -2.5000e-01  2.2237e-01  6.2500e-02  1.0000e+00
2    9.7624e-01  -3.6550e-01  1.1227e-02  1.3340e-02  2.3175e-01
3    9.8278e-01  -3.7426e-01  8.0441e-06  7.6778e-05  8.9158e-03
4    9.8275e-01  -3.7428e-01  1.8969e-10  3.9840e-10  3.2397e-05
5    9.8275e-01  -3.7428e-01  0.0000e+00  0.0000e+00  1.8709e-10

- Notice the very fast convergence close to the solution!
The multivariable Newton-Raphson method: final words
Convergence:
- Guaranteed when initialized “close enough” to an actual solution
- Quadratic rate of convergence (close to a solution)
- But, divergence is possible
Globalization strategies:
- A relaxation coefficient $0 < \alpha \le 1$ can be used:
$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha\, \Delta\mathbf{x}$$
- Systematic ways of choosing good values of $\alpha$ can be implemented
  - Line-search methods
- Approximate schemes can be used to avoid computing the Jacobian matrix $\mathbf{J}(\mathbf{x}^{(k)})$ (or its inverse) at each iteration
  - Broyden methods (e.g. BFGS and DFP schemes)
  - Inexact Newton methods