
To appear in SIAM J. Scientific Computing.

A Quasi-Minimal Residual Variant of the Bi-CGStab Algorithm for Nonsymmetric Systems

T. F. Chan†, E. Gallopoulos‡, V. Simoncini‡, T. Szeto† and C. H. Tong§
February 1993 (revised)

CSRD Report No. 1231

† Dept. of Mathematics, University of California - Los Angeles, Los Angeles, CA 90024
‡ Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, 1308 West Main Street, Urbana, Illinois 61801
§ Center Comp. Eng., Sandia National Laboratory, Livermore, CA 94551


A QUASI-MINIMAL RESIDUAL VARIANT OF THE BI-CGSTAB ALGORITHM FOR NONSYMMETRIC SYSTEMS

T. F. CHAN*, E. GALLOPOULOS†, V. SIMONCINI‡, T. SZETO§, AND C. H. TONG¶

Abstract. Motivated by a recent method of Freund [3], who introduced a quasi-minimal residual (QMR) version of the CGS algorithm, we propose a QMR variant of the Bi-CGSTAB algorithm of van der Vorst, which we call QMRCGSTAB, for solving nonsymmetric linear systems. The motivation for both QMR variants is to obtain smoother convergence behavior of the underlying method. We illustrate this by numerical experiments, which also show that for problems on which Bi-CGSTAB performs better than CGS, the same advantage carries over to QMRCGSTAB.

Key words. conjugate gradients, Lanczos algorithm, quasi-minimum residual, nonsymmetric linear systems, iterative methods, BCG, CGS, QMR, Bi-CGSTAB, QMRCGSTAB

AMS(MOS) subject classifications. 65F10, 65Y20

1. Introduction. In this note we propose a variation of the Bi-CGSTAB algorithm of van der Vorst [18] for solving the linear system

$$ A x = b, \tag{1} $$

where $A$ is a nonsymmetric sparse matrix of order $n$. Various attempts have been made in the last forty years to extend the highly successful

conjugate gradient (CG) algorithm to the nonsymmetric case [4]. One such natural extension is what is currently called the Bi-Conjugate Gradient algorithm (BCG) [9], [1]. Although BCG

is still quite competitive today, it also has several well-known drawbacks. Among these are (1) the need for matrix-vector multiplications with $A^T$ (which can be inconvenient as well as doubling the number of matrix-vector multiplications compared to CG for each increase in the degree of the underlying Krylov subspace), (2) the possibility of breakdowns, and (3) erratic convergence behavior.

* Dept. of Math., Univ. of Calif. Los Angeles, Los Angeles, CA 90024. E-mail: [email protected]. Supported by the Dept. of Energy grant DE-FG03-87ER25037, the Office of Naval Research grant N00014-90-J-1695, the National Science Foundation grants ASC90-03002 and ASC92-01266 and the Army Research Office grant DAAL03-91-G-150.
† Department of Computer Science and Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Urbana, IL 61801. E-mail: [email protected]. This research was supported by the Department of Energy grant DOE DE-FG02-85ER25001 and the National Science Foundation under grants NSF CCR-9120105 and CCR-9024554. Additional support was provided by the State of Illinois Department of Commerce and Community Affairs, State Technology Challenge Fund under grant No. SCCA 91-108.
‡ Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Urbana, Illinois 61801. E-mail: [email protected]. Supported by fellowship #2043597 from the Consiglio Nazionale delle Ricerche, Italy.
§ Dept. of Math., Univ. of Calif. Los Angeles, Los Angeles, CA 90024. E-mail: [email protected]. Supported by the same grants as the first author.
¶ Ct. Cmp. Eng., Sandia Nat. Lab., Livermore, CA 94551. E-mail: [email protected].


Many recently proposed methods can be viewed as improvements over some of these drawbacks of BCG. The most notable of these is the ingenious conjugate gradients-squared method (CGS) proposed by Sonneveld [14], which cures the first drawback mentioned above by computing the square of the BCG polynomial without requiring $A^T$. Hence when BCG

converges, CGS is an attractive, faster converging alternative. However, this relation between the residual polynomials also causes CGS to behave even more erratically than BCG, particularly in near-breakdown situations for BCG [8], [18]. These observations led van der Vorst [18] to introduce Bi-CGSTAB, a more smoothly converging variant of CGS. The main idea is to form a product of the BCG polynomial with another, locally defined polynomial. The Bi-CGSTAB

method was further refined by Gutknecht [7] to handle complex matrices and also to lead to better convergence for the case of complex eigenvalues. Nevertheless, although the Bi-CGSTAB

algorithms were found to perform very well compared to CGS in many situations, there are cases where convergence is still quite erratic (see, for example, Section 4 and [12]).

In a recent paper [3], Freund proposed a new version of CGS, called TFQMR, which “quasi-minimizes” ([6]) the residual in the space spanned by the vectors generated by the CGS

iteration. Numerical experiments show that TFQMR in most cases retains the good convergence features of CGS while correcting its erratic behavior. The transpose-free nature of TFQMR, its low computational cost and its smooth convergence behavior make it an attractive alternative to CGS. On the other hand, since the square of the residual polynomial for BCG is still in the space being quasi-minimized, in many practical examples CGS and TFQMR converge in about the same number of steps. We note however that, in contrast to CGS, the asymptotic behavior of TFQMR has been analyzed [2]. It is also well known that the CGS residual polynomial can be quite polluted by round-off error [16]. One possible remedy would be to combine TFQMR with a look-ahead Lanczos technique as was done for the original QMR method [5]. In this paper, we take an alternative approach by deriving quasi-minimum residual extensions to Bi-CGSTAB. We call the basic method QMRCGSTAB and illustrate its smoothed convergence by means of numerical experiments.

It may appear redundant to combine the local minimization in Bi-CGSTAB with a global quasi-minimization. However, our view is that the local minimization is secondary in nature and is only used as a way of generating residual polynomials in the appropriate Krylov subspace over which the residual is being quasi-minimized. In fact, this view allows us some flexibility in modifying the local minimization step in Bi-CGSTAB, which leads to other quasi-minimal residual variants. Although we extensively use notation introduced in [18] for algorithm Bi-CGSTAB, for the sake of brevity we refer to that paper for a description of the method.

2. The QMRCGSTAB Algorithm. The algorithm proposed in this paper is inspired by TFQMR in that it applies the quasi-minimization principle to the Bi-CGSTAB method, in the same way that TFQMR is derived from CGS. During each step of Bi-CGSTAB, the following vector relations hold:

$$ s_i = r_{i-1} - \alpha_i A p_i, \qquad r_i = s_i - \omega_i A s_i, \tag{2} $$

where $\alpha_i$ is the same as the analogous coefficient in BCG and $\omega_i$ is chosen by a local steepest descent principle. Note that $x_i$ is completely determined by $\alpha_i$ and $\omega_i$. Instead, our algorithm


uses Bi-CGSTAB to generate the vectors $p_i$ and $s_i$, but chooses $x_i$ by quasi-minimizing the residual over their span. Let $Y_k = \{y_1, y_2, \ldots, y_k\}$, where $y_{2l-1} = p_l$ for $l = 1, \ldots, [(k+1)/2]$ and $y_{2l} = s_l$ for $l = 1, \ldots, [k/2]$ ($[k/2]$ is the integer part of $k/2$). In the same way, let $W_{k+1} = \{w_0, w_1, \ldots, w_k\}$ with $w_{2l} = r_l$ for $l = 0, \ldots, [k/2]$ and $w_{2l-1} = s_l$ for $l = 1, \ldots, [(k+1)/2]$. We also define $\{\delta_1, \delta_2, \ldots, \delta_k\}$ as $\delta_{2l} = \omega_l$ for $l = 1, \ldots, [(k+1)/2]$ and $\delta_{2l-1} = \alpha_l$ for $l = 1, \ldots, [(k+1)/2]$. In this case, for each column of $W_{k+1}$ and $Y_k$, Eq. (2) may be written as

$$ A y_j = (w_{j-1} - w_j)\,\delta_j^{-1}, \qquad j = 1, \ldots, k, \tag{3} $$

or, using matrix notation, $A Y_k = W_{k+1} E_{k+1}$, where $E_{k+1}$ is a $(k+1) \times k$ bidiagonal matrix with diagonal elements $\delta_j^{-1}$ and lower diagonal elements $-\delta_j^{-1}$.

It can easily be checked that the degrees of the polynomials corresponding to the vectors $r_i$, $s_i$ and $p_i$ are $2i$, $2i-1$ and $2i-2$, respectively. Therefore, $\mathrm{span}(Y_k) = \mathrm{span}(W_k) = K_{k-1}$, where $K_k$ is the Krylov subspace of degree $k$ generated by $r_0$. The main idea in QMRCGSTAB

is to look for an approximation to the solution of Eq. (1), using the Krylov subspace $K_{k-1}$, in the form

$$ x_k = x_0 + Y_k g_k, \qquad g_k \in \mathbb{C}^k. $$

Hence, we may write the residual $r_k = b - A x_k$ as

$$ r_k = r_0 - A Y_k g_k = r_0 - W_{k+1} E_{k+1} g_k. $$

Using the fact that the first vector of $W_{k+1}$ is indeed $r_0$, it follows that

$$ r_k = W_{k+1}(e_1 - E_{k+1} g_k), $$

where $e_1$ is the first vector of the canonical basis. Since the columns of $W_{k+1}$ are not normalized, it was suggested in [3] to use a $(k+1) \times (k+1)$ scaling matrix $\Sigma_{k+1} = \mathrm{diag}(\sigma_1, \ldots, \sigma_{k+1})$, with $\sigma_i = \|w_i\|$, in order to make the columns of $W_{k+1}$ of unit norm. Then

$$ r_k = W_{k+1}\Sigma_{k+1}^{-1}\Sigma_{k+1}(e_1 - E_{k+1} g_k) = W_{k+1}\Sigma_{k+1}^{-1}(\sigma_1 e_1 - H_{k+1} g_k) \tag{4} $$

with $H_{k+1} = \Sigma_{k+1} E_{k+1}$. The quasi-minimal residual approach consists of the minimization of $\|\sigma_1 e_1 - H_{k+1} g\|$ for some $g \in \mathbb{R}^k$. In Section 3 we will introduce a variant of QMRCGSTAB which generates $W_{k+1}$ with pairwise orthogonal columns.

The least squares minimization of $\|\sigma_1 e_1 - H_{k+1} g\|$ is solved using the QR decomposition of $H_{k+1}$. This is done in an incremental manner by means of Givens rotations. Since $H_{k+1}$ is lower bidiagonal, only the rotation of the previous step is needed. We refer to [3] for a detailed description of the QR decomposition procedure.
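To make the incremental QR step concrete, the following is a small NumPy sketch (ours, not code from the paper) of how $\min_g \|\sigma_1 e_1 - H_{k+1} g\|$ can be solved when $H_{k+1}$ is lower bidiagonal, carrying a single Givens rotation from one column to the next; all function and variable names are illustrative.

```python
import numpy as np

def solve_lower_bidiagonal_ls(diag, sub, sigma1):
    """Solve min_g || sigma1*e1 - H g || for a (k+1) x k lower bidiagonal H
    with H[j, j] = diag[j] and H[j+1, j] = sub[j], using one Givens rotation
    per column.  Illustrative sketch only."""
    k = len(diag)
    r_main = np.zeros(k)        # R[j, j] of the QR factor
    r_super = np.zeros(k)       # R[j-1, j], fill-in created by the previous rotation
    rhs = np.zeros(k + 1)
    rhs[0] = sigma1             # right-hand side, rotated along with H
    c_prev, s_prev = 1.0, 0.0
    for j in range(k):
        # Only the rotation from step j-1 touches the new column (rows j-1 and j).
        if j == 0:
            bot = diag[0]
        else:
            r_super[j] = s_prev * diag[j]
            bot = c_prev * diag[j]
        # New rotation zeroing the subdiagonal entry H[j+1, j].
        rho = np.hypot(bot, sub[j])
        c, s = bot / rho, sub[j] / rho
        r_main[j] = rho
        rhs[j], rhs[j + 1] = c * rhs[j] + s * rhs[j + 1], -s * rhs[j] + c * rhs[j + 1]
        c_prev, s_prev = c, s
    # Back substitution with the resulting upper bidiagonal R.
    g = np.zeros(k)
    for j in range(k - 1, -1, -1):
        t = rhs[j] - (r_super[j + 1] * g[j + 1] if j + 1 < k else 0.0)
        g[j] = t / r_main[j]
    return g, abs(rhs[k])       # |rhs[k]| = || sigma1*e1 - H g || at the minimum
```

In the algorithm below this solve is not carried out explicitly; the rotations are folded into the scalar recurrences for $c$, $\tau$ and $\eta$.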

The pseudocode for the QMRCGSTAB algorithm is as follows, in which the Givens rotations used in the QR decomposition are written out explicitly:


Algorithm QMRCGSTAB$(A, b, x_0, \epsilon)$

(1) Initialization
    $r_0 = b - A x_0$
    choose $\tilde r_0$ such that $(\tilde r_0, r_0) \neq 0$
    $p_0 = v_0 = d_0 = 0$
    $\rho_0 = \alpha_0 = \omega_0 = 1$; $\tau = \|r_0\|$, $\theta_0 = 0$, $\eta_0 = 0$
(2) For $k = 1, 2, \ldots$ do
    $\rho_k = (\tilde r_0, r_{k-1})$; $\beta_k = (\rho_k \alpha_{k-1})/(\rho_{k-1}\omega_{k-1})$
    $p_k = r_{k-1} + \beta_k (p_{k-1} - \omega_{k-1} v_{k-1})$
    $v_k = A p_k$
    $\alpha_k = \rho_k/(\tilde r_0, v_k)$
    $s_k = r_{k-1} - \alpha_k v_k$
    (2.1) First quasi-minimization and update iterate
        $\tilde\theta_k = \|s_k\|/\tau$; $c = 1/\sqrt{1 + \tilde\theta_k^2}$; $\tilde\tau = \tau \tilde\theta_k c$
        $\tilde\eta_k = c^2 \alpha_k$
        $\tilde d_k = p_k + (\theta_{k-1}^2 \eta_{k-1}/\alpha_k)\, d_{k-1}$
        $\tilde x_k = x_{k-1} + \tilde\eta_k \tilde d_k$
    (2.2) Compute $t_k$, $\omega_k$ and update $r_k$
        $t_k = A s_k$
        $\omega_k = (s_k, t_k)/(t_k, t_k)$
        $r_k = s_k - \omega_k t_k$
    (2.3) Second quasi-minimization and update iterate
        $\theta_k = \|r_k\|/\tilde\tau$; $c = 1/\sqrt{1 + \theta_k^2}$; $\tau = \tilde\tau \theta_k c$
        $\eta_k = c^2 \omega_k$
        $d_k = s_k + (\tilde\theta_k^2 \tilde\eta_k/\omega_k)\, \tilde d_k$
        $x_k = \tilde x_k + \eta_k d_k$
    If $x_k$ is accurate enough, then quit
(3) end

To check convergence, the estimate $\|r_k\| \le \sqrt{k+1}\,|\tau|$ was used, where $r_k$ denotes the QMRCGSTAB residual at step $k$ [3].
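For illustration, the following is a direct NumPy transcription of the pseudocode above; it is a sketch, without preconditioning or breakdown safeguards, and the stopping rule based on the estimate just quoted is our choice of implementation detail.

```python
import numpy as np

def qmrcgstab(A, b, x0, tol=1e-8, maxiter=500):
    """Sketch of Algorithm QMRCGSTAB.  A supports A @ v, b and x0 are 1-D arrays."""
    x = x0.copy()
    r = b - A @ x
    r_tilde = r.copy()                       # any vector with (r_tilde, r) != 0
    p = np.zeros_like(b); v = np.zeros_like(b); d = np.zeros_like(b)
    rho = alpha = omega = 1.0
    tau, theta, eta = np.linalg.norm(r), 0.0, 0.0
    r0_norm = np.linalg.norm(r)
    for k in range(1, maxiter + 1):
        rho_new = r_tilde @ r
        beta = (rho_new * alpha) / (rho * omega)
        rho = rho_new
        p = r + beta * (p - omega * v)
        v = A @ p
        alpha = rho / (r_tilde @ v)
        s = r - alpha * v
        # (2.1) first quasi-minimization and iterate update
        theta_t = np.linalg.norm(s) / tau
        c = 1.0 / np.sqrt(1.0 + theta_t**2)
        tau_t = tau * theta_t * c
        eta_t = c**2 * alpha
        d_t = p + (theta**2 * eta / alpha) * d
        x_t = x + eta_t * d_t
        # (2.2) compute t, omega and update r
        t = A @ s
        omega = (s @ t) / (t @ t)
        r = s - omega * t
        # (2.3) second quasi-minimization and iterate update
        theta = np.linalg.norm(r) / tau_t
        c = 1.0 / np.sqrt(1.0 + theta**2)
        tau = tau_t * theta * c
        eta = c**2 * omega
        d = s + (theta_t**2 * eta_t / omega) * d_t
        x = x_t + eta * d
        # stop on the residual estimate ||r_k|| <= sqrt(k+1) |tau| (see text)
        if np.sqrt(k + 1) * abs(tau) < tol * r0_norm:
            return x, k
    return x, maxiter
```

In practice one would also monitor the true residual $b - A x_k$, which is what the experiments in Section 4 report.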

Note that the cost per iteration is slightly higher than for Bi-CGSTAB, since two additional inner products are needed to compute the elements of $\Sigma_{k+1}$. A more detailed discussion on computational costs is given in Section 4.

3. Some Variants of QMRCGSTAB. The use of quasi-minimization in the “product algorithms” (such as CGS and Bi-CGSTAB) introduces some flexibility. For example, the underlying product algorithm need not be constrained to generate a residual polynomial that has small norm since, presumably, the quasi-minimization step will handle that. Instead, the basic iteration can be viewed as only generating a set of vectors spanning the Krylov subspace over which the quasi-minimization is applied. This leads us to several variants of QMRCGSTAB

which we will briefly describe. Note, however, that only one of these variants will be used in the numerical experiments.


We make two observations on the QMRCGSTAB method:
1. It is not crucial that the steepest descent step reduces the norm of the residual as long as it increases the degree of the Krylov subspace associated with $W_{k+1}$.
2. If $W_{k+1}$ were orthogonal, then quasi-minimization becomes true minimization of the residual.

Therefore, it is natural to choose $\omega_i$ to make $W_{k+1}$ “more orthogonal”. For example, one can choose $\omega_i$ to make $r_i$ orthogonal to $s_i$ and $W_k$ pairwise orthogonal. This leads to the formula

$$ \omega_i = \frac{(s_i, s_i)}{(s_i, t_i)}, $$

which replaces the corresponding formula in Algorithm QMRCGSTAB. We call this variant QMRCGSTAB2. We note that since the inner product $(s_i, s_i)$ is already needed to compute $\tilde\theta_i$, we save one inner product compared to QMRCGSTAB.
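Relative to the NumPy sketch given after the algorithm in Section 2, this variant is a one-line change (illustrative):

```python
# QMRCGSTAB: omega_k from the local steepest descent step
omega = (s @ t) / (t @ t)

# QMRCGSTAB2: omega_k chosen so that r_k is orthogonal to s_k; the inner
# product (s, s) is already available from the computation of theta_t
omega = (s @ s) / (s @ t)
```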

We also note that, similarly to Bi-CGSTAB, both QMRCGSTAB and QMRCGSTAB2 break down if $(s_i, t_i) = 0$, which is possible if $A$ is indefinite (in fact it is always true if $A$ is skew-symmetric). This is an additional breakdown condition over that of BCG. One possible strategy to overcome this is to set a lower bound for the quantity $|(s_i, t_i)|$. However, for matrices with large imaginary parts, Gutknecht [7] observed that Bi-CGSTAB does not perform well because the steepest descent polynomials have only real roots and thus cannot be expected to approximate the spectrum well. In principle, it is possible to derive a quasi-minimal residual version of Gutknecht’s variant of Bi-CGSTAB, but we shall not pursue that here.
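A minimal sketch of the lower-bound safeguard just mentioned (the threshold and the sign convention are our assumptions; the paper does not prescribe them):

```python
import numpy as np

def guarded_inner(s, t, floor=1e-12):
    """Return (s, t) with its magnitude bounded below by `floor`, so that the
    division defining omega cannot blow up.  Purely illustrative."""
    st = s @ t
    if abs(st) < floor:
        return np.copysign(floor, st if st != 0.0 else 1.0)
    return st

# usage in the sketch above, e.g. for QMRCGSTAB2:
# omega = (s @ s) / guarded_inner(s, t)
```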

4. Numerical experiments. We next compare the performance of the QMRCGSTAB variants with that of Bi-CGSTAB, TFQMR, and CGS.

Table 1 shows the cost per step of the methods under discussion, excluding the cost for computing the residual norm, which is the same for all methods.

TABLE 1
Cost per step for each method

Method        Inner products   DAXPY operations   Matrix-vector multiplications
Bi-CGSTAB           4                 6                         2
CGS                 2                 7¹                        2
QMRCGSTAB           6                 8                         2
QMRCGSTAB2          5                 8                         2
TFQMR               4                10                         2

In the sequel we present experiments to show that QMRCGSTAB indeed achieves a smoothing of the residual compared to Bi-CGSTAB. Note, however, that because the Bi-CGSTAB method already improves the erratic residual convergence of BCG, the effect of QMRCGSTAB is not as impressive as that of TFQMR on the residual of CGS.

¹ Strictly speaking, one of the operations is a simple vector addition. This must be taken into account if floating point operations were to be counted.


Unless stated otherwise, in all examples the right-hand side $b$ was generated as a random vector with values distributed uniformly in $(0, 1)$, and the starting vector $x_0$ was taken to be zero. All matrices arising from a partial differential operator were obtained using centered, second-order finite differences. The methods were compared on the basis of the number of iterations necessary to achieve relative residual $\|r_k\|/\|r_0\| < 10^{-8}$, with $r_k = b - A x_k$ being the true residual. Hence, the figures were built with the abscissae representing the number of iterations and the ordinates representing $\|r_k\|/\|r_0\|$ on a logarithmic scale. Experiments were conducted using a Beta test version of Matlab 4.0 [10] running on a Sun Sparc workstation.

Example 1. This example was taken from [14] and corresponds to the discretization of the convection-diffusion operator

$$ L(u) = -\varepsilon \Delta u + \cos(\theta)\, u_x + \sin(\theta)\, u_y \tag{5} $$

on the unit square with homogeneous Dirichlet conditions on the boundary and parameters $\varepsilon = 0.1$ and $\theta = -30^\circ$, using 40 grid points per direction, yielding a matrix of order $n = 1600$. Fig. 1 shows the convergence histories, from which we can see the smoothing effect of quasi-minimization on the CGS and Bi-CGSTAB residuals. We see that Bi-CGSTAB and its smoothed counterparts converge slightly faster than CGS and TFQMR, with QMRCGSTAB2 showing the best performance by a small margin.
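For concreteness, a sketch of how a test matrix of this kind can be assembled with centered second-order differences (our construction; the grid ordering, the mesh size h = 1/(m+1), and the reuse of the qmrcgstab sketch from Section 2 are assumptions, not details from the paper):

```python
import numpy as np
import scipy.sparse as sp

def convection_diffusion_2d(m=40, eps=0.1, theta=np.deg2rad(-30.0)):
    """Centered second-order FD discretization of
    -eps*Laplacian(u) + cos(theta)*u_x + sin(theta)*u_y on the unit square,
    homogeneous Dirichlet boundary conditions, m grid points per direction."""
    h = 1.0 / (m + 1)
    e = np.ones(m)
    D2 = sp.diags([e[1:], -2.0 * e, e[1:]], [-1, 0, 1]) / h**2   # second difference
    D1 = sp.diags([-e[1:], e[1:]], [-1, 1]) / (2.0 * h)          # centered first difference
    I = sp.identity(m)
    A = -eps * (sp.kron(I, D2) + sp.kron(D2, I)) \
        + np.cos(theta) * sp.kron(I, D1) + np.sin(theta) * sp.kron(D1, I)
    return A.tocsr()

A = convection_diffusion_2d()                     # order n = 1600
b = np.random.uniform(0.0, 1.0, A.shape[0])       # random right-hand side in (0, 1)
x, iters = qmrcgstab(A, b, np.zeros(A.shape[0]))  # solver sketch from Section 2
```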

[Figure omitted: relative residual norm vs. iteration number for cgs, tfqmr, qmrcgstab2, qmrcgstab, bi-cgstab.]
FIG. 1. Example 1: 2D conv-diff. operator (5)

Example 2. This example was taken from [17] and corresponds to the discretization of

$$ -(D u_x)_x - (D u_y)_y = 1 \tag{6} $$


on the unit square with homogeneous boundary conditions. We used a coarser grid than the one considered in [17], that is, 50 grid points per direction, yielding a matrix of order $n = 2500$. The parameter $D$ takes the value $D = 10^5$ in $0 \le x, y \le 0.75$, $D = 0.1$ in $0.75 < x, y \le 1$, and $D = 1$ everywhere else. Left diagonal preconditioning was applied. In [17], this matrix was used to illustrate the better convergence of Bi-CGSTAB over CGS. We see from Fig. 2 that this advantage carries over to the smoothed versions. Furthermore, even though the matrix is symmetric positive definite and hence CG is applicable, as shown in Fig. 2 the method stagnates. This is due to the fact that for this operator, the computed direction vectors of conjugate gradient methods rapidly lose orthogonality [16]. We note that in order to make cost comparisons meaningful, the CG curve was plotted so that each “iteration” corresponds to two true CG iterations, i.e., two matrix-vector multiplications.
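Left diagonal preconditioning as used here amounts to scaling the system by the inverse of the diagonal of $A$ before starting the iteration; a minimal sketch (our code, reusing the qmrcgstab sketch from Section 2):

```python
import scipy.sparse as sp

def solve_with_left_diagonal_preconditioning(A, b, x0, tol=1e-8):
    """Run the QMRCGSTAB sketch on D^{-1} A x = D^{-1} b with D = diag(A).
    Note that the convergence test then monitors the preconditioned residual."""
    Dinv = sp.diags(1.0 / A.diagonal())
    return qmrcgstab(Dinv @ A, Dinv @ b, x0, tol=tol)
```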

[Figure omitted: relative residual norm vs. iteration number for qmrcgstab, qmrcgstab2, bi-cgstab, cg, cgs, tfqmr.]
FIG. 2. Example 2: 2D operator with discontinuous coefficients. Every point on the CG curve refers to two CG iterations.

Example 3. This example comes from the discretization of the convection-diffusion equation

$$ L(u) = -\Delta u + \gamma (x u_x + y u_y) + \beta u \tag{7} $$

on the unit square, where $\gamma = 100$ and $\beta = -100$, for a $63 \times 63$ grid, yielding a matrix of order $n = 3969$. No preconditioning was used. In this example, we see the CGS-based methods converge a little faster than Bi-CGSTAB and QMRCGSTAB, but the pairwise orthogonal variant, QMRCGSTAB2, is the fastest. See Fig. 3.


[Figure omitted: relative residual norm vs. iteration number for cgs, tfqmr, qmrcgstab2, qmrcgstab, bi-cgstab.]
FIG. 3. Example 3: 2D conv-diff. operator (7)

Example 4. Figure 4 shows the results of a 3-dimensional version of Example 3 without preconditioning:

$$ L(u) = -\Delta u + \gamma (x u_x + y u_y + z u_z) + \beta u \tag{8} $$

on the unit cube, where $\beta = -100$ and $\gamma = 50$, for a $15 \times 15 \times 15$ grid, yielding a matrix of order $n = 3375$.

We note that in this example the improvement of Bi-CGSTAB over CGS and TFQMR is impressive. Therefore it is not surprising that there is only little additional improvement brought by the variants proposed in this paper. We note that for this operator, the use of centered differences and large values of $\gamma$ are unfavorable for Bi-CGSTAB-type methods, since the resulting matrices would have a pronounced skew-symmetric component, and eigenvalues with large imaginary parts [7]; different discretization methods would be more attractive [13].

Example 5. The next example illustrates how all methods can be affected by the conditioning of the generated polynomial. Matrix $A$ is a modification of an example presented in [11],

$$ A = I_{n/2} \otimes \begin{pmatrix} 1 & \delta \\ 25 & 100 \end{pmatrix}, \tag{9} $$

i.e., $A$ is an $n \times n$ block diagonal matrix with $2 \times 2$ blocks and $n = 40$. We chose $b = (1\ 0\ 1\ 0\ \cdots)^T$ and $\tilde r_0 = r_0$. For such a $b$ the norm of the resulting BCG polynomial


[Figure omitted: relative residual norm vs. iteration number for cgs, tfqmr, qmrcgstab2, qmrcgstab, bi-cgstab.]
FIG. 4. Example 4: 3D conv-diffusion operator (8)

satisfies $\|\varphi_n\| = O(\delta^{-1})$. Thus, $\|\varphi_n^2\| = O(\delta^{-2})$ in the squared methods, and we can foresee numerical problems when $\delta$ is small.

Each entry of Table 2 shows (i) the number of correct digits, $d$, in the relative residual obtained after running each algorithm until the relative residual dropped below $10^{-8}$ but without exceeding 20 matrix-vector multiplications, and (ii) in parentheses, the number of matrix-vector multiplications, $mv$ (a number not greater than 20), needed to achieve a relative residual of $10^{-d}$.

In exact arithmetic, finite termination occurs after the second BCG polynomial $\varphi_2$ is computed in both the CGS and Bi-CGSTAB algorithms. We see from Table 2 that all methods behave equally well for $\delta = 1.0$. As $\delta$ decreases, round-off error causes CGS and TFQMR, which are based on squaring, to fail or not to converge within the expected time. Furthermore, both CGS and TFQMR lose about twice as many digits as Bi-CGSTAB and its quasi-minimal variants. We also mark the instances of the quasi-minimal variants whose residuals stagnate before the maximum number of iterations has been reached. We note that although the example is contrived, it does justify the implementation of a QMRCGSTAB-type method.

We finally observe that experiments using several of the methods discussed herein, albeit using another naming convention, were presented in [15].

5. Conclusions and future work. We have derived two QMR variants of Bi-CGSTAB. Our motivation for these methods was to inherit any potential performance improvements Bi-CGSTAB offers over CGS, while at the same time providing smoother convergence behavior.


TABLE 2
Example 5: Correct digits and matrix-vector multiplications at termination: d(mv). A maximum of 20 matrix-vector multiplications was allowed.

                                      δ
Method        1.0       10^-4      10^-8      10^-12
CGS          14(4)      5(4)‡     -3(20)*    -1(4)†
TFQMR        13(3)      5(4)‡      1(20)‡     1(20)‡
Bi-CGSTAB    16(3)     12(3)       7(3)‡      3(3)‡
QMRCGSTAB    16(3)     12(3)       7(3)‡      3(3)‡
QMRCGSTAB2   16(3)     12(3)       7(3)‡      3(3)‡

* Oscillatory behavior observed.
‡ Residual stagnates before the maximum number of mv's was reached.
† Iterations stopped when division by zero was encountered.

We have shown numerically that this is indeed true for many realistic problems. Although, in their present form, the two proposed methods still suffer from some numerical problems, they have many desirable properties: they are transpose-free, they use short recurrences, they make efficient use of matrix-vector multiplications, and they demonstrate smooth convergence behavior.

6. Acknowledgments. We are grateful to J. M. Hammond, C. Moler, and The MathWorks Inc. for providing us with the Beta Test version of Matlab 4.0 [10]. We also thank R. Freund and the referees for providing us with very useful suggestions and corrections.


REFERENCES

[1] R. FLETCHER, Conjugate gradient methods for indefinite linear systems, in Proc. Dundee Biennial Conf. Numer. Anal., G. A. Watson, ed., vol. 506 of Lect. Notes Math., Springer-Verlag, Berlin, 1976, pp. 73–89.

[2] R. W. FREUND, Quasi-kernel polynomials and convergence results for quasi-minimal residual iterations, in Numerical Methods in Approximation Theory, Vol. 9, D. Braess and L. L. Schumaker, eds., Birkhäuser Verlag, Basel, 1992, pp. 77–95.

[3] R. W. FREUND, A transpose-free quasi-minimal residual algorithm for non-Hermitian linear systems, SIAM J. Sci. Comput., (1993), to appear.

[4] R. W. FREUND, G. H. GOLUB, AND N. M. NACHTIGAL, Iterative solution of linear systems, Acta Numerica, 1 (1992), pp. 57–100.

[5] R. W. FREUND, M. H. GUTKNECHT, AND N. M. NACHTIGAL, An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices: Part I, Tech. Rep. 90.45, RIACS, Nov. 1990.

[6] R. W. FREUND AND N. M. NACHTIGAL, QMR: A quasi-minimal residual method for non-Hermitian linear systems, Numer. Math., 60 (1991), pp. 315–339.

[7] M. H. GUTKNECHT, Variants of BiCGStab for matrices with complex spectrum, IPS Research Report 91-14, Eidgenössische Technische Hochschule, Zürich, Aug. 1991.

[8] W. D. JOUBERT AND T. A. MANTEUFFEL, Iterative methods for nonsymmetric linear systems, in Iterative Methods for Large Linear Systems, D. R. Kincaid and L. J. Hayes, eds., Academic Press, Boston, 1990, pp. 149–171.

[9] C. LANCZOS, Solution of systems of linear equations by minimized iterations, J. Res. Natl. Bur. Stand., 49 (1952), pp. 33–53.

[10] THE MATHWORKS, INC., MATLAB User's Guide, Natick, Mass. 01760, beta 3 ed., Feb. 1991.

[11] N. M. NACHTIGAL, S. C. REDDY, AND L. N. TREFETHEN, How fast are nonsymmetric matrix iterations?, SIAM J. Matrix Anal. Appl., 13 (July 1992), pp. 778–795.

[12] C. POMMERELL AND W. FICHTNER, PILS: An iterative linear solver package for ill-conditioned systems, in Proc. Supercomputing '91, Albuquerque, New Mexico, Nov. 1991, IEEE, pp. 588–599.

[13] A. SEGAL, Aspects of numerical methods for elliptic singular perturbation problems, SIAM J. Sci. Stat. Comput., 3 (Sept. 1982), pp. 327–349.

[14] P. SONNEVELD, CGS, a fast Lanczos-type solver for nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 10 (Jan. 1989), pp. 36–52.

[15] C. H. TONG, A comparative study of preconditioned Lanczos methods for nonsymmetric linear systems, Tech. Rep. SAND91-8240 UC404, Sandia National Laboratories, Albuquerque, March 1992.

[16] H. A. VAN DER VORST, The convergence behavior of preconditioned CG and CG-S in the presence of rounding errors, in Preconditioned Conjugate Gradient Methods, O. Axelsson and L. Y. Kolotilina, eds., vol. 1457 of Lect. Notes in Math., Springer-Verlag, Berlin, 1990, pp. 126–136. Proc. of a conference held in Nijmegen, the Netherlands, 1989.

[17] H. A. VAN DER VORST, Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, Tech. Rep. 633, Dept. of Math., University of Utrecht, Dec. 1990.

[18] H. A. VAN DER VORST, Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13 (March 1992), pp. 631–644.
