Optimally conditioned scaled ABS algorithms for linear systems

JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 67, No. 1, OCTOBER 1990

Optimally Conditioned Scaled ABS Algorithms for Linear Systems¹

E. SPEDICATO² AND Z. YANG³

Communicated by L. C. W. Dixon

Abstract. Using a strict bound of Spedicato to the condition number of bordered positive-definite matrices, we show that the scaling parameter in the ABS class for linear systems can always be chosen so that the bound of a certain update matrix is globally minimized. Moreover, if the scaling parameter is so chosen at every iteration, then the condition number itself is globally minimized. The resulting class of optimally conditioned algorithms contains as a special case the class of optimally stable algorithms in the sense of Broyden.

Key Words. ABS algorithms, bordered positive-definite matrices, condition numbers, linear systems, optimal conditioning, optimal stability.

1. Introduction

The ABS (Abaffy-Broyden-Spedicato) class of algorithms for the solution of linear systems was introduced in its unscaled form by Abaffy, Broyden, and Spedicato (Ref. 1) and in its scaled form by Abaffy and Spedicato (Ref. 2). Given the system

$$Ax = b, \tag{1}$$

or equivalently

$$r(x) \equiv Ax - b = 0, \tag{2}$$

where

$x \in R^n$, $b \in R^m$, $A = (a_1, \dots, a_m)^T \in R^{m,n}$, $m \le n$,

¹ This work was done in the framework of research supported by MPI, Rome, Italy, 60% Program.

² Professor, Department of Mathematics, University of Bergamo, Bergamo, Italy.

³ Assistant Professor, Department of Applied Mathematics, Dalian University of Technology, Dalian, China.

0022-3239/90/1000-0141$05.00/0 © 1990 Plenum Publishing Corporation

and assuming for simplicity of formulation that A has full rank, the ABS scaled algorithms have the following form.

Step 1. Given $x_1 \in R^n$ arbitrary, $H_1 \in R^{n,n}$ arbitrary nonsingular, set $i = 1$.

Step 2. Compute $r_i = Ax_i - b$. If $r_i = 0$, stop; $x_i$ solves the system. Otherwise, go to Step 3.

Step 3. Compute the iterates

$$p_i = H_i^T z_i, \tag{3}$$

$$\alpha_i = v_i^T r_i / p_i^T A^T v_i, \tag{4}$$

$$x_{i+1} = x_i - \alpha_i p_i, \tag{5}$$

$$H_{i+1} = H_i - H_i A^T v_i w_i^T H_i, \tag{6}$$

where $z_i, w_i \in R^n$, $v_i \in R^m$ are arbitrary vectors satisfying

$$z_i^T H_i A^T v_i = p_i^T A^T v_i \ne 0, \tag{7}$$

$$w_i^T H_i A^T v_i = 1. \tag{8}$$

Step 4. Increment the index i by one and go to Step 2.
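Steps 1 to 4 can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `abs_solve`, the starting point $x_1 = 0$, and the parameter choices $H_1 = I$, $v_i = e_i$, $z_i = H_i A^T v_i$, $w_i = z_i / z_i^T z_i$ are assumptions made here so that conditions (7) and (8) hold automatically when $A$ has full rank.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def matvec(M: Matrix, x: Vector) -> Vector:
    return [dot(row, x) for row in M]


def transpose(M: Matrix) -> Matrix:
    return [list(col) for col in zip(*M)]


def abs_solve(A: Matrix, b: Vector, tol: float = 1e-12) -> Vector:
    """Steps 1-4 with the illustrative (assumed) choices H_1 = I, x_1 = 0,
    v_i = e_i, z_i = H_i A^T v_i, w_i = z_i / z_i^T z_i."""
    m, n = len(A), len(A[0])
    At = transpose(A)
    x = [0.0] * n                                    # Step 1: x_1 (assumed zero)
    H = [[float(j == k) for k in range(n)] for j in range(n)]  # H_1 = I
    for i in range(m):
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]       # Step 2: residual
        if max(abs(t) for t in r) <= tol:
            break
        v = [float(k == i) for k in range(m)]        # scaling parameter v_i = e_i
        a = matvec(At, v)                            # A^T v_i
        Ha = matvec(H, a)                            # H_i A^T v_i
        z = Ha                                       # assumed choice of z_i
        zz = dot(z, z)                               # = z_i^T H_i A^T v_i, nonzero by (7)
        w = [t / zz for t in z]                      # w_i, so that (8) holds
        p = matvec(transpose(H), z)                  # (3)
        alpha = dot(v, r) / dot(p, a)                # (4)
        x = [xk - alpha * pk for xk, pk in zip(x, p)]            # (5)
        wH = matvec(transpose(H), w)                 # H_i^T w_i, i.e. the row w_i^T H_i
        H = [[H[j][k] - Ha[j] * wH[k] for k in range(n)] for j in range(n)]  # (6)
    return x
```

For example, `abs_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])` returns approximately `[0.8, 1.4]` after two iterations. The sketch divides by $z_i^T z_i$ without a guard, so it relies on the full-rank assumption stated above.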

Among the properties of the ABS class we recall the following [see Abaffy and Spedicato (Ref. 3) for proofs and discussion]:

(a) the algorithms are well defined; i.e., at each iteration it is possible to choose $z_i$, $w_i$, $v_i$ satisfying (7) and (8);

(b) the solution is found in at most $m$ iterations; the residual $r_{i+1}$ is orthogonal to the linearly independent vectors $v_1, \dots, v_i$;

(c) $\mathrm{null}(H_i) = \mathrm{range}(A'_{i-1})$, where $A'_{i-1} = (A^T v_1, \dots, A^T v_{i-1})$;

(d) $\mathrm{null}(H_i^T) = \mathrm{range}(W_{i-1})$, where $W_{i-1} = (w_1, \dots, w_{i-1})$;

(e) $\mathrm{range}(H_i) = \mathrm{span}(H_i A^T v_i, \dots, H_i A^T v_n)$ if $m = n$; otherwise, $v_{m+1}, \dots, v_n$ are arbitrary vectors such that $v_1, \dots, v_n$ are linearly independent;

(f) $\mathrm{rank}(H_i) = n - i + 1$;

(g) if $H_1 = I$, then

$$H_{i+1} = I - A'_i B_i^{-1} W_i^T, \tag{9}$$

where $B_i$ is the nonsingular and LU-decomposable matrix

$$B_i = W_i^T A'_i; \tag{10}$$

(h) if $H_1 = I$, then $H_i^2 = H_i$.

The parameter $v_i$ is called the scaling parameter. The unscaled ABS class is obtained by setting $v_i = e_i$, $e_i$ being the $i$th unit vector in $R^m$. Notice that the scaled ABS class can be obtained by applying the unscaled ABS algorithms to the scaled system $V^T A x = V^T b$. An important algorithm in the unscaled ABS class is the Huang algorithm, originally considered by Huang in a paper (Ref. 4) which led to the development of the ABS class. It has the property that $x_{i+1}$ is the solution of minimal Euclidean norm of the first $i$ equations and is obtained by setting $H_1 = I$ and

$$z_i = a_i, \qquad w_i = a_i / a_i^T H_i a_i. \tag{11}$$

An equivalent version of the Huang algorithm, shown to be more stable numerically in the experiments of Abaffy and Spedicato (Ref. 5) and Spedicato and Vespucci (Ref. 6) and in the theoretical analysis of Yang (Ref. 7) and Broyden (Ref. 8), is the modified Huang algorithm, where

$z_i = H_i a_i$ and $w_i = z_i / z_i^T z_i$.

In the scaled ABS class, an important subclass consists of the so-called optimally stable algorithms, where $H_1 = I$ and, assuming $m = n$,

$$v_i = A^{-T} p_i, \qquad z_i = A^T u_i, \qquad w_i = z_i / p_i^T p_i, \tag{12}$$

and $u_i \in R^n$ is arbitrary nonzero. It is possible to show [see Abaffy and Spedicato (Ref. 3)] that the inverse is not needed explicitly in the algorithm, since the update of $H_i$ and the stepsize formula can be put in the form

$$H_{i+1} = H_i - p_i p_i^T / p_i^T p_i, \tag{13}$$

$$\alpha_i = r_i^T u_i / p_i^T p_i. \tag{14}$$

Notice that the Huang algorithm is obtained by setting $u_i = e_i$. The choice $v_i = A^{-T} p_i$ implies the following matrix relation, with $D$ nonsingular diagonal and $V = (v_1, \dots, v_n)$:

$$V^T A A^T V = D. \tag{15}$$

Relation (15) was shown by Broyden (Ref. 9) to characterize, in the scaled ABS class, the algorithms with minimal error propagation in the computed value $x_{n+1}$ once the sequence $x_i$ is affected by a single error. It is called the Broyden optimal stability condition.
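The inverse-free form (13), (14) is easy to try numerically. The sketch below is an illustration under stated assumptions rather than a reference implementation: the name `optimally_stable_solve`, the starting point $x_1 = 0$, and the requirement that the caller supplies $n$ linearly independent vectors $u_i$ are choices made here. Since $A^T v_i = p_i$ under (12), the matrix $V^T A A^T V$ in (15) has entries $p_j^T p_k$, so condition (15) can be checked as mutual orthogonality of the returned $p_i$ without forming $A^{-T}$.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def matvec(M: Matrix, x: Vector) -> Vector:
    return [dot(row, x) for row in M]


def optimally_stable_solve(
    A: Matrix, b: Vector, U: List[Vector]
) -> Tuple[Vector, List[Vector]]:
    """Iterate (13)-(14) with H_1 = I, x_1 = 0 (assumed) and z_i = A^T u_i.

    Returns the final iterate and the search vectors p_1, ..., p_n.
    """
    n = len(A)
    At = [list(col) for col in zip(*A)]
    x = [0.0] * n
    H = [[float(j == k) for k in range(n)] for j in range(n)]   # H_1 = I
    P: List[Vector] = []
    for u in U:
        r = [ax - bi for ax, bi in zip(matvec(A, x), b)]        # r_i = A x_i - b
        z = matvec(At, u)                                       # z_i = A^T u_i
        p = matvec(H, z)                                        # p_i = H_i^T z_i (H_i symmetric here)
        pp = dot(p, p)
        alpha = dot(r, u) / pp                                  # (14)
        x = [xk - alpha * pk for xk, pk in zip(x, p)]           # (5)
        H = [[H[j][k] - p[j] * p[k] / pp for k in range(n)] for j in range(n)]  # (13)
        P.append(p)
    return x, P
```

With `A = [[2.0, 1.0], [0.0, 3.0]]`, `b = [3.0, 3.0]` and `U = [[1.0, 1.0], [1.0, -1.0]]`, the returned iterate is approximately `[1.0, 1.0]` and the two vectors $p_1$, $p_2$ are orthogonal up to rounding.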

In this paper, we consider how to choose $v_i$ and $w_i$ so that a strict bound to a condition number associated with the matrix $H_i$ is minimized. A formula for $v_i$ is obtained which implies that the algorithms satisfying the Broyden optimal stability condition also satisfy the optimal conditioning criterion considered. For related work restricted to the unscaled ABS class, see Spedicato (Ref. 10), Bodon (Ref. 11), and Deng and Spedicato (Ref. 12). Much of the algebraic development given here leading to Theorem 3.1 is just an immediate extension of the work done in Ref. 10.

2. Formulating the Minimum Condition Number Problem

The problem that we consider is how to select $v_i$ and $w_i$ so that condition (8) is satisfied and a strict bound to the Euclidean condition number (ratio of the largest to the smallest eigenvalue) of the symmetric positive-definite matrix $C_i$ is minimized, where $C_i$ is given by

$$C_i = B_i^T B_i. \tag{16}$$

Here, $B_i$ is defined by (10). Notice that the idea of minimizing a bound is common in numerical analysis and that, since $H_i$ is singular for $i > 1$ and $B_i$ is generally nonsymmetric, the given definition of $C_i$ seems a natural one. For other definitions of $C_i$, including $C_i = B_i B_i^T$, see Bodon (Ref. 11).

We use the bound to bordered symmetric positive-definite matrices derived by Spedicato (Ref. 13); see also Spedicato and Burmeister (Ref. 14). Define $T'$ by

$$T' = \begin{pmatrix} T & d \\ d^T & \mu \end{pmatrix}, \tag{17}$$

where $T \in R^{n-1,n-1}$ is symmetric positive definite, $d \in R^{n-1}$, $\mu \in R^1$, and the positive definiteness of $T'$ requires the condition

$$\mu - d^T T^{-1} d > 0. \tag{18}$$

Then, the bound has the form

$$\mathrm{cond}(T') \le \max\left(\mathrm{cond}(T),\ \max_i \mu_i/\mu,\ \mu/\min_i \mu_i\right) Z(\mu, T, d), \tag{19}$$

where the $\mu_i$ are the eigenvalues of $T$ and

$$Z(\mu, T, d) = [1 + F(\mu, T, d)^{1/2}] / [1 - F(\mu, T, d)^{1/2}], \tag{20}$$

$$F(\mu, T, d) = d^T T^{-1} d / \mu. \tag{21}$$

Since we do not assume knowledge of the eigenvalues, we minimize only the factor Z. A more refined bound, which does not depend on the eigenvalues, has been considered by Deng and Spedicato (Ref. 12) relating to the unscaled ABS class.

From (18), we obtain the restriction

$$0 \le F(\mu, T, d) < 1. \tag{22}$$

Hence, since Z is a strictly increasing function of F in the interval (0, 1), we can more conveniently minimize F instead of Z.
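The factor $Z$ in (20) and the quantity $F$ in (21) are cheap to evaluate once $T^{-1} d$ is available. The following sketch is illustrative only: the helper names `solve` and `bordering_factor` are hypothetical, and the small Gaussian-elimination routine is an assumption made to keep the example self-contained (any linear solver would do).

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def solve(T: Matrix, d: Vector) -> Vector:
    """Solve T x = d by Gaussian elimination with partial pivoting."""
    n = len(d)
    M = [row[:] + [rhs] for row, rhs in zip(T, d)]   # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for c in reversed(range(n)):                      # back substitution
        s = M[c][n] - sum(M[c][k] * x[k] for k in range(c + 1, n))
        x[c] = s / M[c][c]
    return x


def bordering_factor(T: Matrix, d: Vector, mu: float) -> Tuple[float, float]:
    """Return (F, Z) as in (21) and (20); raises if (22) is violated."""
    Tinv_d = solve(T, d)
    F = sum(a * b for a, b in zip(d, Tinv_d)) / mu    # F = d^T T^{-1} d / mu
    if not 0.0 <= F < 1.0:
        raise ValueError("condition (18) fails: bordered matrix not positive definite")
    root = math.sqrt(F)
    return F, (1.0 + root) / (1.0 - root)             # Z is increasing in F
```

For instance, with $T = \mathrm{diag}(2, 1)$, $d = (1, 0.5)^T$ and $\mu = 2$, one gets $F = 0.375$; with $d = 0$ the function returns $F = 0$ and $Z = 1$, the smallest possible value of $Z$.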

Observing that $B_i$ has the form

$$B_i = \begin{pmatrix} W_{i-1}^T A'_{i-1} & W_{i-1}^T A^T v_i \\ w_i^T A'_{i-1} & w_i^T A^T v_i \end{pmatrix}, \tag{23}$$

and defining $u, y \in R^{i-1}$, $D \in R^{i-1,i-1}$, $\mu_1 \in R^1$ as follows:

$$u = W_{i-1}^T A^T v_i, \tag{24}$$

$$y = (A'_{i-1})^T w_i, \tag{25}$$

$$D = W_{i-1}^T A'_{i-1}, \tag{26}$$

$$\mu_1 = w_i^T A^T v_i, \tag{27}$$

we obtain after some calculations

$$F = (D^T u + \mu_1 y)^T (D^T D + y y^T)^{-1} (D^T u + \mu_1 y) / (u^T u + \mu_1^2). \tag{28}$$

To further simplify $F$, define $Q \in R^{i-1,i-1}$, $q \in R^{i-1}$, $\bar\mu, \mu_2, \mu_3 \in R^1$ as follows:

$$Q = (D^T D)^{-1}, \tag{29}$$

$$q = Q D^T u = D^{-1} u, \tag{30}$$

$$\bar\mu = u^T u, \tag{31}$$

$$\mu_2 = y^T q, \tag{32}$$

$$\mu_3 = y^T Q y. \tag{33}$$

Then, $F$ takes the form

$$F = 1 - 1/G(\mu_1, \mu_3), \tag{34}$$

where

$$G(\mu_1, \mu_3) = (\bar\mu + \mu_1^2)(1 + \mu_3), \tag{35}$$

while the constraint (8) becomes

$$\mu_1 - \mu_2 = 1. \tag{36}$$

Notice that, since $Q$ is symmetric positive definite, $\mu_3 \ge 0$, while $\bar\mu + \mu_1^2 = \mu$ is strictly positive from condition (18). Now, (22) implies that $G(\mu_1, \mu_3) \ge 1$; hence, it follows from (34) that minimizing $F$ is equivalent to minimizing $G$ with the constraint (36).
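The passage from (28) to (34) can be retraced with determinants. The following is a sketch of one possible route, not necessarily the calculation the authors had in mind; it assumes $H_1 = I$, so that $D$ is nonsingular by property (g), and it uses the standard Schur-complement and rank-one determinant identities. Here $T$, $d$, $\mu$ denote the blocks of $C_i = B_i^T B_i$ in the notation of (17).

```latex
% Blocks of C_i = B_i^T B_i:
%   T = D^T D + y y^T,   d = D^T u + \mu_1 y,   \mu = u^T u + \mu_1^2.
\begin{align*}
B_i &= \begin{pmatrix} D & u \\ y^T & \mu_1 \end{pmatrix}, \\
\det C_i &= \det(T)\,(\mu - d^T T^{-1} d) = \det(T)\,\mu\,(1 - F), \\
\det C_i &= (\det B_i)^2 = (\det D)^2\,(\mu_1 - y^T D^{-1} u)^2
          = (\det D)^2\,(\mu_1 - \mu_2)^2, \\
\det T &= \det(D^T D)\,\bigl(1 + y^T (D^T D)^{-1} y\bigr)
        = (\det D)^2\,(1 + \mu_3), \\
1 - F &= \frac{(\mu_1 - \mu_2)^2}{(u^T u + \mu_1^2)(1 + \mu_3)}
       = \frac{1}{(u^T u + \mu_1^2)(1 + \mu_3)}
       \quad \text{when } \mu_1 - \mu_2 = 1 .
\end{align*}
```

The last line is (34) with $G$ as in (35), and it makes explicit that (34) holds only on the constraint set (36).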

3. Solving the Minimum Condition Number Problem

To simplify the algebra, we write the first-order Kuhn-Tucker conditions for the minimizer of $G$ subject to (8) considering only $w_i$ as variable and letting $v_i$ be a free parameter. We show then that a suitable choice of $v_i$ leads immediately to the global minimum of the bound, which corresponds also to a global minimum of the actual condition number, if the choice is made at every iteration.


The Kuhn-Tucker conditions have the form

$$(\partial G/\partial \mu_1)(\partial \mu_1/\partial w_i) + (\partial G/\partial \mu_3)(\partial \mu_3/\partial w_i) - 2\mu'(\partial \mu_1/\partial w_i - \partial \mu_2/\partial w_i) = 0, \tag{37}$$

$$v_i^T A H_i^T w_i = 1. \tag{38}$$

In the above formulas, $2\mu'$ is the Lagrange multiplier. For the computation of the partial derivatives and the gradients, we define $a'_i \in R^n$ and $Q' \in R^{n,n}$ as follows:

$$a'_i = A'_{i-1} q = (I - H_i) A^T v_i, \tag{39}$$

$$Q' = A'_{i-1} Q (A'_{i-1})^T. \tag{40}$$

Then, we obtain

$$\partial \mu_1/\partial w_i = A^T v_i, \tag{41}$$

$$\partial \mu_2/\partial w_i = a'_i, \tag{42}$$

$$\partial \mu_3/\partial w_i = 2 Q' w_i. \tag{43}$$

Hence, (37) takes the form

$$(\bar\mu + \mu_1^2) Q' w_i = -\mu_1 (1 + \mu_3) A^T v_i + \mu' H_i A^T v_i. \tag{44}$$
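Equation (44) follows from (37) by direct substitution. As a sketch of the intermediate steps, with $\bar\mu = u^T u$ as in (31) and $G$ as in (35):

```latex
\begin{align*}
\partial G/\partial \mu_1 &= 2\mu_1 (1 + \mu_3), \qquad
\partial G/\partial \mu_3 = \bar\mu + \mu_1^2, \\
0 &= 2\mu_1 (1 + \mu_3)\, A^T v_i + 2(\bar\mu + \mu_1^2)\, Q' w_i
     - 2\mu' \,(A^T v_i - a'_i), \\
A^T v_i - a'_i &= A^T v_i - (I - H_i) A^T v_i = H_i A^T v_i, \\
(\bar\mu + \mu_1^2)\, Q' w_i &= -\mu_1 (1 + \mu_3)\, A^T v_i + \mu' H_i A^T v_i .
\end{align*}
```

The second line uses the gradients (41) to (43), the third uses definition (39), and the last line is obtained after dividing by 2.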

Now, we establish a property of the defined Kuhn-Tucker conditions.

Theorem 3.1. Let $u$ and $y$ be defined by (24) and (25), with $v_i$ and $w_i$ such that the Kuhn-Tucker conditions (38) and (44) are satisfied. Then, $u = 0$ implies $y = 0$; hence, $y \ne 0$ implies $u \ne 0$.

Proof. Proceeding as in Spedicato (Ref. 10), we obtain for $\mu'$ the formula

$$\mu' = \mu_1 (1 + \mu_3). \tag{45}$$

Thus, (44) becomes

$$(\bar\mu + \mu_1^2) Q' w_i = -\mu_1 (1 + \mu_3) a'_i; \tag{46}$$

or, with

$$\mu_4 = -\mu_1 (1 + \mu_3) / (\bar\mu + \mu_1^2),$$

we have

$$Q' w_i = \mu_4 a'_i. \tag{47}$$

Suppose now that $u = 0$. Then,

$$w_k^T A^T v_i = 0, \qquad k = 1, \dots, i - 1;$$

hence, $A^T v_i$ belongs to $\mathrm{range}(W_{i-1})^{\perp}$. From property (d),

$$\mathrm{range}(W_{i-1})^{\perp} = \mathrm{null}(H_i^T)^{\perp} = \mathrm{range}(H_i).$$

Hence, from property (e), $A^T v_i$ must have the form

$$A^T v_i = \sum_{j=i}^{n} \mu_j H_i A^T v_j, \tag{48}$$

for some scalars $\mu_j$. Substituting in (39) and using property (h), we obtain $a'_i = 0$ and, from (46), $Q' w_i = 0$. Since $Q' w_i = A'_{i-1} Q y$ and $A'_{i-1} Q$ is of full rank from properties (c) and (f), it follows that $y = 0$. □

In Spedicato (Ref. 10), the Kuhn-Tucker conditions are solved, for the unscaled ABS class, assuming $y \ne 0$, $u \ne 0$, which gives a problem of some technical complexity. Here, we show that, if $u$ and $y$ are zero, then $G = 1$; hence, $Z$ attains its global minimum, $Z = 1$. Moreover, we show that it is always possible to force $u = 0$ by a suitable choice of the scaling parameter.

Theorem 3.2. Suppose that u = 0 in the Kuhn-Tucker conditions (38), (44). Then, the bounding function Z attains its global minimum Z = 1.

Proof. From Theorem 3.1, $u = 0$ implies $y = 0$, hence $\bar\mu = \mu_2 = \mu_3 = 0$, and from (36) $\mu_1 = 1$. Hence, $G = 1$, $F = 0$, and $Z = 1$, which is its global minimum. □

Theorem 3.3. Let $m = n$. Then, the vector $u$ is zero if and only if the scaling vector $v_i$ has the following form, with $s_i \in R^n$ arbitrary:

$$v_i = A^{-T} H_i s_i. \tag{49}$$

Proof. If $v_i$ is given by (49), then

$$u = W_{i-1}^T H_i s_i = (H_i^T W_{i-1})^T s_i = 0,$$

from property (d). Conversely, if $u$ is zero, then $A^T v_i$ is an arbitrary vector in

$$S = \mathrm{range}(W_{i-1})^{\perp}.$$

From properties (d) and (f), any vector in $S$ can be written in the form $H_i s_i$ for some $s_i \in R^n$ or, since $A$ is nonsingular, in the form (49). □

Once $v_i$ is given by (49), $w_i$ must still be determined by the Kuhn-Tucker conditions. We have the following theorem.


Theorem 3.4. If $v_i$ is given by (49), then the general solution of the Kuhn-Tucker equations (38), (44) has the form

$$w_i = H_i^T t_i, \tag{50}$$

with $t_i \in R^n$ and $s_i$ such that

$$t_i^T H_i s_i = 1. \tag{51}$$

Proof. With $v_i$ given by (49), condition (38) becomes

$$w_i^T H_i s_i = 1, \tag{52}$$

while condition (44), which is equivalent to (47), reads

$$Q' w_i = A'_{i-1} Q (A'_{i-1})^T w_i = 0;$$

or, since $A'_{i-1} Q$ has full rank,

$$(A'_{i-1})^T w_i = 0. \tag{53}$$

From properties (c) and (f), the general solution of (53) has the form (50), while (51) follows from (52) and property (h). □

It can be remarked that, in the case considered by Spedicato (Ref. 10), $w_i$ is shown to satisfy Eq. (38) and a modification of Eq. (53), where the right-hand side is a generally nonzero vector. Using (49) and (50), the update of $H_i$ takes the form

$$H_{i+1} = H_i - H_i s_i t_i^T H_i. \tag{54}$$

By renaming $t_i$ as $w_i$, it is clear from (54) that any $w_i$ subject to (52) is acceptable or, in other terms, that the parameter $w_i$ does not affect the conditioning once $v_i$ is chosen according to (49). Condition (7) on the parameter $z_i$ now reads

$$z_i^T H_i s_i \ne 0. \tag{55}$$

Conditions (52) and (55) can be satisfied for instance by the choices

$$z_i = H_i s_i, \tag{56}$$

$$w_i = z_i / z_i^T z_i, \tag{57}$$

with $s_i$ arbitrary not lying in $\mathrm{null}(H_i)$. Notice that choice (57) gives the solution of minimal Euclidean norm of Eq. (52). If $s_i = a_i$, these choices define the modified Huang algorithm.

With the choices (56), (57), the update reads as follows:

$$H_{i+1} = H_i - z_i z_i^T H_i / z_i^T z_i. \tag{58}$$

If $H_i$ is nonsymmetric, $H_{i+1}$ given by (58) is generally nonsymmetric too. If $H_i$ is symmetric, then from property (h),

$$H_i^T z_i = H_i z_i = H_i^2 s_i = H_i s_i = z_i;$$

hence, $H_{i+1}$ is symmetric too. In such a case, choice (57) gives the solution of minimal Euclidean norm of the whole system (52), (53). Formula (58) becomes also identical with formula (13). Hence, we can state the following theorem.

Theorem 3.5. The matrices generated by the subclass of optimally stable algorithms in the sense of Broyden satisfy the optimal conditioning criterion.

We finally conclude by showing that, if the optimal conditioning criterion is used at every iteration, then the condition number of $C_i$ is actually globally minimized.

Theorem 3.6. If the scaling parameter is chosen according to (49) at every iteration, then the matrix $B_i$ is the identity matrix in $R^{i,i}$; hence, the condition number of $C_i$ is globally minimized.

Proof. By choosing $v_i$ according to (49), we have $u = y = 0$ and $\mu_1 = 1$; hence, the last column and row in $B_{i+1}$ are unit vectors in $R^{i+1}$. The theorem follows by induction, since

$$B_1 = w_1^T A^T v_1 = w_1^T H_1 A^T v_1 = 1. \qquad \Box$$
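Theorem 3.6 can be checked numerically on a small case. The sketch below is an illustration under assumptions made here, not the authors' code: it takes $H_1 = I$, uses the choices (56), (57) with update (58), and exploits the fact that under (49) the vector $A^T v_i$ equals $H_i s_i = z_i$, so that $B_i = W_i^T A'_i$ can be formed without inverting $A$. The function name `optimally_conditioned_B` and the sample vectors $s_i$ are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def optimally_conditioned_B(S: List[Vector]) -> Matrix:
    """Run (56)-(58) from H_1 = I (assumed) and return B = W^T A'.

    Column k of A' is A^T v_k, which equals z_k = H_k s_k under (49);
    column j of W is w_j = z_j / z_j^T z_j by (57).
    """
    n = len(S[0])
    H = [[float(j == k) for k in range(n)] for j in range(n)]   # H_1 = I
    Z: List[Vector] = []
    for s in S:
        z = [dot(row, s) for row in H]                          # (56): z_i = H_i s_i
        zz = dot(z, z)                                          # nonzero if s_i is not in null(H_i)
        zH = [sum(z[l] * H[l][k] for l in range(n)) for k in range(n)]  # row z_i^T H_i
        H = [[H[j][k] - z[j] * zH[k] / zz for k in range(n)] for j in range(n)]  # (58)
        Z.append(z)
    # B[j][k] = w_j^T (A^T v_k) = z_j^T z_k / z_j^T z_j
    return [[dot(zj, zk) / dot(zj, zj) for zk in Z] for zj in Z]
```

Calling it with `S = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]` gives a matrix equal to the 3-by-3 identity up to rounding, so $C_i = B_i^T B_i$ has condition number 1 in this example.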

4. Conclusions

We have shown that the optimal conditioning criterion based upon the bound of Spedicato leads to a special choice of the scaling parameter, defining explicitly a class of optimally conditioned updates. If the choice is applied at every iteration, then this class contains the updates of the class of optimally stable algorithms in the sense of Broyden. Moreover, in such a case, it actually minimizes the condition number of the matrices considered.

References

1. ABAFFY, J., BROYDEN, C. G., and SPEDICATO, E., A Class of Direct Methods for Linear Equations, Numerische Mathematik, Vol. 45, pp. 361-376, 1984.


2. ABAFFY, J., and SPEDICATO, E., A Generalization of the ABS Algorithm for Linear Systems, University of Bergamo, DMSIA Report No. 85/4, 1985.

3. ABAFFY, J., and SPEDICATO, E., ABS Projection Algorithms: Mathematical Techniques for Linear and Nonlinear Equations, Ellis Horwood, Chichester, England, 1989.

4. HUANG, H. Y., A Direct Method for the General Solution of a System of Linear Equations, Journal of Optimization Theory and Applications, Vol. 16, pp. 429-445, 1975.

5. ABAFFY, J., and SPEDICATO, E., Numerical Experiments with the Symmetric Algorithm in the ABS Class for Linear Systems, Optimization, Vol. 18, pp. 197-212, 1987.

6. SPEDICATO, E., and VESPUCCI, M. Z., Computational Performance of the Implicit Gram-Schmidt-Huang Algorithm for Linear Algebraic Systems, University of Bergamo, DMSIA Report No. 87/4, 1987.

7. YANG, Z., On the Numerical Stability of the Huang and the Modified Huang Algorithms and Related Topics, Dalian University of Technology, RABSCA Report No. 5, 1988.

8. BROYDEN, C. G., On the Numerical Stability of Huang's Update, University of Bergamo, DMSIA Report No. 89/19, 1989.

9. BROYDEN, C. G., On the Numerical Stability of Huang's and Related Methods, Journal of Optimization Theory and Applications, Vol. 47, pp. 401-412, 1985.

10. SPEDICATO, E., Optimal Conditioning Parameter Selection in the ABS Class for Linear Systems, University of Würzburg, Angewandte Mathematik, Report No. 203, 1987.

11. BODON, E., Globally Optimally Conditioned Updates in the ABS Class for Linear Systems, University of Bergamo, DMSIA Report No. 88/1, 1988.

12. DENG, N. Y., and SPEDICATO, E., Optimal Conditioning Parameter Selection in the ABS Class through a Rank-Two Update Formulation, University of Bergamo, DMSIA Report No. 88/18, 1988.

13. SPEDICATO, E., A Bound to the Condition Number of Bordered Positive-Definite Matrices, University of Bergamo, DMSIA Report No. 87/1, 1987.

14. SPEDICATO, E., and BURMEISTER, W., A Strict Bound to the Condition Number of Bordered Positive-Definite Matrices, Computing, Vol. 40, pp. 181-183, 1988.
