

Applied Mathematics and Computation 177 (2006) 852–862. doi:10.1016/j.amc.2005.11.038

www.elsevier.com/locate/amc

The block least squares method for solving nonsymmetric linear systems with multiple right-hand sides

S. Karimi *, F. Toutounian

Department of Mathematics, Ferdowsi University of Mashhad, P.O. Box 1159-91775, Mashhad, Iran

* Corresponding author. E-mail addresses: [email protected], [email protected] (S. Karimi), [email protected] (F. Toutounian).

Abstract

In this paper, we present the block least squares method for solving nonsymmetric linear systems with multiple right-hand sides. This method is based on the block bidiagonalization. We first derive two algorithms by using two different convergence criteria. The first one is based on independently minimizing the 2-norm of each column of the residual matrix and the second approach is based on minimizing the Frobenius norm of the residual matrix. We then give some properties of these new algorithms. Finally, some numerical experiments on test matrices from the Harwell–Boeing collection are presented to show the efficiency of the new method. © 2005 Elsevier Inc. All rights reserved.

Keywords: LSQR method; Bidiagonalization; Block methods; Iterative methods; Multiple right-hand sides

1. Introduction

Many applications require the solution of several sparse systems of equations

$$Ax^{(i)} = b^{(i)}, \quad i = 1, 2, \ldots, s, \qquad (1)$$

with the same coefficient matrix and different right-hand sides. When all the $b^{(i)}$'s are available simultaneously, Eq. (1) can be written as

$$AX = B, \qquad (2)$$

where $A$ is an $n \times n$ nonsingular and nonsymmetric real matrix, and $B$ and $X$ are $n \times s$ rectangular matrices whose columns are $b^{(1)}, b^{(2)}, \ldots, b^{(s)}$ and $x^{(1)}, x^{(2)}, \ldots, x^{(s)}$, respectively. In practice, $s$ is of moderate size, $s \ll n$. Instead of applying an iterative method to each linear system, it is more efficient to use a method that handles all the systems simultaneously. In recent years, generalizations of the classical Krylov subspace methods have been developed. The first class of these methods contains the block solvers such as the BL-BCG algorithm [2,10], the


BL-GMRES algorithm introduced in [15], the BL-QMR algorithm [4], and recently the BL-BiCGSTAB algorithm [3]. The second class contains the global GMRES algorithm [7] and the global Lanczos-based methods [8]. A third class of methods uses a single linear system, named the seed system, and considers the corresponding Krylov subspace. The initial residuals of the other linear systems are projected onto this Krylov subspace. The process is repeated until convergence; see [12,1,9,13,14].

In the present paper, we give a new method for solving the problem (2). Our iterative method is a block version of the least squares (LSQR) method [11]. The LSQR algorithm is based on the bidiagonalization procedure of Golub and Kahan [5]. It generates a sequence of approximations $\{x_k\}$ such that the residual norm $\|r_k\|_2$ decreases monotonically, where $r_k = b - Ax_k$. Analytically, the sequence $\{x_k\}$ is identical to the sequence generated by the standard CG algorithm applied to the normal equations, and by several other published algorithms.

We define the block bidiagonalization procedure. This procedure generates two sets of $n \times s$ block vectors, $V_1, V_2, \ldots$ and $U_1, U_2, \ldots$, such that $V_i^T V_j = 0_s$, $U_i^T U_j = 0_s$ for $i \neq j$, and $V_i^T V_i = I_s$, $U_i^T U_i = I_s$. We derive two simple recurrence formulas for generating the sequences of approximations $\{X_k\}$ such that $\max_j \|\mathrm{col}_j(R_k)\|_2$, where $\mathrm{col}_j(R_k)$ represents the $j$th column of $R_k$, or the Frobenius norm of $R_k$ decreases monotonically, where $R_k = B - AX_k$.

Throughout this paper, we use the following notation. For two $n \times s$ matrices $X$ and $Y$, we define the inner product $\langle X, Y\rangle_F = \mathrm{tr}(X^T Y)$, where $\mathrm{tr}(Z)$ denotes the trace of the square matrix $Z$. The associated norm is the Frobenius norm, denoted by $\|\cdot\|_F$. We will use the notation $\langle\cdot,\cdot\rangle_2$ for the usual inner product in $\mathbb{R}^n$ and the associated norm, denoted by $\|\cdot\|_2$. Finally, $0_s$ and $I_s$ will denote the zero and identity matrices in $\mathbb{R}^{s\times s}$.

The outline of this paper is as follows. In Section 2 we give a quick overview of the LSQR method and its properties. In Section 3 we present the block version of the LSQR algorithm and two different convergence criteria. In Section 4 some numerical experiments on test matrices from the Harwell–Boeing collection are presented to show the efficiency of the method. Finally, we make some concluding remarks in Section 5.

2. The LSQR algorithm

In this section, we recall some fundamental properties of the LSQR algorithm [11], which is an iterative method for solving real linear systems of the form

$$Ax = b,$$

where $A$ is a nonsymmetric matrix of order $n$ and $x, b \in \mathbb{R}^n$. The LSQR algorithm uses an algorithm of Golub and Kahan [5], stated as procedure Bidiag 1 in [11], to reduce $A$ to lower bidiagonal form. The procedure Bidiag 1 can be described as follows.

Bidiag 1 (Starting vector b; reduction to lower bidiagonal form)

$$\beta_1 u_1 = b, \qquad \alpha_1 v_1 = A^T u_1,$$
$$\left.\begin{aligned} \beta_{i+1} u_{i+1} &= A v_i - \alpha_i u_i,\\ \alpha_{i+1} v_{i+1} &= A^T u_{i+1} - \beta_{i+1} v_i, \end{aligned}\right\}\quad i = 1, 2, \ldots. \qquad (3)$$

The scalars $\alpha_i \ge 0$ and $\beta_i \ge 0$ are chosen so that $\|u_i\|_2 = \|v_i\|_2 = 1$. With the definitions

$$U_k \equiv [u_1, u_2, \ldots, u_k], \quad V_k \equiv [v_1, v_2, \ldots, v_k], \quad B_k \equiv \begin{bmatrix} \alpha_1 & & & \\ \beta_2 & \alpha_2 & & \\ & \ddots & \ddots & \\ & & \beta_k & \alpha_k \\ & & & \beta_{k+1} \end{bmatrix},$$


the recurrence relations (3) may be rewritten as

$$U_{k+1}(\beta_1 e_1) = b, \qquad AV_k = U_{k+1}B_k, \qquad A^T U_{k+1} = V_k B_k^T + \alpha_{k+1}v_{k+1}e_{k+1}^T.$$
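For concreteness, a minimal NumPy sketch of Bidiag 1 is given below; the function name `bidiag1` and its interface are illustrative choices, not from [11], and no breakdown (all $\alpha_i, \beta_i > 0$) is assumed:

```python
import numpy as np

def bidiag1(A, b, k):
    """Sketch of procedure Bidiag 1 (relations (3)): starting from b,
    build U (m x (k+1)) and V (n x k) with orthonormal columns and the
    (k+1) x k lower bidiagonal matrix Bk (alpha_i on the diagonal,
    beta_{i+1} below it), so that A @ V == U @ Bk in exact arithmetic."""
    m, n = A.shape
    U, V, Bk = np.zeros((m, k + 1)), np.zeros((n, k)), np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    U[:, 0] = b / beta                      # beta_1 u_1 = b
    z = A.T @ U[:, 0]
    alpha = np.linalg.norm(z)
    V[:, 0] = z / alpha                     # alpha_1 v_1 = A^T u_1
    Bk[0, 0] = alpha
    for i in range(k):
        w = A @ V[:, i] - alpha * U[:, i]   # beta_{i+1} u_{i+1} = A v_i - alpha_i u_i
        beta = np.linalg.norm(w)
        U[:, i + 1] = w / beta
        Bk[i + 1, i] = beta
        if i + 1 < k:
            z = A.T @ U[:, i + 1] - beta * V[:, i]  # alpha_{i+1} v_{i+1} = A^T u_{i+1} - beta_{i+1} v_i
            alpha = np.linalg.norm(z)
            V[:, i + 1] = z / alpha
            Bk[i + 1, i + 1] = alpha
    return U, V, Bk
```

A quick check such as `np.allclose(A @ V, U @ Bk)` reproduces the second relation above.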

In exact arithmetic, we have $U_{k+1}^T U_{k+1} = I$ and $V_k^T V_k = I$, where $I$ is the identity matrix. By using the procedure Bidiag 1, the LSQR method constructs an approximate solution of the form $x_k = V_k y_k$ which solves the least-squares problem $\min\|b - Ax_k\|_2$. The main steps of the LSQR algorithm can be summarized as follows.

Algorithm 1 (LSQR algorithm)

Set $x_0 = 0$
Compute $\beta_1 = \|b\|_2$, $u_1 = b/\beta_1$, $\alpha_1 = \|A^T u_1\|_2$, $v_1 = A^T u_1/\alpha_1$
Set $w_1 = v_1$, $\bar{\phi}_1 = \beta_1$, $\bar{\rho}_1 = \alpha_1$
For $i = 1, 2, \ldots$, until convergence Do:
  $w = Av_i - \alpha_i u_i$
  $\beta_{i+1} = \|w\|_2$
  $u_{i+1} = w/\beta_{i+1}$
  $z_i = A^T u_{i+1} - \beta_{i+1}v_i$
  $\alpha_{i+1} = \|z_i\|_2$
  $v_{i+1} = z_i/\alpha_{i+1}$
  $\rho_i = (\bar{\rho}_i^2 + \beta_{i+1}^2)^{1/2}$
  $c_i = \bar{\rho}_i/\rho_i$
  $s_i = \beta_{i+1}/\rho_i$
  $\theta_{i+1} = s_i\alpha_{i+1}$
  $\bar{\rho}_{i+1} = -c_i\alpha_{i+1}$
  $\phi_i = c_i\bar{\phi}_i$
  $\bar{\phi}_{i+1} = s_i\bar{\phi}_i$
  $x_i = x_{i-1} + (\phi_i/\rho_i)w_i$
  $w_{i+1} = v_{i+1} - (\theta_{i+1}/\rho_i)w_i$
  (Test for convergence)
EndDo.

More details about the LSQR algorithm can be found in [11].
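A direct NumPy transcription of Algorithm 1 might read as follows; this is a sketch (dense $A$, no breakdown handling), and the stopping test relies on the standard LSQR property that $\bar{\phi}_{i+1}$ estimates the current residual norm, which Section 3.1 below uses in block form:

```python
import numpy as np

def lsqr_single(A, b, tol=1e-8, maxit=1000):
    """Sketch of Algorithm 1 (LSQR) for a single right-hand side."""
    x = np.zeros(A.shape[1])
    bnorm = np.linalg.norm(b)
    beta, u = bnorm, b / bnorm
    z = A.T @ u
    alpha = np.linalg.norm(z)
    v = z / alpha
    w = v.copy()
    phibar, rhobar = beta, alpha
    for i in range(maxit):
        uu = A @ v - alpha * u              # continue the bidiagonalization
        beta = np.linalg.norm(uu)
        u = uu / beta
        z = A.T @ u - beta * v
        alpha = np.linalg.norm(z)
        v = z / alpha
        rho = np.hypot(rhobar, beta)        # rho_i = (rhobar_i^2 + beta_{i+1}^2)^(1/2)
        c, s = rhobar / rho, beta / rho     # c_i, s_i
        theta = s * alpha                   # theta_{i+1}
        rhobar = -c * alpha                 # rhobar_{i+1}
        phi = c * phibar                    # phi_i
        phibar = s * phibar                 # phibar_{i+1}
        x = x + (phi / rho) * w             # x_i = x_{i-1} + (phi_i/rho_i) w_i
        w = v - (theta / rho) * w           # w_{i+1}
        if abs(phibar) <= tol * bnorm:      # |phibar_{i+1}| ~ ||b - A x_i||_2
            break
    return x
```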

3. The block least squares method

In this section, we propose a new method for solving the linear equation (2). This method is based on the block LSQR method. We first introduce a new procedure, based on Bidiag 1, for reducing A to block lower bidiagonal form.

The block Bidiag 1 procedure constructs the sets of $n \times s$ block vectors $V_1, V_2, \ldots$ and $U_1, U_2, \ldots$ such that $V_i^T V_j = 0_s$, $U_i^T U_j = 0_s$ for $i \neq j$, and $V_i^T V_i = I_s$, $U_i^T U_i = I_s$; their columns form orthonormal bases.

Block Bidiag 1 (Starting matrix B; reduction to block lower bidiagonal form).

$$U_1 B_1 = B, \qquad V_1 A_1 = A^T U_1,$$
$$\left.\begin{aligned} U_{i+1}B_{i+1} &= AV_i - U_iA_i^T,\\ V_{i+1}A_{i+1} &= A^T U_{i+1} - V_iB_{i+1}^T, \end{aligned}\right\}\quad i = 1, 2, \ldots, k, \qquad (4)$$

where $U_i, V_i \in \mathbb{R}^{n\times s}$, $B_i, A_i \in \mathbb{R}^{s\times s}$, and $U_1B_1$, $V_1A_1$, $U_{i+1}B_{i+1}$, $V_{i+1}A_{i+1}$ are the QR decompositions of the matrices $B$, $A^TU_1$, $AV_i - U_iA_i^T$, $A^TU_{i+1} - V_iB_{i+1}^T$, respectively. With the definitions


$$\mathcal{U}_k \equiv [U_1, U_2, \ldots, U_k], \quad \mathcal{V}_k \equiv [V_1, V_2, \ldots, V_k], \quad T_k \equiv \begin{bmatrix} A_1^T & & & \\ B_2 & A_2^T & & \\ & \ddots & \ddots & \\ & & B_k & A_k^T \\ & & & B_{k+1} \end{bmatrix},$$

the recurrence relations (4) may be rewritten as

$$\mathcal{U}_{k+1}E_1B_1 = B, \qquad A\mathcal{V}_k = \mathcal{U}_{k+1}T_k, \qquad A^T\mathcal{U}_{k+1} = \mathcal{V}_kT_k^T + V_{k+1}A_{k+1}E_{k+1}^T,$$

where $E_i$ is the $(k+1)s \times s$ matrix which is zero except for the $i$th $s$ rows, which are the $s \times s$ identity matrix. We also have $\mathcal{V}_k^T\mathcal{V}_k = I_{ks}$ and $\mathcal{U}_{k+1}^T\mathcal{U}_{k+1} = I_{(k+1)s}$, where $I_l$ is the $l \times l$ identity matrix.
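A minimal NumPy sketch of the block Bidiag 1 recurrences (4), with reduced QR factorizations playing the role of the scalar normalizations of Bidiag 1 (names and interface are ours; no breakdown is assumed):

```python
import numpy as np

def block_bidiag1(A, B, k):
    """Sketch of block Bidiag 1 (relations (4)).  Returns the lists of
    n x s blocks U_1..U_{k+1} and V_1..V_{k+1}, and the s x s blocks
    A_1..A_{k+1} and B_1..B_{k+1} that define T_k."""
    Us, Vs, As, Bs = [], [], [], []
    U, B1 = np.linalg.qr(B)                   # U_1 B_1 = B
    Us.append(U); Bs.append(B1)
    V, A1 = np.linalg.qr(A.T @ U)             # V_1 A_1 = A^T U_1
    Vs.append(V); As.append(A1)
    for i in range(k):
        W = A @ Vs[-1] - Us[-1] @ As[-1].T    # A V_i - U_i A_i^T
        U, Bi = np.linalg.qr(W)               # U_{i+1} B_{i+1} = W
        S = A.T @ U - Vs[-1] @ Bi.T           # A^T U_{i+1} - V_i B_{i+1}^T
        Us.append(U); Bs.append(Bi)
        V, Ai = np.linalg.qr(S)               # V_{i+1} A_{i+1} = S
        Vs.append(V); As.append(Ai)
    return Us, Vs, As, Bs
```

Stacking the blocks with `np.hstack` gives $\mathcal{U}_{k+1}$ and $\mathcal{V}_{k+1}$, so the matrix relations above can be verified numerically.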

At iteration $k$ we seek an approximate solution $X_k$ of the form

$$X_k = \mathcal{V}_kY_k, \qquad (5)$$

where $Y_k$ is a $ks \times s$ matrix. The residual matrix for this approximate solution is given by

$$R_k = B - AX_k = B - A\mathcal{V}_kY_k = U_1B_1 - \mathcal{U}_{k+1}T_kY_k = \mathcal{U}_{k+1}(E_1B_1 - T_kY_k), \qquad (6)$$

where $E_1$ is the $(k+1)s \times s$ matrix which is zero except for the first $s$ rows, which are the $s \times s$ identity matrix.

Now we propose two algorithms for computing the approximate solution to (2). The new algorithms are named block LSQR 1 and 2 (BL-LSQR 1, 2), since they are generalizations of the single right-hand side LSQR algorithm. In the block LSQR 1 algorithm we choose $Y_k \in \mathbb{R}^{ks\times s}$ such that $\|\mathrm{col}_j(R_k)\|_2$ is minimized independently for $j = 1, 2, \ldots, s$. In the block LSQR 2 algorithm, the matrix $Y_k$ is selected by imposing a minimization of the Frobenius norm of the residual, $\|R_k\|_F$. In Sections 3.1 and 3.2, we show how to solve these problems.

3.1. The block LSQR 1 (BL-LSQR 1) algorithm

The block LSQR 1 algorithm chooses $Y_k$ so that

$$\|\mathrm{col}_j(R_k)\|_2 = \|\mathcal{U}_{k+1}(E_1B_1 - T_kY_k)e_j\|_2$$

is minimized independently for $j = 1, 2, \ldots, s$. Since the columns of the matrix $\mathcal{U}_{k+1}$ are orthonormal, it follows that

$$\|\mathrm{col}_j(R_k)\|_2 = \|(E_1B_1 - T_kY_k)e_j\|_2. \qquad (7)$$

This minimization problem is accomplished by using the QR decomposition [6], where a unitary matrix $Q_k$ is determined so that

$$Q_kT_k = \begin{bmatrix} \mathcal{R}_k \\ 0 \end{bmatrix} = \begin{bmatrix} \rho_1 & \theta_2 & & & \\ & \rho_2 & \theta_3 & & \\ & & \ddots & \ddots & \\ & & & \rho_{k-1} & \theta_k \\ & & & & \rho_k \\ & & & & 0 \end{bmatrix},$$


where the $\rho_l$ and $\theta_l$ are $s \times s$ matrices. The matrix $Q_k$ is updated from the previous iteration by setting

$$Q_k = \begin{bmatrix} I_{(k-1)s} & 0 \\ 0 & Q(a_k, b_k, c_k, d_k) \end{bmatrix}\begin{bmatrix} Q_{k-1} & 0 \\ 0 & I_s \end{bmatrix},$$

where

$$Q(a_k, b_k, c_k, d_k) = \begin{bmatrix} a_k & b_k \\ c_k & d_k \end{bmatrix}$$

is a $2s \times 2s$ unitary matrix written as four $s \times s$ blocks, and $I_s$ is the $s \times s$ identity matrix. The unitary matrix $Q(a_k, b_k, c_k, d_k)$ is computed such that

$$Q(a_k, b_k, c_k, d_k)\begin{bmatrix} \bar{\rho}_k & 0 \\ B_{k+1} & A_{k+1}^T \end{bmatrix} = \begin{bmatrix} \rho_k & \theta_{k+1} \\ 0 & \bar{\rho}_{k+1} \end{bmatrix}.$$
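As a sketch, one way to realize $Q(a_k, b_k, c_k, d_k)$ numerically is through a complete QR factorization of the stacked $2s \times s$ matrix; the helper below (our naming, not from the paper) returns the four $s \times s$ blocks together with $\rho_k$:

```python
import numpy as np

def block_rotation(rhobar, Bnext):
    """From [rhobar; Bnext] = Qfull @ [rho; 0] (complete QR), we get
    Qfull.T @ [rhobar; Bnext] = [rho; 0], so Q(a, b, c, d) = Qfull.T,
    partitioned into four s x s blocks."""
    s = rhobar.shape[0]
    Qfull, R = np.linalg.qr(np.vstack([rhobar, Bnext]), mode='complete')
    Q = Qfull.T
    a, b = Q[:s, :s], Q[:s, s:]
    c, d = Q[s:, :s], Q[s:, s:]
    return a, b, c, d, R[:s, :]    # R[:s, :] is the s x s block rho_k
```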

A problem equivalent to minimizing (7) may then be written as minimizing

$$\left\|\mathrm{col}_j\!\left(\begin{bmatrix} \mathcal{R}_k \\ 0 \end{bmatrix}Y_k - Q_kE_1B_1\right)\right\|_2$$

for $j = 1, 2, \ldots, s$. This is easily done by defining

$$Q_kE_1B_1 = \begin{bmatrix} F_k \\ \bar{\phi}_{k+1} \end{bmatrix}, \quad \text{with } F_k = [\phi_1\,\phi_2\,\ldots\,\phi_k]^T,$$

and setting

$$Y_k = \mathcal{R}_k^{-1}F_k.$$

So the approximate solution is given by

$$X_k = \mathcal{V}_k\mathcal{R}_k^{-1}F_k.$$

Letting

$$\mathcal{P}_k \equiv \mathcal{V}_k\mathcal{R}_k^{-1} \equiv [P_1\,P_2\,\cdots\,P_k],$$

then

$$X_k = \mathcal{P}_kF_k.$$

The $n \times s$ matrix $P_k$, the last block column of $\mathcal{P}_k$, can be computed from the previous $P_{k-1}$ and $V_k$ by the simple update

$$P_k = (V_k - P_{k-1}\theta_k)\rho_k^{-1}. \qquad (8)$$

Also note that

$$F_k = \begin{bmatrix} F_{k-1} \\ \phi_k \end{bmatrix}, \quad \text{in which} \quad \phi_k = a_k\bar{\phi}_k.$$

Thus, $X_k$ can be updated at each step via the relation

$$X_k = X_{k-1} + P_k\phi_k.$$

The equation residual $\mathrm{RES}_k = \max_j\|\mathrm{col}_j(AX_k - B)\|_2$ is computed directly from the quantity $\bar{\phi}_{k+1}$ as

$$\mathrm{RES}_k = \max_j\|\mathrm{col}_j(\bar{\phi}_{k+1})\|_2.$$

Some of the work in (8) can be eliminated by using the matrices $W_k \equiv P_k\rho_k$ in place of $P_k$. The main steps of the BL-LSQR 1 algorithm can be summarized as follows.


Algorithm 2 (BL-LSQR 1)

Set $X_0 = 0$
Compute $U_1B_1 = B$, $V_1A_1 = A^TU_1$ (QR decompositions of $B$ and $A^TU_1$)
Set $W_1 = V_1$, $\bar{\phi}_1 = B_1$, $\bar{\rho}_1 = A_1$
For $i = 1, 2, \ldots$ until convergence Do:
  $\mathcal{W}_i = AV_i - U_iA_i^T$
  $U_{i+1}B_{i+1} = \mathcal{W}_i$ (QR decomposition of $\mathcal{W}_i$)
  $S_i = A^TU_{i+1} - V_iB_{i+1}^T$
  $V_{i+1}A_{i+1} = S_i$ (QR decomposition of $S_i$)
  Compute a unitary matrix $Q(a_i, b_i, c_i, d_i)$ such that $Q(a_i, b_i, c_i, d_i)\begin{bmatrix} \bar{\rho}_i \\ B_{i+1} \end{bmatrix} = \begin{bmatrix} \rho_i \\ 0 \end{bmatrix}$
  $\theta_{i+1} = b_iA_{i+1}^T$
  $\bar{\rho}_{i+1} = d_iA_{i+1}^T$
  $\phi_i = a_i\bar{\phi}_i$
  $\bar{\phi}_{i+1} = c_i\bar{\phi}_i$
  $P_i = W_i\rho_i^{-1}$
  $X_i = X_{i-1} + P_i\phi_i$
  $W_{i+1} = V_{i+1} - P_i\theta_{i+1}$
  If $\max_j\|\mathrm{col}_j(\bar{\phi}_{i+1})\|_2$ is small enough then stop
EndDo.

(Here $\mathcal{W}_i$ denotes the intermediate matrix that is factored, to distinguish it from the update matrices $W_i \equiv P_i\rho_i$.)

The algorithm is a generalization of the LSQR algorithm; it reduces to the classical LSQR when $s = 1$. The BL-LSQR 1 algorithm breaks down at step $i$ if $\rho_i$ is singular. This happens when the matrix $\begin{bmatrix} \bar{\rho}_i \\ B_{i+1} \end{bmatrix}$ is not of full rank. So the BL-LSQR 1 algorithm will not break down at step $i$ if $B_{i+1}$ is nonsingular. We will not treat the problem of breakdown in this paper, and we assume that the matrices $B_i$ produced by the BL-LSQR 1 algorithm are nonsingular.
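Putting the pieces together, a NumPy sketch of Algorithm 2 could look as follows (it reuses the `block_rotation` helper above; the dense-matrix setting and the relative stopping test are our choices, and no breakdown is assumed):

```python
import numpy as np

def bl_lsqr1(A, B, tol=1e-7, maxit=1000):
    """Sketch of BL-LSQR 1 (Algorithm 2) for an n x s block B."""
    n, s = B.shape
    X = np.zeros((n, s))
    U, B1 = np.linalg.qr(B)                  # U_1 B_1 = B
    V, Ai = np.linalg.qr(A.T @ U)            # V_1 A_1 = A^T U_1
    Wd = V.copy()                            # update block W_1 = V_1
    phibar, rhobar = B1, Ai
    res0 = np.linalg.norm(B, axis=0)         # column norms of R_0 = B
    for i in range(maxit):
        W = A @ V - U @ Ai.T
        U, Bnext = np.linalg.qr(W)           # U_{i+1} B_{i+1}
        S = A.T @ U - V @ Bnext.T
        V, Anext = np.linalg.qr(S)           # V_{i+1} A_{i+1}
        a, b, c, d, rho = block_rotation(rhobar, Bnext)
        theta = b @ Anext.T                  # theta_{i+1} = b_i A_{i+1}^T
        rhobar = d @ Anext.T                 # rhobar_{i+1} = d_i A_{i+1}^T
        phi = a @ phibar                     # phi_i = a_i phibar_i
        phibar = c @ phibar                  # phibar_{i+1} = c_i phibar_i
        P = np.linalg.solve(rho.T, Wd.T).T   # P_i = W_i rho_i^{-1}
        X = X + P @ phi                      # X_i = X_{i-1} + P_i phi_i
        Wd = V - P @ theta                   # W_{i+1} = V_{i+1} - P_i theta_{i+1}
        Ai = Anext
        # RES_i = max_j ||col_j(phibar_{i+1})||_2, tested relative to R_0
        if np.max(np.linalg.norm(phibar, axis=0) / res0) <= tol:
            break
    return X
```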

3.2. The block LSQR 2 (BL-LSQR 2) algorithm

The block LSQR 2 algorithm chooses the matrix $Y_k$ which minimizes the Frobenius norm of the residual $R_k$. From (6), the residual $R_k$ can be written as

$$R_k = \mathcal{U}_{k+1}\begin{bmatrix} \widetilde{E}_1B_1 - \widetilde{T}_kY_k \\ -E_{k+1}^TT_kY_k \end{bmatrix},$$

where $\widetilde{T}_k$ (respectively, $\widetilde{E}_1$) is the matrix obtained from $T_k$ (respectively, $E_1$) by deleting its last block row, and $E_{k+1}$ is the $(k+1)s \times s$ matrix which is zero except for the last $s$ rows, which are the $s \times s$ identity matrix. But since the columns of the matrix $\mathcal{U}_{k+1}$ are orthonormal, it follows that

$$\|R_k\|_F^2 = \left\|\begin{bmatrix} \widetilde{E}_1B_1 - \widetilde{T}_kY_k \\ -E_{k+1}^TT_kY_k \end{bmatrix}\right\|_F^2 = \|{-\widetilde{E}_1B_1} + \widetilde{T}_kY_k\|_F^2 + \|E_{k+1}^TT_kY_k\|_F^2. \qquad (9)$$

We now define the linear operators $\varphi_k$ and $\psi_k$ as follows: for $Y \in \mathbb{R}^{ks\times s}$,

$$\varphi_k(Y) = \widetilde{T}_kY \quad \text{and} \quad \psi_k(Y) = E_{k+1}^TT_kY.$$


Then the relation (9) can be expressed as

$$\|R_k\|_F^2 = \|\varphi_k(Y_k) - \widetilde{E}_1B_1\|_F^2 + \|\psi_k(Y_k)\|_F^2. \qquad (10)$$

Therefore, $Y_k$ minimizes the Frobenius norm of the residual if and only if it satisfies the following linear matrix equation:

$$\varphi_k^T(\varphi_k(Y_k) - \widetilde{E}_1B_1) + \psi_k^T(\psi_k(Y_k)) = 0, \qquad (11)$$

where the linear operators $\varphi_k^T$ and $\psi_k^T$ are the transposes of the operators $\varphi_k$ and $\psi_k$, respectively. Therefore, (11) can also be written as

$$\widetilde{T}_k^T(\widetilde{T}_kY_k - \widetilde{E}_1B_1) + T_k^TE_{k+1}(E_{k+1}^TT_kY_k) = 0.$$

Hence, $Y_k$ is given by

$$Y_k = \widehat{T}_k^{-1}F,$$

where

$$\widehat{T}_k = \widetilde{T}_k^T\widetilde{T}_k + T_k^TE_{k+1}E_{k+1}^TT_k, \qquad F = \widetilde{T}_k^T\widetilde{E}_1B_1 = [A_1B_1\; 0\; \cdots\; 0]^T.$$

The matrix $\widehat{T}_k$ is a symmetric positive definite block tridiagonal matrix of the form

$$\widehat{T}_k = \begin{bmatrix} \widetilde{A}_1 & \widetilde{B}_2^T & & \\ \widetilde{B}_2 & \widetilde{A}_2 & \ddots & \\ & \ddots & \ddots & \widetilde{B}_k^T \\ & & \widetilde{B}_k & \widetilde{A}_k \end{bmatrix},$$

where $\widetilde{A}_i = A_iA_i^T + B_{i+1}^TB_{i+1}$ for $i = 1, 2, \ldots, k$, and $\widetilde{B}_i = A_iB_i$ for $i = 2, 3, \ldots, k$. The approximate solution of the system (2) is given by

$$X_k = \mathcal{V}_k\widehat{T}_k^{-1}F.$$
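Before turning to short recurrences, note that $Y_k$ can also be obtained directly: since the columns of $\mathcal{U}_{k+1}$ are orthonormal, minimizing $\|R_k\|_F$ is the same as the small least-squares problem $\min_Y\|E_1B_1 - T_kY\|_F$. The sketch below (our own test utility, impractical for large $k$) assembles $T_k$ from the `block_bidiag1` output above:

```python
import numpy as np

def bl_lsqr2_direct(A, B, k):
    """Compute the BL-LSQR 2 iterate X_k = V_k @ Y_k by assembling T_k
    and minimizing ||E_1 B_1 - T_k Y||_F with a dense lstsq solve."""
    n, s = B.shape
    Us, Vs, As, Bs = block_bidiag1(A, B, k)
    Tk = np.zeros(((k + 1) * s, k * s))
    for i in range(k):
        Tk[i*s:(i+1)*s, i*s:(i+1)*s] = As[i].T       # diagonal block A_i^T
        Tk[(i+1)*s:(i+2)*s, i*s:(i+1)*s] = Bs[i+1]   # subdiagonal block B_{i+1}
    E1B1 = np.zeros(((k + 1) * s, s))
    E1B1[:s, :] = Bs[0]                              # E_1 B_1
    Yk, *_ = np.linalg.lstsq(Tk, E1B1, rcond=None)
    return np.hstack(Vs[:k]) @ Yk                    # X_k = V_k Y_k
```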

Suppose that, by using the QR decomposition [6], we obtain a unitary matrix $\widehat{Q}_k$ such that

$$\widehat{Q}_k\widehat{T}_k = \begin{bmatrix} \alpha_1 & \beta_2 & \theta_3 & & & \\ & \alpha_2 & \beta_3 & \theta_4 & & \\ & & \ddots & \ddots & \ddots & \\ & & & \alpha_{k-2} & \beta_{k-1} & \theta_k \\ & & & & \alpha_{k-1} & \beta_k \\ & & & & & \alpha_k \end{bmatrix} \equiv \widehat{R}_k,$$

where $\widehat{R}_k$ is upper triangular as shown and the $\alpha_i$, $\beta_i$, $\theta_i$ are $s \times s$ matrices. So, by setting

$$\mathcal{P}_k = \mathcal{V}_k\widehat{T}_k^{-1}\widehat{Q}_k^T \equiv [P_1\,P_2\,\ldots\,P_k]$$

and

$$\widehat{F}_k = \widehat{Q}_kF = [u_1\,u_2\,\ldots\,u_k]^T,$$

we have

$$P_k = (V_k - P_{k-2}\theta_k - P_{k-1}\beta_k)\alpha_k^{-1}, \qquad X_k = X_{k-1} + P_ku_k. \qquad (12)$$


From (12) the residual $R_k$ is given by

$$R_k = R_{k-1} - AP_ku_k,$$

where $AP_k$ can be computed from the previous $AP_k$'s and $AV_k$ by the simple update

$$AP_k = (AV_k - AP_{k-2}\theta_k - AP_{k-1}\beta_k)\alpha_k^{-1}.$$

Now we can summarize the above descriptions as the following algorithm.

Algorithm 3 (BL-LSQR 2)

Set $X_0 = 0$, $R_0 = B$
Set $c_0 = a_0 = d_0 = I_s$, $b_0 = b_{-1} = d_{-1} = 0_s$, $P_0 = P_{-1} = 0_{n\times s}$, $AP_0 = AP_{-1} = 0_{n\times s}$
Compute $U_1B_1 = B$, $V_1A_1 = A^TU_1$ (QR decompositions of $B$ and $A^TU_1$)
Set $\bar{u}_1 = A_1B_1$, $\widetilde{B}_1 = 0_{s\times s}$
For $i = 1, 2, \ldots$ until convergence Do:
  $W_i = AV_i - U_iA_i^T$
  $U_{i+1}B_{i+1} = W_i$ (QR decomposition of $W_i$)
  $\widetilde{A}_i = A_iA_i^T + B_{i+1}^TB_{i+1}$
  $S_i = A^TU_{i+1} - V_iB_{i+1}^T$
  $V_{i+1}A_{i+1} = S_i$ (QR decomposition of $S_i$)
  $\beta_i = a_{i-1}d_{i-2}\widetilde{B}_i^T + b_{i-1}\widetilde{A}_i$
  $\theta_i = b_{i-2}\widetilde{B}_i^T$
  $\bar{\alpha}_i = c_{i-1}d_{i-2}\widetilde{B}_i^T + d_{i-1}\widetilde{A}_i$
  $\widetilde{B}_{i+1} = A_{i+1}B_{i+1}$
  Compute the unitary matrix $Q(a_i, b_i, c_i, d_i)$ such that $\begin{bmatrix} a_i & b_i \\ c_i & d_i \end{bmatrix}\begin{bmatrix} \bar{\alpha}_i \\ \widetilde{B}_{i+1} \end{bmatrix} = \begin{bmatrix} \alpha_i \\ 0 \end{bmatrix}$
  $u_i = a_i\bar{u}_i$
  $\bar{u}_{i+1} = c_i\bar{u}_i$
  $P_i = (V_i - P_{i-2}\theta_i - P_{i-1}\beta_i)\alpha_i^{-1}$
  $X_i = X_{i-1} + P_iu_i$
  $AP_i = (AV_i - AP_{i-2}\theta_i - AP_{i-1}\beta_i)\alpha_i^{-1}$
  $R_i = R_{i-1} - AP_iu_i$
  If $\max_j\|\mathrm{col}_j(R_i)\|_2$ is small enough then stop
EndDo

The BL-LSQR 2 algorithm breaks down at step $i$ if $\alpha_i$ is singular. This happens when the matrix $\begin{bmatrix} \bar{\alpha}_i \\ \widetilde{B}_{i+1} \end{bmatrix}$ is not of full rank. So the BL-LSQR 2 algorithm will not break down at step $i$ if $\widetilde{B}_{i+1}$ is nonsingular. As we mentioned in Section 3.1, we will not treat the problem of breakdown in this paper, and we assume that the matrices $\widetilde{B}_i$ produced by the BL-LSQR 2 algorithm are nonsingular.

The following proposition establishes a relation between the residual matrices associated with the approximate solutions obtained by the BL-LSQR 1 and 2 algorithms.

Proposition 1. Let $X_k$ and $X'_k$ be the approximate solutions of (2) obtained by the BL-LSQR 1 and BL-LSQR 2 algorithms, respectively. Then we have

$$\|R_k\|_F^2 \le \|R'_k\|_F^2,$$

where $R_k$ and $R'_k$ are the residual matrices associated with the approximate solutions $X_k$ and $X'_k$, respectively.

Proof. The BL-LSQR 1 algorithm chooses $Y_k \in \mathbb{R}^{ks\times s}$ such that $\|\mathrm{col}_j(R_k)\|_2$ is minimized independently for $j = 1, 2, \ldots, s$, while the BL-LSQR 2 algorithm selects $Y_k \in \mathbb{R}^{ks\times s}$ such that $\|R_k\|_F$ is minimized; from (6), (7) and (9) we have


$$\sum_{j=1}^{s}\min_{y_{kj}\in\mathbb{R}^{ks}}\|E_1B_1e_j - T_ky_{kj}\|_2^2 \le \min_{Y_k\in\mathbb{R}^{ks\times s}}\|E_1B_1 - T_kY_k\|_F^2,$$

which completes the proof. □
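A quick numerical sanity check of Proposition 1 using the sketches above (a small random test of our own, not from the paper):

```python
import numpy as np

# After k steps, the Frobenius norm of the BL-LSQR 1 residual should
# not exceed that of the BL-LSQR 2 residual.
rng = np.random.default_rng(0)
n, s, k = 200, 4, 15
A = rng.random((n, n)) + n * np.eye(n)   # nonsymmetric, diagonally dominant
B = rng.random((n, s))
R1 = B - A @ bl_lsqr1(A, B, tol=0.0, maxit=k)   # exactly k iterations
R2 = B - A @ bl_lsqr2_direct(A, B, k)
print(np.linalg.norm(R1, 'fro'), np.linalg.norm(R2, 'fro'))
```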

4. Numerical examples

In this section, we give some experimental results. Our examples have been coded in Matlab with double precision and executed on a Pentium IV 1.8 GHz workstation. For all the experiments, the initial guess was $X_0 = 0$ and $B = \mathrm{rand}(n, s)$, where the function rand creates an $n \times s$ random matrix with coefficients uniformly distributed in $[0, 1]$. In all test problems, no preconditioning was used. All the tests were stopped as soon as $\max_{1\le j\le s}(\|\mathrm{col}_j(R_k)\|_2/\|\mathrm{col}_j(R_0)\|_2) \le 10^{-7}$.

As in [7], the first test matrix A1 represents the 5-point discretization of the operator

$$L(u) = -u_{xx} - u_{yy} + d\,u_x \qquad (13)$$

on the unit square $[0, 1]\times[0, 1]$ with homogeneous Dirichlet boundary conditions. The discretization was performed using a grid size of $h = 1/61$, which yields a matrix of dimension $n = 3600$; we chose $d = 0.5$.
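For reproducibility, here is a sketch of the test matrix A1 assembled with Kronecker products of standard centered-difference operators (our own construction, following the description above):

```python
import numpy as np
import scipy.sparse as sp

def matrix_A1(m=60, d=0.5):
    """5-point discretization of L(u) = -u_xx - u_yy + d*u_x on the unit
    square with homogeneous Dirichlet conditions; m interior grid points
    per direction (m = 60, i.e. h = 1/61, gives n = 3600)."""
    h = 1.0 / (m + 1)
    D2 = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(m, m)) / h**2   # -d^2/dx^2
    D1 = sp.diags([-1, 1], [-1, 1], shape=(m, m)) / (2 * h)       # d/dx (centered)
    I = sp.identity(m)
    return (sp.kron(I, D2) + sp.kron(D2, I) + d * sp.kron(I, D1)).tocsr()

A1 = matrix_A1()
B = np.random.rand(A1.shape[0], 10)   # B = rand(n, s) as in the experiments
```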

We also use some matrices from the Harwell–Boeing collection. These matrices and their properties are shown in Table 1.

In Table 2 (respectively, Table 3), we give the ratio t(s)/t(1), where t(s) is the CPU time for the BL-LSQR 1 (respectively, BL-LSQR 2) algorithm and t(1) is the CPU time obtained when applying LSQR to one right-hand side linear system. Note that the time obtained by LSQR for one right-hand side depends on which right-hand side was used. So, in our experiments, t(1) is the average of the times needed for the s right-hand sides using LSQR.

Table 1
Test problems information

Matrix      Order   sym   nnz      cond
PDE900      900     No    4380     2.9e+02
PDE2961     2961    No    14,585   9.49e+02
SHERMAN4    1104    No    3786     7.2e+03
SHERMAN5    3312    No    20,793   3.9e+05
GR-30-30    900     Yes   4322     35.8e+02

Table 2
Effectiveness of BL-LSQR 1 algorithm measured by t(s)/t(1)

Matrix      s = 5   s = 10   s = 15   s = 20   s = 30
A1          1.52    1.82     2.02     2.60     4.02
PDE900      2.05    2.51     2.82     3.17     3.66
PDE2961     1.66    2.10     2.33     2.55     4.02
SHERMAN4    1.34    1.21     1.33     1.40     1.64
GR-30-30    2.05    2.29     2.70     3.02     3.93

Table 3
Effectiveness of BL-LSQR 2 algorithm measured by t(s)/t(1)

Matrix      s = 5   s = 10   s = 15   s = 20   s = 30
A1          1.90    2.22     2.43     3.10     4.51
PDE900      2.74    3.22     3.81     3.91     4.50
PDE2961     2.50    3.06     3.34     3.70     4.89
SHERMAN4    1.77    1.65     1.71     1.84     2.10
GR-30-30    2.93    3.32     3.82     4.37     5.61

Table 4
Number of iterations to convergence for the test problem SHERMAN5

s            1    5    10     15     20    30
BL-LSQR 1    †    †    1913   995    629   326
BL-LSQR 2    †    †    1921   1004   630   329


We note that BL-LSQR 1 (or 2) is effective if the indicator t(s)/t(1) is less than s. In Table 2 (respectively, Table 3), we list the ratio t(s)/t(1) for s = 5, 10, 15, 20, and 30 for the BL-LSQR 1 (respectively, BL-LSQR 2) algorithm. As shown in Tables 2 and 3, the BL-LSQR 1 and 2 algorithms are effective and less expensive than the LSQR algorithm applied to each right-hand side. In addition, the BL-LSQR 1 algorithm is more effective than the BL-LSQR 2 algorithm, which is in good agreement with Proposition 1.

For the test problem SHERMAN5, we present in Table 4 the number of iterations to convergence of the BL-LSQR 1 and 2 algorithms for s = 5, 10, 15, 20, and 30. In this table, s = 1 corresponds to the standard LSQR method. For this test problem the stopping criterion

$$\max_{1\le j\le s}(\|\mathrm{col}_j(R_k)\|_2/\|\mathrm{col}_j(R_0)\|_2) \le 10^{-5}$$

was used, and a maximum of 2000 and 20,000 iterations was allowed for the BL-LSQR 1 (or 2) and LSQR algorithms, respectively. A dagger signifies that the maximum allowed number of iterations was reached before convergence. We observed that the convergence of LSQR deteriorates significantly: the residual only satisfies $\|r_k\|_2/\|r_0\|_2 \le 1.5\times 10^{-2}$, even after 20,000 iterations. As can be seen from Table 4, the BL-LSQR 1 and 2 algorithms are able to obtain the solution with the desired accuracy.

5. Conclusion

We have proposed in this paper two new block LSQR algorithms for nonsymmetric linear systems with multiple right-hand sides. To define these new algorithms we use a procedure which generates two sets of orthonormal block vectors, and we derive two simple recurrence formulas for generating the sequences of approximations $\{X_k\}$ such that $\max_j\|\mathrm{col}_j(R_k)\|_2$ or the Frobenius norm of the residual $R_k$ decreases monotonically. Experimental results show that the proposed methods are effective and less expensive than the LSQR algorithm applied to each right-hand side. From Proposition 1 and the experimental results we can conclude that BL-LSQR 1 is better than BL-LSQR 2.

References

[1] T. Chan, W. Wang, Analysis of projection methods for solving linear systems with multiple right-hand sides, SIAM J. Sci. Comput. 18 (1997) 1698–1721.
[2] A. El Guennouni, K. Jbilou, H. Sadok, The block Lanczos method for linear systems with multiple right-hand sides, ano.univ-lille1.fr/pub/1999/ano396.html.
[3] A. El Guennouni, K. Jbilou, H. Sadok, A block BiCGSTAB algorithm for multiple linear systems, Electron. Trans. Numer. Anal. 16 (2003) 129–142.
[4] R. Freund, M. Malhotra, A block-QMR algorithm for non-Hermitian linear systems with multiple right-hand sides, Linear Algebra Appl. 254 (1997) 119–157.
[5] G.H. Golub, W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, SIAM J. Numer. Anal. Ser. B 2 (1965) 205–224.
[6] G.H. Golub, C.F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, MD, 1983.
[7] K. Jbilou, A. Messaoudi, H. Sadok, Global FOM and GMRES algorithms for matrix equations, Appl. Numer. Math. 31 (1) (1999) 49–63.
[8] K. Jbilou, H. Sadok, Global Lanczos-based methods with applications, Technical Report LMA 42, Université du Littoral, Calais, France, 1997.
[9] P. Joly, Résolution de Systèmes Linéaires Avec Plusieurs Seconds Membres par la Méthode du Gradient Conjugué, Tech. Rep. R-91012, Publications du Laboratoire d'Analyse Numérique, Université Pierre et Marie Curie, Paris, 1991.
[10] D. O'Leary, The block conjugate gradient algorithm and related methods, Linear Algebra Appl. 29 (1980) 293–322.
[11] C.C. Paige, M.A. Saunders, LSQR: an algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Software 8 (1) (1982) 43–71.
[12] Y. Saad, On the Lanczos method for solving symmetric linear systems with several right-hand sides, Math. Comput. 48 (1987) 651–662.
[13] V. Simoncini, E. Gallopoulos, An iterative method for nonsymmetric systems with multiple right-hand sides, SIAM J. Sci. Comput. 16 (1995) 917–933.
[14] H. Van Der Vorst, An iterative solution method for solving f(A)x = b, using Krylov subspace information obtained for the symmetric positive definite matrix A, J. Comput. Appl. Math. 18 (1987) 249–263.
[15] B. Vital, Étude de Quelques Méthodes de Résolution de Problèmes Linéaires de Grande Taille sur Multiprocesseur, Ph.D. thesis, Université de Rennes, Rennes, France, 1990.