
An Affine Scaling Method with an Infeasible Starting Point:
Convergence Analysis under Nondegeneracy Assumption

Masakazu MURAMATSU† and Takashi TSUCHIYA‡

† Department of Mechanical Engineering, Sophia University
‡ The Institute of Statistical Mathematics

Abstract

In this paper, we propose an infeasible-interior-point algorithm for linear programming based on the affine scaling algorithm by Dikin. The search direction of the algorithm is composed of two directions, one for satisfying feasibility and the other for aiming at optimality. Both directions are affine scaling directions of certain linear programming problems. Global convergence of the algorithm is proved under a reasonable nondegeneracy assumption. A summary of analogous global convergence results for general cases, obtained in [17] by means of the local potential function, is also given.

Key Words: Linear Programming, Infeasible-Interior-Point Methods, Affine Scaling Algorithm, Global Convergence Analysis, Nondegeneracy Assumption

Address: † 7-1, Kioi-cho, Chiyoda-ku, Tokyo 102 Japan; ‡ 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106 Japan
E-mail address: † [email protected], ‡ [email protected]

1 Introduction

In this paper, we propose an infeasible-interior-point algorithm for linear programming based on the affine scaling algorithm, and prove its global convergence under a reasonable nondegeneracy assumption.

The affine scaling algorithm is the first interior point algorithm; it was proposed by Dikin in 1967 and is known as one of the simplest and most efficient interior point algorithms. It was rediscovered by several researchers including Barnes [5] and Vanderbei et al. [26] after Karmarkar [12] proposed his projective scaling algorithm. The algorithm is also known as the first interior point algorithm which turned out to be competitive with the simplex algorithm for fairly large-scale problems [1, 2]. Though the polynomiality status of the algorithm is not yet known, there have been a number of papers studying global and/or local convergence properties of the algorithm and its continuous version under various assumptions on nondegeneracy and step-size choice [3, 5, 6, 7, 8, 9, 10, 11, 14, 21, 22, 23, 24, 25, 26, 27]. These studies gradually revealed the deep and beautiful mathematical structure of this algorithm, and the analysis of the affine scaling algorithm is now considered important not only for practical reasons but also for its theoretical interest.

On the other hand, initialization of an interior point algorithm has been an important issue in the research on interior point algorithms. Typical approaches have been to solve a Phase-I problem to find a feasible solution, or to resort to the so-called Big-M method. Seeking a better resolution, more efforts have recently been devoted to developing infeasible-interior-point algorithms which can start from any interior point of the positive orthant [4, 13, 15, 20, 28]. The advantage of this type of algorithm is that we need neither solve the problem twice nor introduce a conventional big number, which can be a potential cause of numerical instability. Therefore, it is an interesting subject to consider an infeasible version of the affine scaling algorithm.

The search direction of our algorithm is composed of two directions, one for satisfying feasibility and the other for aiming at optimality. Both directions are affine scaling directions of certain linear programming problems. Furthermore, the combination of the two directions is chosen so that the search direction has a nice scaling-invariance property; thus our method is a natural extension of the original affine scaling method. In fact, if an iterate happens to be feasible, our direction at that point becomes identical to the original affine scaling direction. The idea of combining the same two directions is also found in an earlier book by Dikin and Zorkalcev [9], without any convergence analysis. This method can also be regarded as a kind of FIGAP method for nonlinear programming proposed by Tanabe [19].

This paper is a simplified version of the paper [17], where the same algorithm is proposed and its global convergence is proved assuming only the existence of an optimal solution, without any nondegeneracy condition or the existence of an interior feasible solution. Convergence analysis of the affine scaling algorithm becomes more difficult in the presence of degeneracy. Most of the global convergence proofs of the (feasible) affine scaling algorithm without any nondegeneracy assumption, such as [22, 23] (for a short-step version) and [8, 25] (for a long-step version), are based on a local potential function introduced by Tsuchiya [22] (see [16] and [18] for self-contained and simplified proofs of these results). In [17], the analysis was carried out by means of the local potential function to obtain convergence results without nondegeneracy assumptions. However, this approach tends to make the analysis complicated and quite involved.

Making the nondegeneracy assumption, in this paper we avoid the use of the local potential function and prove global convergence of the algorithm under a weaker condition on the step-size choice, which ensures more efficiency in practice. We hope and believe that this paper gives a good introduction and motivation for readers of the original paper [17], which deals with the general cases.

This paper is organized as follows. In Section 2, we introduce the algorithm and explain the convergence results obtained in [17]. In Section 3, we introduce the nondegeneracy assumption and make preliminary observations. In Section 4, we deal with the case where the primal problem is not feasible. In Section 5, we prove global convergence of the algorithm when primal and dual are feasible. In Section 6, we prove that a pair of primal and dual optimal solutions is obtained which satisfies the strict complementarity condition. Finally, in Section 7, we give a complete description of the asymptotic behavior of the algorithm on the basis of the results obtained in the previous sections.

We introduce the basic notation used in this paper. The notation $A \stackrel{\triangle}{=} B$ means that $A$ is defined by $B$. We denote by $e$ the vector all of whose components are 1; the dimension of $e$ should be understood from the context. If a set $A$ is a (proper) subset of $B$, we write $A \subseteq B$ ($A \subsetneq B$) or $B \supseteq A$ ($B \supsetneq A$). Let $J$ be an index set of $\{1,\dots,n\}$. We use the following notation concerning a vector, a diagonal matrix, and an $m \times n$ matrix.

1. If $h$ is a vector, then $h_J$ is the sub-vector of $h$ composed of the components corresponding to $J$.

2. If $D$ is a diagonal matrix, then $D_J$ is the $|J| \times |J|$ diagonal sub-matrix of $D$ whose diagonal components correspond to $J$.

3. If $M$ is an $m \times n$ matrix, then $M_J$ is the sub-matrix of $M$ whose column vectors correspond to $J$.

2 The Algorithm and Convergence Results

Let us consider the standard form linear programming problem
\[
\langle P \rangle \qquad \min\ c^t x \quad \text{subject to}\quad Ax = b,\ x \ge 0, \tag{1}
\]
where $c, x \in R^n$, $b \in R^m$ and $A \in R^{m \times n}$, and its dual
\[
\langle DP \rangle \qquad \max\ b^t y \quad \text{subject to}\quad A^t y \le c. \tag{2}
\]
We assume that $\mathrm{Rank}(A) = m$. Since we deal with an infeasible-interior-point algorithm, we do not assume anything about the feasibility of $\langle P \rangle$ or $\langle DP \rangle$. We have four possibilities [C1]-[C4] concerning the feasibilities of the primal and the dual:

[C1] Both primal and dual are feasible (in this case, optimal solutions exist for both problems due to duality theory).

[C2] Primal is infeasible but dual is feasible.

[C3] Primal is feasible but dual is infeasible.

[C4] Both primal and dual are infeasible.

We consider an algorithm which allows any positive point $x^0 > 0$ as a starting point. We assume that $x^0$ is infeasible for $\langle P \rangle$, because otherwise there is no need to use an infeasible-interior-point algorithm.

We introduce the following linear programming problem $\langle F \rangle$, determined by the initial point $x^0$, to find a feasible solution of $\langle P \rangle$:

\[
\langle F \rangle \qquad \min_{(x,w)}\ w \quad \text{subject to}\quad Ax - wr = b,\ x \ge 0, \tag{3}
\]
where $w \in R$ and $r \stackrel{\triangle}{=} Ax^0 - b$ is the initial residual. Note that $w$ is a free variable. The dual of $\langle F \rangle$ can be written as
\[
\langle DF \rangle \qquad \min\ b^t y \quad \text{subject to}\quad A^t y \ge 0,\ r^t y = 1. \tag{4}
\]
(Note that the sign of the free variable $y$ is taken to be opposite to the usual convention.) Obviously, $(x,w) = (x^0, 1)$ is an interior feasible solution of $\langle F \rangle$. Let $w^*$ be the optimal value of $\langle F \rangle$. Then the following proposition holds due to strict complementarity.

Proposition 2.1
1. If $w^* > 0$, then $\mathrm{Feas}\langle P \rangle \stackrel{\triangle}{=} \{ x \mid Ax = b,\ x \ge 0 \}$ is empty.
2. If $w^* = 0$, then $\mathrm{Feas}\langle P \rangle$ is nonempty but $\mathrm{Int\,Feas}\langle P \rangle \stackrel{\triangle}{=} \{ x \in \mathrm{Feas}\langle P \rangle \mid x > 0 \}$ is empty.
3. If $w^* < 0$, then $\mathrm{Int\,Feas}\langle P \rangle$ is nonempty.

The following proposition shows that the objective function $w$ of $\langle F \rangle$ is a function of $x$.

Proposition 2.2 There exist a vector $g \in R^n$ and a real $\bar{w}$ such that
\[
w - \bar{w} = g^t x \tag{5}
\]
for all $(x,w) \in \mathrm{Feas}\langle F \rangle$.

Proof: Let $\bar{x}$ be a solution of $Ax = b$ (such an $\bar{x}$ exists because $\mathrm{Rank}(A) = m$). Then we have $A(x - \bar{x}) = wr$, and this implies that $r^t A(x - \bar{x}) = w\|r\|^2$. From this relation, it is readily seen that (recall that since $x^0$ is infeasible for $\langle P \rangle$, $r \ne 0$)
\[
g = \frac{A^t r}{\|r\|^2}, \qquad \bar{w} = -\frac{r^t b}{\|r\|^2} \tag{6}
\]
satisfy (5). □
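As an illustration of Proposition 2.2, the quantities in (6) are directly computable from the problem data. The following sketch (assuming NumPy; the function name and variables are illustrative, not from the paper) evaluates $g$ and $\bar{w}$ for a given infeasible starting point $x^0$.

import numpy as np

def feasibility_objective(A, b, x0):
    """Return (g, w_bar) of Proposition 2.2 for the residual r = A x0 - b.

    Then w = g^t x + w_bar holds for every (x, w) feasible for <F>.
    A minimal sketch; it assumes r = A x0 - b is nonzero (x0 infeasible for <P>).
    """
    r = A @ x0 - b               # initial residual
    nrm2 = r @ r                 # ||r||^2
    g = A.T @ r / nrm2           # g = A^t r / ||r||^2
    w_bar = -(r @ b) / nrm2      # w_bar = -r^t b / ||r||^2
    return g, w_bar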

Therefore, we can eliminate $w$ from $\langle F \rangle$ to obtain the following equivalent standard form linear programming problem:
\[
\min_x\ g^t x \quad \text{subject to}\quad (A - rg^t)x = b,\ x \ge 0. \tag{7}
\]
Note that $\mathrm{Rank}(A - rg^t) = m - 1$ since
\[
A - rg^t = \Bigl(I - \frac{rr^t}{\|r\|^2}\Bigr) A. \tag{8}
\]
By discarding a redundant row of the system $(A - rg^t)x = b$, we obtain a full-rank matrix $\tilde{A} \in R^{(m-1)\times n}$ and a vector $\tilde{b} \in R^{m-1}$ such that $\{ x \mid (A - rg^t)x = b \} = \{ x \mid \tilde{A}x = \tilde{b} \}$. Now $\langle F \rangle$ is equivalent to
\[
\min_x\ g^t x \quad \text{subject to}\quad \tilde{A}x = \tilde{b},\ x \ge 0. \tag{9}
\]
Though we can eliminate $w$ from $\langle F \rangle$ to obtain (9), the use of $w$ in (3) still simplifies our discussion. Therefore, we mainly use the expression (3) in this paper. In some cases when we deal with the dual estimate, however, it is much easier to argue with the standard form (9), whose coefficient matrix $\tilde{A}$ has full rank; thus we use the expression (9) in those cases.

Now we introduce the search direction of the algorithm. As was mentioned in the introduction, the search direction is a mixture of the affine scaling directions for two linear programming problems aiming at feasibility and optimality. See, e.g., [6, 5, 16, 27, 26] for motivation and a detailed explanation of the affine scaling direction.

The feasibility direction is defined as the affine scaling direction for $\langle F \rangle$. The dual estimate for $\langle F \rangle$ is written as
\[
z(x) \stackrel{\triangle}{=} \mathop{\mathrm{argmin}}_{z = A^t y,\ r^t y = 1} \|Xz\| = \mathop{\mathrm{argmin}}_{z = g - \tilde{A}^t \tilde{y}} \|Xz\|, \tag{10}
\]
where $X = \mathrm{diag}(x)$ and $x \in \mathrm{Feas}\langle F \rangle$. Now $X^2 z(x)$ is the affine scaling direction for $\langle F \rangle$.

In terms of the projection matrix $P_{\tilde{A}X} \stackrel{\triangle}{=} X\tilde{A}^t(\tilde{A}X^2\tilde{A}^t)^{-1}\tilde{A}X$, the dual estimate $z(x)$ and the search direction $X^2 z(x)$ are written as follows:
\[
z(x) = g - \tilde{A}^t(\tilde{A}X^2\tilde{A}^t)^{-1}\tilde{A}X^2 g = X^{-1}(I - P_{\tilde{A}X})Xg, \tag{11}
\]
\[
X^2 z(x) = X(I - P_{\tilde{A}X})Xg. \tag{12}
\]
From this formula, we readily see that the affine scaling direction is a descent direction of the objective function $w = g^t x + \bar{w}$, and hence we use it to improve feasibility. It is easily seen that the associated displacement of $w$ is $g^t X^2 z(x) = \|Xz(x)\|^2$. Thus, $(X^2 z(x), \|Xz(x)\|^2)$ is the displacement of the feasibility direction in the space of $(x,w)$.


Next we introduce the optimality direction. For fixed $w$, we consider the following linear programming problem modified from $\langle P \rangle$:
\[
\langle P(w) \rangle \qquad \min\ c^t x \quad \text{subject to}\quad Ax = b + wr,\ x \ge 0, \tag{13}
\]
and the dual estimate
\[
s(x) \stackrel{\triangle}{=} \mathop{\mathrm{argmin}}_{s = c - A^t y} \|Xs\|. \tag{14}
\]
The direction $X^2 s(x)$ is the affine scaling direction for $\langle P(w) \rangle$. As the objective function of $\langle P(w) \rangle$ is $c^t x$, we expect that this direction reduces $c^t x$; thus we use it to improve optimality. In terms of the projection matrix $P_{AX} = XA^t(AX^2A^t)^{-1}AX$, the dual estimate $s(x)$ and the search direction $X^2 s(x)$ are written as follows:
\[
s(x) = c - A^t(AX^2A^t)^{-1}AX^2 c = X^{-1}(I - P_{AX})Xc, \tag{15}
\]
\[
X^2 s(x) = X(I - P_{AX})Xc. \tag{16}
\]
In this case, the displacement of $w$ is obviously zero. Thus $(X^2 s(x), 0)$ is the displacement vector of the optimality direction in the space of $(x,w)$.
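For concreteness, the scaled projections (12) and (16) can be evaluated with a standard least-squares solve instead of forming the projection matrices explicitly. The following sketch (assuming NumPy; function and variable names are illustrative) returns the dual estimates $z(x)$, $s(x)$ and the displacement vectors $X^2 z(x)$, $X^2 s(x)$.

import numpy as np

def scaled_dual_estimate(M, X, v):
    """Return the minimizer of ||X(v - M^t y)|| over y, expressed as v - M^t y.

    This equals X^{-1}(I - P_{MX}) X v with P_{MX} = X M^t (M X^2 M^t)^{-1} M X,
    but the projection matrix is never formed explicitly.  A sketch only.
    """
    y, *_ = np.linalg.lstsq((M @ X).T, X @ v, rcond=None)
    return v - M.T @ y

def affine_scaling_directions(A, A_tilde, g, c, x):
    """Feasibility direction X^2 z(x) of (12) and optimality direction X^2 s(x) of (16).

    A_tilde is the full-rank matrix of (9); g is the vector of Proposition 2.2.
    """
    X = np.diag(x)
    z = scaled_dual_estimate(A_tilde, X, g)   # dual estimate (10) for <F>
    s = scaled_dual_estimate(A, X, c)         # dual estimate (14) for <P(w)>
    return X @ (X @ z), X @ (X @ s), z, s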

By combining these two affine scaling directions, we define the search direction at $(x,w) \in \mathrm{Int\,Feas}\langle F \rangle$ by
\[
\begin{pmatrix} \Delta x \\ \Delta w \end{pmatrix}
= \begin{pmatrix} \dfrac{w X^2 z(x)}{\|Xz(x)\|^2} \\[2mm] w \end{pmatrix}
+ \lambda(x) \begin{pmatrix} \dfrac{X^2 s(x)}{\|Xs(x)\|} \\[2mm] 0 \end{pmatrix}
= \begin{pmatrix} \lambda(x)\dfrac{X^2 s(x)}{\|Xs(x)\|} + \dfrac{w X^2 z(x)}{\|Xz(x)\|^2} \\[2mm] w \end{pmatrix}, \tag{17}
\]
where $\Delta x$ is the $x$-component, $\Delta w$ is the $w$-component, and $\lambda(x)$ is a parameter which determines the strength of the direction for optimality. A concrete form of $\lambda(x)$ is specified later in this section. Here the feasibility direction $(X^2 z(x), \|Xz(x)\|^2)$ is multiplied by the factor $w/\|Xz(x)\|^2$. With this normalization, $Ax = b$ is satisfied exactly when a unit step is taken, i.e., we have $A(x - \Delta x) - b = r(w - \Delta w) = 0$. On the other hand, $X^2 s(x)/\|Xs(x)\|$ is the displacement vector of the short-step affine scaling algorithm (for $\langle P(w) \rangle$) [5, 6, 7, 22, 23].

The algorithm starts from the initial point $(x^0, 1)$ and generates a sequence of feasible points of $\langle F \rangle$. The generic form of the algorithm is as follows.

Initialize $x^0 > 0$, $w^0 = 1$, $k := 0$;
while $x^k$ does not satisfy the stopping criteria do
begin
    Compute $\Delta x^k$ and the step-size $\alpha^k$;
    $x^{k+1} := x^k - \alpha^k \Delta x^k$;
    $w^{k+1} := w^k - \alpha^k w^k$;
    $k := k + 1$
end
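A direct transcription of this generic form into runnable code might look as follows. This is only a sketch under stated assumptions (NumPy; illustrative names; the helper functions are those from the earlier sketches): lam and step_size stand for the choices of $\lambda(x)$ and $\alpha^k$ specified later in this section, and stopping when $w$ is small is one possible criterion, not the paper's prescription.

import numpy as np

def infeasible_affine_scaling(A, A_tilde, b, c, x0, lam, step_size,
                              tol=1e-8, max_iter=500):
    """Generic form of the algorithm; a sketch, not a reference implementation."""
    g, w_bar = feasibility_objective(A, b, x0)      # Proposition 2.2
    x, w = x0.copy(), 1.0
    for _ in range(max_iter):
        if w < tol:                                  # illustrative stopping criterion
            break
        X2z, X2s, z, s = affine_scaling_directions(A, A_tilde, g, c, x)
        Xz, Xs = x * z, x * s
        # search direction (17): optimality part plus normalized feasibility part
        dx = lam(Xs) * X2s / np.linalg.norm(Xs) + w * X2z / (Xz @ Xz)
        dw = w
        alpha = step_size(x, dx)                     # step-size choice, cf. (18)-(22)
        x, w = x - alpha * dx, w - alpha * dw
    return x, w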

To keep $\{x^k\}$ positive, the step-size $\alpha^k$ should satisfy
\[
\alpha^k < \gamma\bigl((X^k)^{-1}\Delta x^k\bigr)^{-1}, \tag{18}
\]
where $X^k \stackrel{\triangle}{=} \mathrm{diag}(x^k)$ and $\gamma(v) \stackrel{\triangle}{=} \max_j v_j$. If we choose $\alpha^k = 1$, then we have
\[
Ax^{k+1} = A(x^k - \Delta x^k) = b, \tag{19}
\]
which means that the next iterate satisfies the equality constraints of $\langle P \rangle$. Hence, if $\gamma((X^k)^{-1}\Delta x^k) < 1$, it makes sense to take the step-size $\alpha^k = 1$ to obtain an interior feasible solution of $\langle P \rangle$.

When viewed as a function of $x$, the direction (17) is continuous at $w(x) = 0$, because $\|X^2 z(x)\|/\|Xz(x)\|^2$ is bounded by a constant due to Proposition 3.3 in the next section. Once $x^k$ becomes feasible for $\langle P \rangle$ so that $w^k = 0$, we have
\[
\begin{pmatrix} \Delta x^k \\ \Delta w^k \end{pmatrix}
= \begin{pmatrix} \lambda(x^k)\dfrac{(X^k)^2 s(x^k)}{\|X^k s(x^k)\|} \\[2mm] 0 \end{pmatrix}, \tag{20}
\]
which means that $\Delta x^k$ is nothing but the (feasible) affine scaling direction (recall that $\langle P(0) \rangle$ is $\langle P \rangle$) and that our algorithm reduces to the feasible affine scaling algorithm.

In consideration of the above observation, we adopt the following step-size choice.

Step Size Choice: If $x^k$ is infeasible and $\beta^k \stackrel{\triangle}{=} \gamma((X^k)^{-1}\Delta x^k) < 1$, then choose $\alpha^k = 1$ (we find an interior feasible solution); else choose $\theta^k$ satisfying
\[
\theta_{\min} \le \theta^k < 1, \tag{21}
\]
and let
\[
\alpha^k = \theta^k/\beta^k, \tag{22}
\]
where $\theta_{\min}$ is a predefined positive constant.

The step-size choice (22) means moving the fraction $\theta^k$ of the way to the boundary of the positive orthant.
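In code, this rule might read as follows (a sketch with illustrative names: gamma is the componentwise maximum $\gamma(\cdot)$ of (18), and theta plays the role of $\theta^k$, taken constant here).

import numpy as np

def gamma(v):
    """gamma(v) = max_j v_j, as in (18) and (22)."""
    return np.max(v)

def step_size(x, dx, theta=0.9):
    """Step-size choice (21)-(22): a full step when it yields an interior feasible
    point, otherwise the fraction theta of the way to the boundary.  A sketch."""
    beta = gamma(dx / x)          # beta^k = gamma((X^k)^{-1} dx^k)
    if beta < 1.0:                # an interior feasible solution of <P> is reached
        return 1.0
    return theta / beta           # alpha^k = theta^k / beta^k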

If an iterate happens to be an interior feasible solution of $\langle P \rangle$, then the algorithm becomes identical to a long-step affine scaling algorithm [26], whose global convergence was proved for any $\theta^k < 1$ under a nondegeneracy assumption (see, e.g., [7], [10], [16]) and with $\theta^k \le 2/3$ without any kind of nondegeneracy assumption [25].

We propose two strategies for the choice of $\lambda$:

1. $\lambda(x)$ is a constant less than 1.

2. $\lambda(x) \stackrel{\triangle}{=} \lambda\|X^k s(x^k)\|/\gamma(X^k s(x^k))$, where $\lambda$ is a constant less than 1.

One of the important features of these strategies is that the resulting search direction $\Delta x$ is scaling invariant. Note also that the latter choice of $\lambda(x)$ is greater than the former. Intuitively speaking, with the first strategy, if $x^k$ is sufficiently close to the feasible region of $\langle P \rangle$ (so that the optimality direction dominates the feasibility direction), the algorithm is similar to the short-step affine scaling algorithm, while with the latter, the algorithm is more similar to the long-step affine scaling algorithm. Thus, with the second strategy, we expect that the iterates approach the optimal solution more quickly. In the following, we use the abbreviations $s^k$ and $z^k$ for $s(x^k)$ and $z(x^k)$.
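The two strategies might be coded as follows (a sketch; lam_const is the constant $\lambda < 1$ and Xs stands for $X^k s(x^k)$; names are illustrative).

import numpy as np

def lam_strategy1(Xs, lam_const=0.5):
    """Strategy 1: lambda(x) is a constant less than 1 (Xs is ignored)."""
    return lam_const

def lam_strategy2(Xs, lam_const=0.5):
    """Strategy 2: lambda(x) = lam_const * ||X s(x)|| / gamma(X s(x))."""
    return lam_const * np.linalg.norm(Xs) / np.max(Xs)

Either function can be passed as the lam argument of the loop sketched earlier, e.g. infeasible_affine_scaling(A, A_tilde, b, c, x0, lam_strategy2, step_size).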

In [17], Strategy 1 ($\lambda(x)$ is a constant less than 1) and $\theta^k \le 1/2$ are adopted, and global convergence of the algorithm is proved without any kind of nondegeneracy assumption. The convergence results proved in [17] are as follows.

Theorem 2.3 Let $\{x^k\}$ be a sequence generated by the algorithm with $\lambda(x^k) = \lambda \in (0,1)$ and $0 < \theta_{\min} \le \theta^k \le 1/2$, where $\theta_{\min}$ is a positive constant less than 1. Then the following situations occur depending on the feasibilities of the primal problem and the dual problem.

1. If both primal and dual are feasible ([C1]), $x^k$ converges to a relative interior point of the optimal face. In particular, if $\langle P \rangle$ has an interior feasible point, then $x^k$ becomes feasible in a finite number of iterations.

2. If primal is infeasible and dual is feasible ([C2]), $x^k$ converges to an infeasible point (more precisely, $\{(x^k, w^k)\}$ converges to a relative interior point of the optimal face of the linear programming problem (3) to find a feasible point of (1)).

3. If primal is feasible and dual is infeasible ([C3]), $c^t x^k$ diverges to minus infinity.

4. If both primal and dual are infeasible ([C4]), there are two possibilities: $x^k$ converges to an infeasible point which is characterized in the analysis of [C2], or $c^t x^k$ diverges to minus infinity.

Theorem 2.4 Under the same assumptions as Theorem 2.3, the following holds for the asymptotic behavior of the sequences $\{x^k\}$, $\{s^k\}$, and $\{z^k\}$.

1. If both primal and dual are feasible ([C1]) and primal has an interior feasible solution, then $\{s^k\}$ converges to the analytic center of the dual optimal face, so that $x^\infty$ and $s^\infty$ satisfy strict complementarity for $\langle P \rangle$ and $\langle DP \rangle$.

2. If both primal and dual are feasible ([C1]) but primal does not have an interior feasible solution, then $\{z^k\}$ converges to the analytic center of the optimal face of $\langle DF \rangle$, and for sufficiently large $M$, $x^\infty$ and every accumulation point of $\{s^k + Mz^k\}$ satisfy the strict complementarity condition for $\langle P \rangle$ and $\langle DP \rangle$.

3. If primal is infeasible and dual is feasible ([C2]), then $\{z^k\}$ converges to the analytic center of the optimal face of $\langle DF \rangle$.

In the subsequent sections, we will prove analogous convergence results under Strategy 2 with a suitable nondegeneracy assumption. Before proceeding, we point out that the search direction of this algorithm constitutes a vector field defined over the positive orthant, completely determined only by $A$, $b$ and $c$. This means that the algorithm can be viewed as a method which follows this vector field, which is continuous over the positive orthant (and ends up with an optimal solution of $\langle P \rangle$).


As was mentioned before, $w$ is a function of $x$, and hence we can write (17) as a function of $x$ only, eliminating the $w$-component. A straightforward calculation shows that the $x$-part of the displacement vector $\Delta x$ is written as
\[
\Delta x = \lambda(x)\frac{X(I - P_{AX})Xc}{\|(I - P_{AX})Xc\|} + XP_{AX}X^{-1}(x - \bar{x}), \tag{23}
\]
where $\bar{x}$ is any solution of $Ax = b$ (cf. the proof of Lemma 5.5). This means that the direction $\Delta x$ naturally defines a vector field over the interior of the positive orthant, as most directions of interior point algorithms do. Furthermore, we can even show that the vector field $\Delta x/\gamma(X^{-1}\Delta x)$ is continuous up to the boundary of the positive orthant. Thus, our algorithm is regarded as a method which follows discretely the vector field which flows into an optimal solution of $\langle P \rangle$.

3 The Nondegeneracy Assumption and Preliminary Observations

Now we focus our attention on Strategy 2:
\[
\lambda(x^k) = \lambda\,\frac{\|X^k s^k\|}{\gamma(X^k s^k)}, \tag{24}
\]
where $0 < \lambda < 1$ is a positive constant and $0 < \theta_{\min} \le \theta^k < 1$ ($\theta_{\min}$ is a positive constant less than 1), and prove global convergence under the following nondegeneracy assumptions.

Assumption 1 $\langle F \rangle$ is nondegenerate, i.e., at most $n - m + 1$ components of $x$ can be zero at any feasible solution of $\langle F \rangle$.

Assumption 2 If $\langle P \rangle$ has an interior feasible solution, i.e., there exists an $\tilde{x} > 0$ such that $A\tilde{x} = b$, then $\langle P \rangle$ is also nondegenerate, i.e., at most $n - m$ components of $x$ can be zero at any feasible solution of $\langle P \rangle$.

These assumptions imply the following propositions.

Proposition 3.1 Under Assumption 1, $z(x)$ is continuous over $\mathrm{Feas}\langle F \rangle$ including the boundary, or equivalently, $\tilde{A}X^2\tilde{A}^t$ is invertible for any feasible solution of $\langle F \rangle$.

Proposition 3.2 Under Assumption 2, if $\langle P \rangle$ has an interior feasible solution, then $s(x)$ is continuous over $\mathrm{Feas}\langle P \rangle$ including the boundary, or equivalently, $AX^2A^t$ is invertible for any feasible solution of $\langle P \rangle$.

We will prove results analogous to Theorems 2.3 and 2.4 for Strategy 2 under these assumptions. We prove the theorems in the following steps.

1. If $c^t x^k$ is bounded, then $x^k$ converges (Theorem 3.8).

2. If $x^k$ converges and $w^k$ does not converge to 0, then $w^\infty \stackrel{\triangle}{=} \lim_{k\to\infty} w^k > 0$ is the optimal value of $\langle F \rangle$, in which case $\langle P \rangle$ is not feasible (Theorem 4.1).

3. If $x^k$ converges and $\mathrm{Int\,Feas}\langle P \rangle$ is not empty, then the algorithm finds an interior feasible solution in a finite number of iterations (Theorem 5.1).

4. If $x^k$ converges to $x^\infty$ and $\mathrm{Feas}\langle P \rangle$ is nonempty but $\mathrm{Int\,Feas}\langle P \rangle$ is empty, then $x^\infty$ is a relative interior point of the optimal face of $\langle P \rangle$ (Theorems 5.4, 5.7 and 6.9).

The main results, Theorems 7.1 and 7.2, are described in Section 7 by summarizing these results.

We make some more conventional assumptions. If $\gamma(X^k s^k) \le 0$ occurs for some $k$, it means that $(X^k)^2 s^k$ is an infinite direction of $\langle P \rangle$, i.e., $\langle DP \rangle$ is not feasible. In this case, $\langle P \rangle$ does not have an optimal solution even if it is feasible, and thus we terminate the algorithm. Therefore, we assume
\[
\gamma(X^k s^k) > 0 \tag{25}
\]
for all $k$ in the analysis.

Also, if the algorithm finds an interior feasible solution of $\langle P \rangle$, then the algorithm becomes identical to the long-step affine scaling algorithm. Then it is well known that Theorems 2.3 and 2.4 hold true (see [7, 10, 16], e.g.). Hence we assume that $\{x^k\}$ is an infinite sequence of infeasible points. In other words, we assume
\[
\gamma\bigl((X^k)^{-1}\Delta x^k\bigr) \ge 1 \tag{26}
\]
for all $k$.

We summarize some of the basic properties of the affine scaling direction and the dual estimate for $\langle F \rangle$ and $\langle P \rangle$.

Proposition 3.3 (Theorem 2 of Tseng and Luo [21] and Theorem 2.6(a) of Monteiro et al. [16].) There exist positive constants $M_1$ and $M_2$ such that for any $x > 0$,
\[
\|X^2 s(x)\| \le M_1\|Xs(x)\|^2 \quad\text{and}\quad \|X^2 z(x)\| \le M_2\|Xz(x)\|^2. \tag{27}
\]

Proposition 3.4 (Lemma of Vanderbei and Lagarias [27] and Proposition 2.8 of Monteiro et al. [16].) For any $x > 0$, $z(x)$ and $s(x)$ are bounded.

Now we are ready to make some preliminary observations. Obviously, $(x^k, w^k)$ generated by the algorithm is feasible for $\langle F \rangle$, and $w^k$ is monotonically decreasing. If we succeed in reducing $w$ to 0, then we find a feasible solution of $\langle P \rangle$. In this sense, the $w$-component expresses infeasibility. Since $w^k$ is bounded below, $w^k$ has a limit $w^\infty \ge 0$.

Lemma 3.5 There exists a constant $\delta$ such that $(w^k - w^\infty)/\|X^k z^k\| \ge \delta > 0$.

Proof: Since
\[
\gamma\!\left( \lambda\frac{X^k s^k}{\gamma(X^k s^k)} + \frac{w^k X^k z^k}{\|X^k z^k\|^2} \right) \ge 1 \tag{28}
\]
(otherwise, we find a feasible point), we have
\[
\frac{w^k}{\|X^k z^k\|} \ge \frac{w^k}{\|X^k z^k\|^2}\,\gamma(X^k z^k) \ge 1 - \lambda > 0, \tag{29}
\]
where we used the fact that $\gamma(v) + \gamma(u) \ge \gamma(v + u)$ for any vectors $v$ and $u$. Then the lemma readily follows in the case where $w^\infty = 0$.

Next we consider the case where $w^\infty > 0$. Since
\[
\frac{w^{k+1} - w^\infty}{w^k - w^\infty} = 1 - \frac{\alpha^k w^k}{w^k - w^\infty} \ge 0, \tag{30}
\]
we have, by the definition of $\alpha^k$, that
\[
(w^k - w^\infty)\,\gamma\!\left( \lambda\frac{X^k s^k}{\gamma(X^k s^k)} + \frac{w^k X^k z^k}{\|X^k z^k\|^2} \right) \ge \theta^k w^k \ge \theta_{\min} w^\infty > 0. \tag{31}
\]
Let $K$ be a number such that $w^k - w^\infty \le \theta_{\min} w^\infty/(2\lambda)$ for all $k \ge K$. For $k \ge K$ we have, in view of (31) and $w^k \le w^0 = 1$,
\[
\frac{w^k - w^\infty}{\|X^k z^k\|}
\ge \frac{(w^k - w^\infty)\,\gamma(X^k z^k)}{\|X^k z^k\|^2}
\ge \frac{\theta_{\min} w^\infty - (w^k - w^\infty)\lambda}{w^k}
\ge \frac{\theta_{\min} w^\infty}{2}. \tag{32}
\]
By letting $\delta_1 \stackrel{\triangle}{=} \min_{k<K}(w^k - w^\infty)/\|X^k z^k\|$ and $\delta = \min(\delta_1, \theta_{\min} w^\infty/2)$, the lemma also holds in this case. □

Corollary 3.6 We have $\|X^k z^k\| \to 0$.

The following lemma plays a crucial role in the proof of the convergence of the sequence $\{x^k\}$.

Lemma 3.7 There exist constants $M_1$ and $M_2$ such that, for all $p$,
\[
\sum_{k=p}^{\infty} \|x^{k+1} - x^k\| \le M_1 \sum_{k=p}^{\infty} (c^t x^k - c^t x^{k+1}) + M_2 (w^p - w^\infty). \tag{33}
\]

Proof: In view of (27) and $c^t(X^k)^2 s^k = \|X^k s^k\|^2$, we have
\[
\sum_{k=p}^{\infty} (c^t x^{k+1} - c^t x^k)
= -\sum_{k=p}^{\infty} \left( \alpha^k \lambda(x^k)\|X^k s^k\| + \alpha^k w^k \frac{c^t (X^k)^2 z^k}{\|X^k z^k\|^2} \right) \tag{34}
\]
\[
\le -\sum_{k=p}^{\infty} \alpha^k \lambda(x^k)\|X^k s^k\| + \sum_{k=p}^{\infty} \alpha^k w^k \|c\| M_2. \tag{35}
\]
Thus, noting that $\sum_{k=p}^{\infty} \alpha^k w^k = \sum_{k=p}^{\infty} (w^k - w^{k+1}) = w^p - w^\infty$, we have
\[
\sum_{k=p}^{\infty} \alpha^k \lambda(x^k)\|X^k s^k\| \le \sum_{k=p}^{\infty} (c^t x^k - c^t x^{k+1}) + \|c\| M_2 (w^p - w^\infty). \tag{36}
\]
This produces
\[
\begin{aligned}
\sum_{k=p}^{\infty} \|x^{k+1} - x^k\|
&\le \sum_{k=p}^{\infty} \alpha^k \lambda(x^k)\frac{\|(X^k)^2 s^k\|}{\|X^k s^k\|}
   + \sum_{k=p}^{\infty} \alpha^k w^k \frac{\|(X^k)^2 z^k\|}{\|X^k z^k\|^2} \\
&\le M_1 \sum_{k=p}^{\infty} \alpha^k \lambda(x^k)\|X^k s^k\| + M_2 \sum_{k=p}^{\infty} \alpha^k w^k
   \quad \text{(use Proposition 3.3)} \\
&\le M_1 \sum_{k=p}^{\infty} (c^t x^k - c^t x^{k+1}) + (\|c\| M_1 M_2 + M_2)(w^p - w^\infty), \tag{37}
\end{aligned}
\]
which completes the proof. □

Theorem 3.8 If $c^t x^k$ is bounded below, then $x^k$ converges.

Proof: Since the right-hand side of (33) is bounded under the assumption, the theorem readily follows. □

Corollary 3.9 We have
\[
\sum_{k=0}^{\infty} \alpha^k \|X^k s^k\| < \infty. \tag{38}
\]

Proof: Straightforward from (36), $\lambda(x^k) \ge \lambda$, and the assumption that $c^t x^k$ is bounded below. □

If $c^t x^k$ is not bounded below, then $\langle DP \rangle$ does not have a feasible solution (independently of the feasibility of $\langle P \rangle$). Thus, we assume from now on that $c^t x^k$ is bounded below, or equivalently, due to Theorem 3.8, that $x^k$ converges to $x^\infty$. We define the index sets $N$ and $B$ by $x^\infty_N = 0$ and $x^\infty_B > 0$. By Proposition 3.1 and Corollary 3.6, we see that $z^k$ also converges to $z^\infty$ and $z^\infty_B = 0$. This implies that $x^\infty$ is contained in a constant-cost face of $\langle F \rangle$, i.e., a face of $\langle F \rangle$ on which $w$ is constant. (To see this, recall that for every $k$, $z^k = g - \tilde{A}^t u^k$ for some $u^k$, because $z^k$ is determined by (10) with $x = x^k$. Taking the limit of both sides, we see that $z^\infty = g - \tilde{A}^t u^\infty$ for some $u^\infty$. For any $\tilde{x}$ in the face $\{ x \in \mathrm{Feas}\langle F \rangle \mid x_N = 0 \}$, we have $g^t \tilde{x} = (z^\infty + \tilde{A}^t u^\infty)^t \tilde{x} = (u^\infty)^t \tilde{A}\tilde{x} = (u^\infty)^t \tilde{b}$, which implies that $w = g^t \tilde{x} + \bar{w}$ is constant on the face.)

The following property of the dual sequence $\{z^k\}$ is frequently used in the subsequent analysis.

Lemma 3.10 Assume that $z(x^k) \to z^\infty$ with $z^\infty_J \ne 0$ and $z^\infty_{J^c} = 0$. Then there exists a positive constant $M$ such that
\[
\|z^k - z^\infty\| \le M \|x^k_J\|^2. \tag{39}
\]

Proof: By the definition (10), we see that $g - z^\infty \in \mathrm{Im}(\tilde{A}^t)$, which implies that $z^k$ is a solution of
\[
\min\ \|X^k z\|^2 \quad \text{subject to}\quad z = z^\infty - \tilde{A}^t y. \tag{40}
\]
Therefore, we have
\[
z^k = z^\infty - \tilde{A}^t(\tilde{A}(X^k)^2\tilde{A}^t)^{-1}\tilde{A}(X^k)^2 z^\infty
    = \begin{pmatrix} z^\infty_J \\ 0 \end{pmatrix}
      - \tilde{A}^t(\tilde{A}(X^k)^2\tilde{A}^t)^{-1}\tilde{A}
        \begin{pmatrix} (X^k_J)^2 z^\infty_J \\ 0 \end{pmatrix}. \tag{41}
\]
Due to Proposition 3.1, the lemma readily follows. □

Corollary 3.11 Assume that $z(x^k) \to z^\infty$ with $z^\infty_J \ne 0$ and $z^\infty_{J^c} = 0$. Then there exist a number $K$ and a positive constant $M$ such that for $k \ge K$,
\[
\frac{\|z^k_{J^c}\|}{\|X^k z^k\|^2} \le M. \tag{42}
\]

Proof: Due to Lemma 3.10, there exists a positive constant $M_1$ such that $\|z^k_{J^c}\| \le M_1\|x^k_J\|^2$. We see that for $j \in J$,
\[
|z^k_j| \ge \min_{i \in J} |z^\infty_i|/2 > 0 \tag{43}
\]
holds if $k$ is sufficiently large. For such $k$, we have
\[
\|x^k_J\|^2 = \sum_{j \in J} (x^k_j)^2
\le \sum_{j \in J} \frac{4 (z^k_j x^k_j)^2}{(\min_{i \in J} |z^\infty_i|)^2}
\le \frac{4 \|X^k_J z^k_J\|^2}{(\min_{i \in J} |z^\infty_i|)^2}
\le \frac{4 \|X^k z^k\|^2}{(\min_{i \in J} |z^\infty_i|)^2}. \qquad\Box \tag{44}
\]

Next we derive the linear convergence of $w^k$ to $w^\infty$.

Lemma 3.12 Let $\varepsilon$ be any positive constant. If $x^k \to x^\infty$, then, for sufficiently large $k$, we have
\[
\frac{w^k - w^\infty}{\|X^k z^k\|} = \|P_{AX^k}(X^k)^{-1}(x^k - x^\infty)\| \le \sqrt{|N|} + \varepsilon, \tag{45}
\]
where $P_{AX^k} \stackrel{\triangle}{=} X^k A^t (A(X^k)^2 A^t)^{-1} A X^k$.

Proof: From $Ax^k - b = w^k r$ and $Ax^\infty - b = w^\infty r$, we have
\[
r = \frac{A(x^k - x^\infty)}{w^k - w^\infty}. \tag{46}
\]
By the definition and the above relation, $z^k$ is a solution of
\[
\min\ \|X^k z\| \quad \text{subject to}\quad z = A^t y,\ (x^k - x^\infty)^t A^t y = w^k - w^\infty. \tag{47}
\]
Solving the Karush-Kuhn-Tucker conditions of the above problem, we have
\[
z^k = (w^k - w^\infty)\,\frac{A^t(A(X^k)^2 A^t)^{-1} A(x^k - x^\infty)}{\|P_{AX^k}(X^k)^{-1}(x^k - x^\infty)\|^2}. \tag{48}
\]
This implies that
\[
\|X^k z^k\| = \frac{(w^k - w^\infty)\,\|P_{AX^k}(X^k)^{-1}(x^k - x^\infty)\|}{\|P_{AX^k}(X^k)^{-1}(x^k - x^\infty)\|^2}
            = \frac{w^k - w^\infty}{\|P_{AX^k}(X^k)^{-1}(x^k - x^\infty)\|}. \tag{49}
\]
Therefore, we have
\[
\frac{w^k - w^\infty}{\|X^k z^k\|} = \|P_{AX^k}(X^k)^{-1}(x^k - x^\infty)\|
\le \left\| \begin{pmatrix} e \\ (X^k_B)^{-1}(x^k_B - x^\infty_B) \end{pmatrix} \right\|
\le \sqrt{|N|} + \varepsilon \tag{50}
\]
when $k$ is sufficiently large, since $\lim_{k\to\infty} x^k_B = x^\infty_B > 0$. This completes the proof. □

Lemma 3.13 If $x^k$ converges, then we have $\alpha^k w^k/(w^k - w^\infty) \ge \theta_{\min}/(\sqrt{|N|} + 1)$ for sufficiently large $k$.

Proof: In view of Lemma 3.12, we have
\[
\frac{\alpha^k w^k}{w^k - w^\infty}
= \frac{\theta^k w^k}{(w^k - w^\infty)\,\gamma\!\left( \lambda\dfrac{X^k s^k}{\gamma(X^k s^k)} + \dfrac{w^k X^k z^k}{\|X^k z^k\|^2} \right)}
\ge \frac{\theta_{\min}}{\dfrac{(w^k - w^\infty)\lambda}{w^k} + \dfrac{(w^k - w^\infty)\,\gamma(X^k z^k)}{\|X^k z^k\|^2}}
\ge \frac{\theta_{\min}}{\lambda + \bigl(\sqrt{|N|} + (1 - \lambda)\bigr)} > 0 \tag{51}
\]
for sufficiently large $k$, which completes the proof. □

Corollary 3.14 $w^k$ converges linearly to $w^\infty$ with a reduction rate of $1 - \theta_{\min}/(\sqrt{|N|} + 1)$.

4 The Case $w^\infty > 0$

In this section, we treat the case where $w^\infty > 0$, and prove that $\langle P \rangle$ is infeasible in this case. We have the following theorem.

Theorem 4.1 If $x^k \to x^\infty$ and $w^\infty > 0$, then $x^\infty$ is an optimal solution of $\langle F \rangle$ and $z^\infty$ is an optimal solution of $\langle DF \rangle$, and $x^\infty$ and $z^\infty$ satisfy strict complementarity.

Proof: We prove the theorem by contradiction. Suppose that there exists an index $j \in N$ such that $z^\infty_j \le 0$.

We first show that there exists a positive constant $M$ such that
\[
\frac{z^k_j}{\|X^k z^k\|^2} \le M. \tag{52}
\]
If $z^\infty_j < 0$, this relation is obvious since the left-hand side is negative for sufficiently large $k$. If $z^\infty_j = 0$, Corollary 3.11 directly produces (52).

By dividing both sides of the iteration by $x^k_j$ and by using (52) and $\alpha^k w^k/(w^k - w^\infty) \le 1$, we obtain
\[
\frac{x^{k+1}_j}{x^k_j}
= 1 - \frac{\alpha^k w^k}{w^k - w^\infty}
  \left( \frac{\lambda x^k_j s^k_j}{w^k \gamma(X^k s^k)} + \frac{x^k_j z^k_j}{\|X^k z^k\|^2} \right)(w^k - w^\infty)
\ge 1 - (\lambda/w^\infty + M x^k_j)(w^k - w^\infty). \tag{53}
\]
Since $w^k - w^\infty \to 0$, there exists a number $K$ such that $(\lambda/w^\infty + M x^k_j)(w^k - w^\infty) < 1/2$ for all $k \ge K$. By using the inequality
\[
\log(1 - \xi) \ge -2\xi \log 2 \qquad (0 \le \xi \le 1/2), \tag{54}
\]
we have
\[
\sum_{k=K}^{l} \log\frac{x^{k+1}_j}{x^k_j}
\ge \sum_{k=K}^{l} \log\bigl(1 - (\lambda/w^\infty + M x^k_j)(w^k - w^\infty)\bigr)
\ge -\sum_{k=K}^{l} (2\log 2)(\lambda/w^\infty + M x^k_j)(w^k - w^\infty). \tag{55}
\]
Now if we let $l \to \infty$, the right-hand side remains finite due to the linear convergence of $w^k - w^\infty$ (Corollary 3.14), while the left-hand side diverges to minus infinity because $j \in N$. This is a contradiction, which completes the proof. □

5 The Case $w^\infty = 0$

In this section, we treat the case where $w^\infty = 0$.

Theorem 5.1 If $x^k \to x^\infty$ and $w^k \to 0$, and if $\langle P \rangle$ has interior feasible solutions, then the algorithm finds an interior feasible solution of $\langle P \rangle$ in a finite number of iterations.

Proof: From Lemma 3.12, we see that
\[
\frac{w^k}{\|X^k z^k\|} = \|X^k A^t (A(X^k)^2 A^t)^{-1} A (x^k - x^\infty)\|, \tag{56}
\]
which converges to zero since $\lim_{k\to\infty} (A(X^k)^2 A^t)^{-1}$ exists due to Proposition 3.2. Therefore,
\[
\gamma\!\left( \lambda\frac{X^k s^k}{\gamma(X^k s^k)} + \frac{w^k X^k z^k}{\|X^k z^k\|^2} \right)
\le \lambda + \frac{w^k}{\|X^k z^k\|} \tag{57}
\]
is strictly smaller than 1 for sufficiently large $k$. This implies that by taking the step-size $\alpha^k = 1$, an interior feasible solution is found in a finite number of iterations. □

Next we consider the case where $\langle P \rangle$ does not have an interior feasible solution. In this case, the feasible region of $\langle P \rangle$ is an optimal face of $\langle F \rangle$, and there exists an index set $F$ which is always active on $\mathrm{Feas}\langle P \rangle$. By strict complementarity, we have an optimal solution $z^*$ of $\langle DF \rangle$ satisfying $z^*_F > 0$ and $z^*_{F^c} = 0$, which is unique due to Proposition 3.1. In fact, we have the following property of $z^k$.

Proposition 5.2 $z^\infty = z^*$.

Proof: Due to Proposition 3.1, the dual estimate $z(x)$ is defined everywhere on $\mathrm{Feas}\langle F \rangle$; in particular, $z(x) = z^*$ if $x$ is optimal for $\langle F \rangle$. Since $x^\infty$ is optimal for $\langle F \rangle$, we have the proposition. □

Due to Corollary 3.6, we see that $z^*_B = 0$, i.e., $F \subseteq N$. Furthermore, we have
\[
w^k = (z^*_F)^t x^k_F \tag{58}
\]
since, expressing $z^* = g - \tilde{A}^t \tilde{u}$ for some $\tilde{u}$,
\[
w^k = g^t(x^k - x^\infty) = (z^* + \tilde{A}^t \tilde{u})^t (x^k - x^\infty)
    = (z^*_F)^t x^k_F + \tilde{u}^t \tilde{A}(x^k - x^\infty) = (z^*_F)^t x^k_F. \tag{59}
\]

For the dual estimate for optimality, the following property is easily seen.

Proposition 5.3 We have $\|X^k s^k\| \to 0$ and $s^k_B \to 0$.

Proof: Lemma 3.13 implies that there exists a positive constant $\bar{\alpha}$ such that $\alpha^k \ge \bar{\alpha}$ for all sufficiently large $k$. Thus Corollary 3.9 implies $\|X^k s^k\| \to 0$. Since $x^k_B \to x^\infty_B > 0$, we see that $s^k_B \to 0$. □

The following theorem deals with the special case where $F = N$.

Theorem 5.4 If $F = N$, then $x^\infty$ is an optimal solution of $\langle P \rangle$, and $x^\infty$ and any accumulation point of $\{s^k + Mz^k\}$, where $M$ is a sufficiently large scalar, satisfy the strict complementarity condition for $\langle P \rangle$ and $\langle DP \rangle$. In this case, the objective function $c^t x$ is constant on $\mathrm{Feas}\langle P \rangle$.

Proof: Let $\bar{s}$ be an accumulation point of $\{s^k\}$, whose existence is ensured by Proposition 3.4. We see from Proposition 5.3 that $\bar{s}_B = 0$. Since $z^* \in \mathrm{Im}(A^t)$, $\bar{s} + Mz^*$ satisfies the equality constraint of $\langle DP \rangle$ for an arbitrary scalar $M$; thus, if $M$ is sufficiently large, it becomes feasible for $\langle DP \rangle$. Since $\bar{s}_B + Mz^*_B = 0$ and $x^\infty_N = 0$, these are optimal solutions of $\langle DP \rangle$ and $\langle P \rangle$, respectively, satisfying strict complementarity.

Let $\bar{y}$ be a solution of $\bar{s} + Mz^* = c - A^t \bar{y}$. Since $x_F = x_N = 0$ for any $x \in \mathrm{Feas}\langle P \rangle$, we have
\[
c^t x = (\bar{s} + Mz^* + A^t \bar{y})^t x = \bar{y}^t b, \tag{60}
\]
which shows that the objective function is constant on $\mathrm{Feas}\langle P \rangle$. □

This theorem implies that if $F = N$, then the optimal face of $\langle P \rangle$ is identical to the feasible region of $\langle P \rangle$, which is the optimal face of $\langle F \rangle$; thus $x^\infty$ is also a relative interior point of the optimal face of $\langle F \rangle$.

Now we assume that $F$ is a proper subset of $N$ in the following analysis.

Though Proposition 5.3 implies that $s^k_B$ converges to 0, the sequence of dual estimates $\{s^k\}$ itself does not converge in general. Roughly speaking, the difficulty of the analysis lies in this point. To overcome this, we introduce the following linear programming problem:
\[
\langle \tilde{P} \rangle \qquad \min_{(x,w)}\ c^t x + 0 \cdot w \quad \text{subject to}\quad Ax - wr = b,\ x \ge 0. \tag{61}
\]
Let us consider the dual estimate $\tilde{s}(x)$ for the above problem, which is defined as
\[
\tilde{s}(x) \stackrel{\triangle}{=} \mathop{\mathrm{argmin}}_{\tilde{s} = c - \tilde{A}^t y} \|X\tilde{s}\|. \tag{62}
\]
We abbreviate $\tilde{s}(x^k)$ by $\tilde{s}^k$. The sequence $\{\tilde{s}^k\}$ converges since Assumption 1 implies that $\langle \tilde{P} \rangle$ is nondegenerate. Let $\tilde{s}^\infty$ be the limit. Now we have an important relation.

Lemma 5.5 We have
\[
\tilde{s}^k = s^k + (r^t y^k) z^k, \tag{63}
\]
where $s^k = c - A^t y^k$ is the dual estimate for $\langle P \rangle$ and $y^k = (A(X^k)^2 A^t)^{-1} A(X^k)^2 c$.

Proof: Putting
\[
\tilde{P} \stackrel{\triangle}{=} I - \frac{rr^t}{r^t r}, \tag{64}
\]
we see, in view of (8), that $\tilde{s}^k$ is the optimal solution of
\[
\min\ \|X^k s\|^2 \quad \text{subject to}\quad s = c - A^t \tilde{P} y. \tag{65}
\]
From the Karush-Kuhn-Tucker conditions, we have
\[
\tilde{P} A (X^k)^2 (c - A^t \tilde{P} y) = 0. \tag{66}
\]
Since $\tilde{P}$ is the projection matrix onto $\mathrm{Null}(r^t)$, putting $v \stackrel{\triangle}{=} \tilde{P} y$ we have
\[
A(X^k)^2 A^t v = A(X^k)^2 c + \sigma r, \tag{67}
\]
where $\sigma$ is a scalar, from which
\[
v = (A(X^k)^2 A^t)^{-1}\bigl(A(X^k)^2 c + \sigma r\bigr) = y^k + \sigma (A(X^k)^2 A^t)^{-1} r \tag{68}
\]
follows. Now since $r^t v = 0$, we have
\[
\sigma = \frac{-r^t y^k}{r^t (A(X^k)^2 A^t)^{-1} r} \tag{69}
\]
and
\[
\tilde{s}^k = c - A^t v = c - A^t y^k + r^t y^k\,\frac{A^t (A(X^k)^2 A^t)^{-1} r}{r^t (A(X^k)^2 A^t)^{-1} r}. \tag{70}
\]
An argument analogous to the above, applied with $g$ of (6) in place of $c$, produces
\[
z^k = \frac{A^t (A(X^k)^2 A^t)^{-1} r}{r^t (A(X^k)^2 A^t)^{-1} r}. \tag{71}
\]
From this and (70), we have (63). □

Since $s^k$ is bounded due to Proposition 3.4, $r^t y^k$ is also bounded. The relation (63) implies that the limit set of $\{s^k\}$ lies on a line whose direction is $z^*$. This observation plays an important role in the sequel.

From the facts $s^k_B \to 0$ and $z^k_B \to 0$, and since $r^t y^k$ is bounded,
\[
\tilde{s}^\infty_B = 0 \tag{72}
\]
follows. We will prove the following lemma.

Lemma 5.6 For $j \in N \cap F^c$, we have $\tilde{s}^\infty_j \ge 0$.

This lemma implies the following theorem.

Theorem 5.7 $x^\infty$ and
\[
s^* \stackrel{\triangle}{=} \tilde{s}^\infty + \frac{\max_{j \in F} |\tilde{s}^\infty_j| + 1}{\min_{j \in F} z^*_j}\, z^* \tag{73}
\]
are optimal solutions of $\langle P \rangle$ and $\langle DP \rangle$, respectively.

Proof: We show that $x^\infty$ and $s^*$ satisfy a complementarity condition. Since $z^* \in \mathrm{Im}(A^t)$ and (63) holds, $s^*$ satisfies the equality condition of $\langle DP \rangle$. From (72) and the fact that $z^*_B = 0$, we see that $s^*_B = 0$. Furthermore, Lemma 5.6 implies that $s^*_{F^c \cap N} \ge 0$; thus what we need to show is $s^*_F \ge 0$. This is easily obtained since for $j \in F$,
\[
\tilde{s}^\infty_j + \frac{\max_{j \in F} |\tilde{s}^\infty_j| + 1}{\min_{j \in F} z^*_j}\, z^*_j
\ge \tilde{s}^\infty_j + \max_{j \in F} |\tilde{s}^\infty_j| + 1 > 0. \qquad\Box \tag{74}
\]

Proof of Lemma 5.6: We can write the iteration by using $\tilde{s}^k$ as
\[
x^{k+1} = x^k - \alpha^k \frac{\lambda (X^k)^2}{\gamma(X^k s^k)}
\left\{ \tilde{s}^k + \left( -r^t y^k + \frac{w^k \gamma(X^k s^k)}{\lambda \|X^k z^k\|^2} \right) z^k \right\}. \tag{75}
\]
First we note that since $z^k_F \to z^*_F > 0$, we have $z^k_F > 0$ for sufficiently large $k$.

We prove the lemma by contradiction. Suppose to the contrary that there exists an index $j \in N$ such that $z^*_j = z^\infty_j = 0$ and $\tilde{s}^\infty_j < 0$. Due to Corollary 3.11 and the fact that $r^t y^k$ is bounded (this follows from Proposition 3.4 and the fact that $y^k$ is uniquely determined by $s^k$), we have
\[
\left| -r^t y^k + \frac{w^k \gamma(X^k s^k)}{\lambda \|X^k z^k\|^2} \right| |z^k_j|
\le |r^t y^k z^k_j| + M w^k \gamma(X^k s^k) \to 0, \tag{76}
\]
where $M$ is a positive constant. Since $\tilde{s}^\infty_j < 0$ by assumption, $\tilde{s}^k_j \le \tilde{s}^\infty_j/2 < 0$ holds for sufficiently large $k$. In view of (75), this and the convergence (76) imply that $\Delta x^k_j$ is negative for all such $k$. This contradicts the fact that $x^k_j \to 0$. □

6 Strict Complementarity

Now we prove that $x^\infty$ and $s^*$ satisfy the strict complementarity condition. First we observe some important properties of the sequence. Note that throughout this section it is assumed that $F$ is a proper subset of $N$.

Lemma 6.1 For $j \in N - F$, $w^k/x^k_j$ converges to 0 asymptotically linearly.

Proof: Corollary 3.11 produces, for $j \in N - F = N \cap F^c$,
\[
\frac{w^k x^k_j |z^k_j|}{\|X^k z^k\|^2} \le M x^k_j w^k \to 0, \tag{77}
\]
where $M$ is a positive constant. Thus we have, for such $j$,
\[
\frac{x^{k+1}_j}{x^k_j}
= 1 - \alpha^k \left( \frac{\lambda x^k_j s^k_j}{\gamma(X^k s^k)} + \frac{w^k x^k_j z^k_j}{\|X^k z^k\|^2} \right)
\ge 1 - \alpha^k (1 - \delta) \tag{78}
\]
for sufficiently large $k$ and a small constant $\delta > 0$. Therefore, we have
\[
\frac{w^{k+1}/x^{k+1}_j}{w^k/x^k_j} = \frac{w^{k+1}/w^k}{x^{k+1}_j/x^k_j}
\le \frac{1 - \alpha^k}{1 - \alpha^k(1 - \delta)}. \tag{79}
\]
This shows the asymptotic linear convergence of $w^k/x^k_j$, since $\alpha^k$ is bounded below by a positive constant due to Lemma 3.13. □

Lemma 6.2 There exists a positive constant $\nu$ such that
\[
\frac{c^t x^k - c^t x^\infty}{\|x^k_N\|} \ge \nu \tag{80}
\]
for sufficiently large $k$.

Proof: Lemma 3.7 implies that there exist positive constants $M_1$ and $M_2$ such that
\[
\|x^k_N\| \le \|x^k - x^\infty\| \le \sum_{l=k}^{\infty} \|x^{l+1} - x^l\|
\le M_1 (c^t x^k - c^t x^\infty) + M_2 w^k. \tag{81}
\]
Dividing both sides by $\|x^k_N\|$ and noting that $w^k/\|x^k_N\| \to 0$ due to Lemma 6.1, the lemma readily follows. □

Let $G$ be the index set such that
\[
\tilde{s}^\infty_G \ne 0, \qquad \tilde{s}^\infty_{G^c} = 0. \tag{82}
\]
Noting that $G \subseteq N$ in view of (72), the following relation is readily obtained.

Proposition 6.3 We have $c^t x^k - c^t x^\infty = (\tilde{s}^\infty_G)^t x^k_G + w^k (\tilde{y}^\infty)^t r$, where $\tilde{y}^\infty$ is the unique solution of $\tilde{s}^\infty = c - A^t y$.

Proof: This is easily obtained since
\[
c^t(x^k - x^\infty) = (\tilde{s}^\infty + A^t \tilde{y}^\infty)^t (x^k - x^\infty)
= (\tilde{s}^\infty_G)^t x^k_G + (\tilde{y}^\infty)^t A(x^k - x^\infty)
= (\tilde{s}^\infty_G)^t x^k_G + w^k (\tilde{y}^\infty)^t r. \qquad\Box \tag{83}
\]

We show that $G \not\subseteq F$. Indeed, if $G \subseteq F$, then
\[
c^t x^k - c^t x^\infty = (\tilde{s}^\infty_G)^t x^k_G + w^k(\tilde{y}^\infty)^t r
\le \bigl( \|\tilde{s}^\infty_G\| + |r^t \tilde{y}^\infty|\,\|z^*_F\| \bigr)\|x^k_F\|, \tag{84}
\]
from which, together with Lemma 6.2,
\[
\|x^k_N\| \le M\|x^k_F\| \le \frac{M (z^*_F)^t x^k_F}{\min_{j \in F} z^*_j} = \frac{M w^k}{\min_{j \in F} z^*_j} \tag{85}
\]
follows for a positive constant $M$. On the other hand, from Lemma 6.1 we have $w^k/\|x^k_N\| \to 0$, which contradicts (85) because $F$ is a proper subset of $N$. Therefore, we have $G \not\subseteq F$.

Proposition 6.4 We have $\|x^k_F\|/\|x^k_G\| \to 0$.

Proof: Due to Lemma 6.1, we have
\[
\frac{\|x^k_F\|}{\|x^k_G\|} \le \frac{(z^*_F)^t x^k_F}{(\min_{j \in F} z^*_j)\,\|x^k_G\|}
= \frac{w^k}{(\min_{j \in F} z^*_j)\,\|x^k_G\|} \to 0. \qquad\Box \tag{86}
\]

Lemma 6.5 There exist a number $K$ and a positive constant $\xi$ such that for $k \ge K$,
\[
\frac{(s^k_G)^t x^k_G}{\|x^k_G\|} \ge \xi \tag{87}
\]
holds.

Proof: Lemma 6.1, Lemma 6.2, and Proposition 6.3 imply that there exists a positive constant $\xi_1$ such that
\[
\frac{(\tilde{s}^\infty_G)^t x^k_G}{\|x^k_G\|}
\ge \frac{c^t x^k - c^t x^\infty - w^k (\tilde{y}^\infty)^t r}{\|x^k_N\|} \ge \xi_1 \tag{88}
\]
holds for sufficiently large $k$. Since
\[
\frac{|(\tilde{s}^k_G - \tilde{s}^\infty_G)^t x^k_G|}{\|x^k_G\|} \le \|\tilde{s}^k_G - \tilde{s}^\infty_G\| \to 0, \tag{89}
\]
we see from (88) that there exists a positive constant $\xi_2$ such that
\[
\frac{(\tilde{s}^k_G)^t x^k_G}{\|x^k_G\|} \ge \xi_2 \tag{90}
\]
holds for sufficiently large $k$. Now we have
\[
\frac{(s^k_G)^t x^k_G}{\|x^k_G\|}
= \frac{(\tilde{s}^k_G - (r^t y^k) z^k_G)^t x^k_G}{\|x^k_G\|}
\ge \xi_2 - |r^t y^k|\,\frac{(z^k_F)^t x^k_F + |(z^k_{G\setminus F})^t x^k_{G\setminus F}|}{\|x^k_G\|}
\ge \xi_2 - |r^t y^k|\left( \frac{\|z^k_F\|\,\|x^k_F\|}{\|x^k_G\|} + \|z^k_{G\setminus F}\| \right)
\ge \xi_2/2 > 0 \quad \text{(use Proposition 6.4)} \tag{91}
\]
for sufficiently large $k$, which completes the proof. □

Corollary 6.6 There exists a positive constant $\xi$ such that
\[
\frac{\gamma(X^k_G s^k_G)}{\|x^k_G\|} \ge \xi \tag{92}
\]
for sufficiently large $k$.

Proof: Let $\eta \stackrel{\triangle}{=} X^k_G s^k_G/\|x^k_G\|$. Lemma 6.5 shows that for sufficiently large $k$ there exists a positive constant $\xi_1$ such that $e^t \eta = (x^k_G)^t s^k_G/\|x^k_G\| \ge \xi_1$. This implies $\gamma(\eta) \ge \xi_1/|G|$ for such $k$. □

Theorem 6.7 $c^t x^k - c^t x^\infty$ converges to 0 asymptotically linearly.

Proof: Due to Lemma 6.2, we see that $c^t x^k - c^t x^\infty$ is positive for sufficiently large $k$ and that
\[
\frac{c^t x^{k+1} - c^t x^\infty}{c^t x^k - c^t x^\infty}
= 1 - \frac{\alpha^k \lambda \|X^k s^k\|^2/\gamma(X^k s^k)}{c^t x^k - c^t x^\infty}
    - \frac{\alpha^k w^k\, c^t (X^k)^2 z^k/\|X^k z^k\|^2}{c^t x^k - c^t x^\infty}
\le 1 - \frac{\alpha^k \lambda \|X^k s^k\|}{c^t x^k - c^t x^\infty}
    + \frac{\alpha^k w^k \|c\| M_2}{c^t x^k - c^t x^\infty} \tag{93}
\]
holds. Due to Lemma 3.13, we see that $\alpha^k$ is bounded below by a positive constant $\nu_2$, say. The second term can be bounded as
\[
\frac{\alpha^k \lambda \|X^k s^k\|}{c^t x^k - c^t x^\infty}
\ge \frac{\nu_2 \lambda \|X^k_G s^k_G\|}{c^t x^k - c^t x^\infty}
\ge \frac{\nu_2 \lambda\,\gamma(X^k_G s^k_G)\,\|x^k_G\|}{\|x^k_G\|\,(c^t x^k - c^t x^\infty)}. \tag{94}
\]
From Corollary 6.6, $\gamma(X^k_G s^k_G)/\|x^k_G\| \ge \nu_3 > 0$ holds for a constant $\nu_3$. Furthermore, from Proposition 6.3, we have
\[
c^t x^k - c^t x^\infty \le \|\tilde{s}^\infty_G\|\,\|x^k_G\| + w^k |r^t \tilde{y}^\infty|, \tag{95}
\]
from which, for sufficiently large $k$,
\[
\frac{c^t x^k - c^t x^\infty}{\|x^k_G\|}
\le \|\tilde{s}^\infty_G\| + |r^t \tilde{y}^\infty|\,\frac{w^k}{\|x^k_G\|}
\le 2\|\tilde{s}^\infty_G\| \tag{96}
\]
follows due to Lemma 6.1. Therefore, we have
\[
\frac{c^t x^{k+1} - c^t x^\infty}{c^t x^k - c^t x^\infty}
\le 1 - \frac{\nu_2 \lambda \nu_3}{2\|\tilde{s}^\infty_G\|}
   + \frac{\alpha^k w^k \|c\| M_2}{c^t x^k - c^t x^\infty}. \tag{97}
\]
By using Lemmas 6.2 and 6.1, we have
\[
\frac{w^k}{c^t x^k - c^t x^\infty}
= \frac{w^k\,\|x^k_G\|}{\|x^k_G\|\,(c^t x^k - c^t x^\infty)} \to 0. \tag{98}
\]
This and (97) produce
\[
\frac{c^t x^{k+1} - c^t x^\infty}{c^t x^k - c^t x^\infty}
\le 1 - \frac{\nu_2 \lambda \nu_3}{4\|\tilde{s}^\infty_G\|} \tag{99}
\]
for sufficiently large $k$, which completes the proof. □

The following lemma is easily shown by using the same argument as the proof of

Lemma 3.10, thus we omit the proof.


Lemma 6.8 There exists a positive constant $M$ such that
\[
\|\tilde{s}^k - \tilde{s}^\infty\| \le M\|x^k_G\|^2. \tag{100}
\]

Now we prove the strict complementarity of $x^\infty$ and $s^*$ in (73).

Theorem 6.9 $x^\infty$ and $s^*$ satisfy strict complementarity.

Proof: Suppose that there exists an index $j \in N$ such that $\tilde{s}^\infty_j = 0$ and $z^*_j = 0$. From Lemmas 3.10 and 6.8 and Proposition 6.4,
\[
s^k_j = \tilde{s}^k_j - (r^t y^k) z^k_j \le M_1\|x^k_G\|^2 + M_2\|x^k_F\|^2 \le 2M_1\|x^k_G\|^2 \tag{101}
\]
follows for sufficiently large $k$ and positive constants $M_1$ and $M_2$. Then we have
\[
\begin{aligned}
\frac{x^{k+1}_j}{x^k_j}
&= 1 - \alpha^k\left( \frac{\lambda x^k_j s^k_j}{\gamma(X^k s^k)} + \frac{w^k x^k_j z^k_j}{\|X^k z^k\|^2} \right) \\
&\ge 1 - \alpha^k\,\frac{2 x^k_j M_1 \|x^k_G\|^2}{\gamma(X^k_G s^k_G)} - \alpha^k w^k x^k_j M_3
   \quad \text{(use (101) and Corollary 3.11)} \\
&\ge 1 - M_4 x^k_j \|x^k_G\| - M_3 x^k_j w^k
   \quad \text{(use Corollary 6.6)} \\
&\ge 1 - M_5 x^k_j (c^t x^k - c^t x^\infty) - M_3 x^k_j w^k
   \quad \text{(use Lemma 6.2)}
\end{aligned} \tag{102}
\]
where $M_3$, $M_4$, and $M_5$ are positive constants. This implies that there exists a number $K$ such that the right-hand side of (102) is greater than $1/2$ for $k \ge K$. Now we have
\[
\sum_{k=K}^{l} \log\frac{x^{k+1}_j}{x^k_j}
\ge \sum_{k=K}^{l} \log\bigl( 1 - M_5 x^k_j (c^t x^k - c^t x^\infty) - M_3 x^k_j w^k \bigr)
\ge -2(\log 2) \sum_{k=K}^{l} \bigl( M_5 x^k_j (c^t x^k - c^t x^\infty) + M_3 x^k_j w^k \bigr). \tag{103}
\]
If we let $l \to \infty$, the left-hand side diverges to minus infinity while the right-hand side remains finite due to the linear convergence of $c^t x^k - c^t x^\infty$ and $w^k$. This is a contradiction, and we conclude that there is no index $j \in N$ such that $\tilde{s}^\infty_j = 0$ and $z^*_j = 0$.

This means that $\tilde{s}^\infty_j \ne 0$ for $j \in N - F$. We have, however, already proved in Lemma 5.6 that $\tilde{s}^\infty_j$ is not negative in this case. Therefore, we have $\tilde{s}^\infty_j > 0$ for $j \in N - F$, which shows the strict complementarity of $x^\infty$ and $s^*$. □

7 Main Theorems

Collecting the results obtained so far, we prove the main theorems.

Theorem 7.1 Let $\{x^k\}$ be a sequence generated by the algorithm with $\lambda(x^k) = \lambda\|X^k s^k\|/\gamma(X^k s^k)$, $\lambda \in (0,1)$, and $0 < \theta_{\min} \le \theta^k < 1$. Then the following situations occur depending on the feasibilities of the primal problem and the dual problem.

1. If both primal and dual are feasible ([C1]), $x^k$ converges to a relative interior point of the optimal face. In particular, if $\langle P \rangle$ has an interior feasible point, then $x^k$ becomes feasible in a finite number of iterations.

2. If primal is infeasible and dual is feasible ([C2]), $x^k$ converges to an infeasible point (more precisely, to the $x$ part of a relative interior point of the optimal face of the linear programming problem (3) to find a feasible point of (1)).

3. If primal is feasible and dual is infeasible ([C3]), $c^t x^k$ diverges to minus infinity.

4. If both primal and dual are infeasible ([C4]), there are two possibilities: $x^k$ converges to an infeasible point which is characterized in the analysis of [C2], or $c^t x^k$ diverges to minus infinity.

Proof: The former part of the first statement follows as a direct consequence of Theorems 5.4, 5.7 and 6.9. The latter part of statement 1 follows from Theorem 5.1. The second statement is readily seen from Theorem 4.1.

To see the third statement, we assume, for contradiction, that $\langle P \rangle$ is feasible, $\langle DP \rangle$ is infeasible, and $c^t x^k$ does not diverge to minus infinity. Then the sequence converges due to Theorem 3.8. If $\langle P \rangle$ has an interior feasible solution, this implies that $x^k$ is an interior feasible point for sufficiently large $k$. Then the well-known theory of the affine scaling algorithm under the nondegeneracy assumption (Assumption 2) [7, 10, 16] implies that $\{c^t x^k\}$ diverges to $-\infty$, contradicting the assumption that $\{c^t x^k\}$ is bounded. If $\langle P \rangle$ does not have an interior feasible solution, we see that $x^\infty$ has to be an optimal solution of $\langle P \rangle$ due to Theorems 5.4 and 5.7, which implies that $\langle DP \rangle$ has an optimal solution. But this contradicts the assumption that $\langle DP \rangle$ is infeasible.

Now we consider the last statement. It is enough to show that the sequence behaves as in the case [C2] if $c^t x^k$ is bounded. This result directly follows as a consequence of Theorems 3.8 and 4.1. □

Theorem 7.2 Under the same assumptions as Theorem 7.1, the following holds for the asymptotic behavior of the sequences $\{x^k\}$, $\{s^k\}$, and $\{z^k\}$.

1. If both primal and dual are feasible ([C1]) and primal has an interior feasible solution, $x^\infty$ and $s^\infty$ satisfy strict complementarity for $\langle P \rangle$.

2. If both primal and dual are feasible ([C1]) but primal does not have an interior feasible solution, then $\{z^k\}$ converges to the unique optimal solution of $\langle DF \rangle$, and for sufficiently large $M$, $x^\infty$ and every accumulation point of $\{s^k + Mz^k\}$ satisfy the strict complementarity condition for $\langle P \rangle$.

3. If primal is infeasible and dual is feasible ([C2]), then $\{z^k\}$ converges to the unique optimal solution of $\langle DF \rangle$.

Proof: The first statement readily follows from Theorem 5.1 and well-known convergence results of the affine scaling algorithm under the primal nondegeneracy assumption (see, e.g., Theorem 2.11 of [16]). The second statement follows from Theorem 5.4 if $F = N$, and otherwise from Theorem 6.9 by taking note of the definition of $s^*$, the relation $s^k = \tilde{s}^k - (r^t y^k) z^k$, and $z^k \in \mathrm{Im}(A^t)$. The third statement directly follows from Theorem 4.1. □

Acknowledgement

The authors wish to thank Prof. K. Tanabe of the Institute of Statistical Mathematics for stimulating discussions and his warm encouragement. This paper is a simplified version of the paper [17], which was originally submitted to this special issue. The authors thank the editors of this special issue, Prof. K. Anstreicher and Prof. F. Freund, and two anonymous referees for their valuable comments and suggestions in preparing this version.

References

[1] I. Adler, N. K. Karmarkar, M. G. C. Resende and G. Veiga, An implementation of Karmarkar's algorithm for linear programming, Math. Progr. 44 (1989) 297-335. (Errata in Math. Progr. 50 (1991) 415.)

[2] I. Adler, N. K. Karmarkar, M. G. C. Resende and G. Veiga, Data structures and programming techniques for the implementation of Karmarkar's algorithm, ORSA J. Comput. 1 (1989) 84-106.

[3] I. Adler and R. D. C. Monteiro, Limiting behavior of the affine scaling continuous trajectories for linear programming problems, Math. Progr. 50 (1991) 29-51.

[4] K. M. Anstreicher, A combined phase I-phase II projective algorithm for linear programming, Math. Progr. 43 (1989) 425-453.

[5] E. R. Barnes, A variation on Karmarkar's algorithm for solving linear programming problems, Math. Progr. 36 (1986) 174-182.

[6] I. I. Dikin, Iterative solution of problems of linear and quadratic programming, Sov. Math. Doklady 8 (1967) 674-675.

[7] I. I. Dikin, About the convergence of an iterative process (in Russian), Upravlyaemye Sistemi 12 (1974) 54-60.

[8] I. I. Dikin, The convergence of dual variables, Technical Report, Siberian Energy Institute, Irkutsk, Russia (1991).

[9] I. I. Dikin and V. I. Zorkalcev, Iterative solution of mathematical programming problems (algorithms for the method of interior points), Nauka, Novosibirsk, USSR (1980).

[10] C. C. Gonzaga, Convergence of the large step primal affine-scaling algorithm for primal non-degenerate linear programs, Technical Report, Department of Systems Engineering and Computer Sciences, COPPE-Federal University of Rio de Janeiro, Brazil (1990).

[11] L. Hall and R. Vanderbei, Two-thirds is sharp for affine scaling, Oper. Res. Lett. 13 (1993) 197-201.

[12] N. Karmarkar, A new polynomial-time algorithm for linear programming, Combinatorica 4, No. 4 (1984) 373-395.

[13] M. Kojima, N. Megiddo and S. Mizuno, A primal-dual infeasible-interior-point algorithm for linear programming, Math. Progr. 61 (1993) 263-280.

[14] W. Mascarenhas, The affine scaling algorithm fails for stepsize 0.999, Technical Report, Universidade Estadual de Campinas, Campinas S.P., Brazil (1993). (To appear in SIAM J. Optim.)

[15] S. Mizuno, Polynomiality of the Kojima-Megiddo-Mizuno infeasible interior point algorithm for linear programming, Technical Report 1006, School of Operations Research and Industrial Engineering, Cornell University, Ithaca, NY, USA (1992).

[16] R. D. C. Monteiro, T. Tsuchiya and Y. Wang, A simplified global convergence proof of the affine scaling algorithm, Ann. Oper. Res. 47, 443-482.

[17] M. Muramatsu and T. Tsuchiya, An affine scaling method with an infeasible starting point, Research Memorandum No. 490, The Institute of Statistical Mathematics, Tokyo, Japan (1993).

[18] R. Saigal, A simple proof of primal affine scaling method, Technical Report, Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI, USA (1993). (To appear in Ann. Oper. Res.)

[19] K. Tanabe, Feasibility-Improving Gradient-Acute-Projection methods: a unified approach to nonlinear programming, Lecture Notes in Numerical Application and Analysis 3 (1981) 57-76.

[20] M. J. Todd, Combining phase I and phase II in a potential reduction algorithm for linear programming, Technical Report No. 907, Cornell University, Ithaca, NY, USA.

[21] P. Tseng and Z. Q. Luo, On the convergence of the affine scaling algorithm, Math. Progr. 56 (1992) 301-319.

[22] T. Tsuchiya, Global convergence of the affine scaling methods for degenerate linear programming problems, Math. Progr. 52 (1991) 377-404.

[23] T. Tsuchiya, Global convergence property of the affine scaling method for the primal degenerate linear programming problems, Math. Oper. Res. 16 (1992) 527-557.

[24] T. Tsuchiya and R. D. C. Monteiro, Superlinear convergence of the affine-scaling algorithm, Technical Report, Center for Research on Parallel Computation, Rice University, Houston, USA (1992).

[25] T. Tsuchiya and M. Muramatsu, Global convergence of a long-step affine scaling algorithm for degenerate linear programming problems, Research Memorandum No. 423, The Institute of Statistical Mathematics, Tokyo, Japan (1992). (To appear in SIAM J. Optim. 5, No. 3 (1995).)

[26] R. J. Vanderbei, M. S. Meketon and B. A. Freedman, A modification of Karmarkar's linear programming algorithm, Algorithmica 1 (1986) 395-407.

[27] R. J. Vanderbei and J. C. Lagarias, I. I. Dikin's convergence result for the affine-scaling algorithm, Contemp. Math. 114, 109-119.

[28] Y. Zhang, On the convergence of an infeasible interior-point algorithm for linear programming and other problems, SIAM J. Optim. 4, No. 1 (1994) 208-227.