Newton's Method for Quadratic Stochastic Programs with Recourse¹

Xiaojun Chen, Liqun Qi and Robert S. Womersley
School of Mathematics, University of New South Wales
P.O. Box 1, Kensington NSW 2033, Australia
(March 1994)

Abstract. Quadratic stochastic programs (QSP) with recourse can be formulated as nonlinear convex programming problems. By attaching a Lagrange multiplier vector to the nonlinear convex program, a QSP is written as a system of nonsmooth equations. A Newton-like method for solving the QSP is proposed, and global convergence and local superlinear convergence of the method are established. The current method is more general than previous methods, which were developed for box-diagonal and fully quadratic QSP. Numerical experiments are given to demonstrate the efficiency of the algorithm, and to compare the use of Monte-Carlo rules and lattice rules for multiple integration in the algorithm.

Keywords: Newton's method, quadratic stochastic programs, nonsmooth equations.

Short title: Newton's method for stochastic programs

¹ This work is supported by the Australian Research Council.
1. Introduction

Let P ∈ R^{n×n} be symmetric positive semi-definite and H ∈ R^{m×m} be symmetric positive definite. We consider two-stage quadratic stochastic programs with fixed recourse [19, 20]

    min_{x ∈ R^n}  (1/2) xᵀPx + cᵀx + Ψ(x)
    subject to  Ax ≤ b,                                            (1.1)

where
    Ψ(x) = ∫_Ω ψ(x, ω) ρ(ω) dω
and
    ψ(x, ω) = max_{z ∈ R^m}  −(1/2) zᵀHz + zᵀ(h(ω) − Tx)
              subject to  Wz ≤ q.

Here c ∈ R^n, A ∈ R^{r×n}, b ∈ R^r, T ∈ R^{m×n}, q ∈ R^{m₁} and W ∈ R^{m₁×m} are fixed matrices, ω ∈ R^{m₂} is a random vector with support Ω ⊆ R^{m₂}, ρ is a probability density function on R^{m₂} and h(·) ∈ R^m is a random vector.

By introducing a new variable y, an equivalent form of (1.1) is

    min_{x ∈ R^n, y ∈ R^m}  (1/2) xᵀPx + cᵀx + Φ(y)
    subject to  Ax ≤ b,                                            (1.2)
                Tx − y = 0,

where
    Φ(y) = ∫_Ω g(y, ω) ρ(ω) dω,
    g(y, ω) = max_{z ∈ R^m}  −(1/2) zᵀHz + zᵀ(h(ω) − y)
              subject to  Wz ≤ q.

Since H is symmetric positive definite, the function Φ is convex and once continuously differentiable. Calculating Φ involves multi-dimensional integrals and quadratic programs. Problem (1.2) is useful because it is a convex program in which the computational difficulties occur primarily in the evaluation of Φ (m variables), and usually m ≪ n. See [7, 8].
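The two-stage structure above can be made concrete with a small numerical sketch. The fragment below is illustrative only: it assumes the box-diagonal special case discussed later (diagonal H and a box Z), where the second-stage maximizer has a closed form as a componentwise clip; the instance data (H_diag, the bounds, and h(ω) = ω) are hypothetical.

```python
import numpy as np

# Hypothetical box-diagonal instance: with H diagonal and Z a box, the
# maximizer of -0.5 z^T H z + z^T (h(w) - y) over Z is the componentwise
# clip of the unconstrained maximizer H^{-1}(h(w) - y).
H_diag = np.array([2.0, 1.0])                            # diagonal of H (positive)
lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])    # box Z

def g(y, w):
    """Second-stage value g(y, w); here h(w) = w."""
    v = w - y
    z = np.clip(v / H_diag, lo, hi)                      # maximizer z*(y, w)
    return -0.5 * z @ (H_diag * z) + z @ v

def Phi_hat(y, samples):
    """Equal-weight sample average approximating Phi(y), cf. (1.3)."""
    return np.mean([g(y, w) for w in samples])
```

When the unconstrained maximizer is interior to Z, the value reduces to (1/2)vᵀH⁻¹v, which gives a quick consistency check on the sketch.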
Since it is impossible to demand the exact evaluation of the function Φ and its gradient, we consider approximate problems of the form

    min_{x ∈ R^n, y ∈ R^m}  (1/2) xᵀPx + cᵀx + f(y)
    subject to  Ax ≤ b,                                            (1.3)
                Tx − y = 0,

where
    f(y) = Σ_{i=1}^N β_i g(y, ω_i) ρ̄(ω_i),
    g(y, ω_i) = max_{z ∈ R^m}  −(1/2) zᵀHz + zᵀ(h(ω_i) − y)
                subject to  Wz ≤ q.

The function ρ̄ involves ρ and a transformation used to go from the integral on Ω to the unit cube [0, 1]^{m₂}. The weights {β_i}_{i=1}^N and points {ω_i}_{i=1}^N are generated by a multidimensional numerical integration rule; the β_i, ω_i are independent of y. In Section 4 we discuss both Monte-Carlo methods and lattice methods for approximating Φ. The aim is to develop methods which are applicable to problems where the dimension m₂ of the integral is large (≥ 5). Lattice methods [21, 4] are promising methods for multi-dimensional integration, and to our knowledge have not been used in stochastic programming before. In both Monte-Carlo and lattice methods equal weights are chosen, that is,

    β_i = 1/N,   i = 1, 2, ..., N.

Let
    X = {x ∈ R^n | Ax ≤ b}  and  Z = {z ∈ R^m | Wz ≤ q}
be nonempty polyhedra. Since H is symmetric positive definite, f is a differentiable convex function defined on the whole space R^m.

Problem (1.3) can be considered as an Extended Linear-Quadratic Programming (ELQP) problem as introduced by Rockafellar and Wets [19, 20]. If both P and H are positive definite, the problem is called fully quadratic. If both P and H are diagonal, and both X and Z are box regions defined by simple lower and upper bounds on the variables, the problem is called box-diagonal. Several numerical methods have been developed for solving fully quadratic and box-diagonal ELQP problems [14, 18-20, 23, 24]. However, most of these methods are less efficient in the general case; even when the problems are fully quadratic, only linear convergence rates were established. Recently, Qi and Womersley [15] presented a sequential quadratic programming method for the box-diagonal case and showed that the rate of convergence of their method is superlinear. Although the algorithm in [15] is not in principle restricted to the box-diagonal case, it used explicit expressions for derivative information which are only available in the box-diagonal case. When Z is a box, the problem corresponds to the simple recourse problem, which is relatively easy. The general case is typically very hard [5, 12].

In [15, 18-20, 24] the dual objective is evaluated to provide a duality gap for a stopping criterion. The current algorithm does not evaluate the dual objective, as this would involve the solution of a first-stage quadratic programming problem with a potentially large number of variables.

In this paper we present a new method which is efficient in the general case and realizes both global convergence and superlinear convergence. Furthermore, we do not need to calculate the dual problem.

Assume that there exists an optimal solution (x*, y*) of (1.3). According to Theorem 28.2 in [17], there exist optimal Lagrange multiplier vectors μ* ∈ R^r and p* ∈ R^m associated with the constraints Ax ≤ b and Tx − y = 0, respectively. From the Kuhn-Tucker conditions for (1.3), we then have

    0 = Px* + c + Aᵀμ* + Tᵀp*,
    0 = ∇f(y*) − p*,
    0 = Tx* − y*,
    Ax* ≤ b,  μ* ≥ 0,  μ*ᵀ(Ax* − b) = 0.

Let λ = (x, y, μ, p) and M = n + 2m + r. Then the Kuhn-Tucker conditions for (1.3) can be stated as the following system of nonsmooth equations in the variable λ [10]:

    F(λ) ≡ ( Px + c + Aᵀμ + Tᵀp
             ∇f(y) − p
             min(b − Ax, μ)
             Tx − y )  =  0,                                       (1.4)

where "min" denotes the componentwise minimum operator on a pair of vectors. The nonsmooth function F is a mapping from R^M into itself.
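The residual (1.4) is straightforward to assemble once ∇f is available. The sketch below builds F(λ) for a toy one-dimensional instance (all data hypothetical); the quadratic f(y) = (1/2)y² stands in for the sampled recourse term so that the Kuhn-Tucker point can be written down by hand.

```python
import numpy as np

def F(x, y, mu, p, P, c, A, b, T, grad_f):
    """Nonsmooth residual of the KKT system (1.4)."""
    r1 = P @ x + c + A.T @ mu + T.T @ p    # stationarity in x
    r2 = grad_f(y) - p                     # stationarity in y
    r3 = np.minimum(b - A @ x, mu)         # complementarity, componentwise min
    r4 = T @ x - y                         # linking constraint Tx - y = 0
    return np.concatenate([r1, r2, r3, r4])

# Toy instance: min x^2 - 2x + 0.5 y^2  s.t.  0*x <= 1, x - y = 0,
# whose Kuhn-Tucker point is x* = y* = p* = 2/3, mu* = 0.
P = np.array([[2.0]]); c = np.array([-2.0])
A = np.array([[0.0]]); b = np.array([1.0]); T = np.array([[1.0]])
res = F(np.array([2/3]), np.array([2/3]), np.array([0.0]),
        np.array([2/3]), P, c, A, b, T, lambda y: y)
```

At the Kuhn-Tucker point all four blocks of the residual vanish, which is exactly the condition F(λ) = 0.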
If λ* is a solution of (1.4), then the Kuhn-Tucker conditions for (1.3) hold at (x*, y*). Since (1.3) is a convex program and the objective and the constraints of (1.3) are C¹ functions, by Theorem 9.4.2 in [3], (x*, y*) is a global solution of (1.3).

In Section 2, we show that F is globally Lipschitz in R^M. By Rademacher's theorem, F is differentiable almost everywhere in R^M. Let D_F be the set where F is differentiable. We use the generalized Jacobian defined in [13]:

    ∂_B F(λ) = { lim_{λ_k → λ, λ_k ∈ D_F} ∇F(λ_k) }.               (1.5)

We give a V ∈ ∂_B F(λ) for λ ∈ R^M by using Pang's results on the projection function [10].

In Section 3, we use the generalized Jacobian V ∈ ∂_B F(λ) given in Section 2 to present an algorithm. The algorithm realizes global convergence and superlinear local convergence. Let ||·|| denote the Euclidean norm.

When we get a solution of (1.3), two basic questions are raised: how good an estimate is the optimal value of (1.3) for the optimal value of (1.2), and how well do the solutions of (1.3) approximate the solutions of (1.2) [6]? In Section 4, while solving (1.3) by our algorithm, we consider which integration rule can provide a sharper estimate for |Φ(y) − f(y)| and minimize the number of integrand evaluations. We give numerical experiments to demonstrate the efficiency of our algorithm and to compare the use of Monte-Carlo rules and lattice rules for multiple integration in the algorithm.

2. Generalized Jacobians

In this section we show that ∇f is the sum of projection functions, so F is globally Lipschitz on R^M. The Lipschitz property of F can be viewed as a consequence of the results of [11]. However, the following results emphasize the special structure in terms of projection functions which is used in Theorem 2.1 and in the algorithm. Furthermore, we give an element of the generalized Jacobian ∂_B F(λ) for λ ∈ R^M.

Proposition 2.1. Let ω_i ∈ R^{m₂} be fixed, i = 1, 2, ..., N,

    Q_i(y) = − argmax_z { −(1/2) zᵀHz + (h(ω_i) − y)ᵀz : z ∈ Z }

and
    Q(y) = (1/N) Σ_{i=1}^N Q_i(y) ρ̄(ω_i).

Then Q(y) = ∇f(y).

Proof. Since H is positive definite, for any y ∈ R^m, ω_i ∈ R^{m₂} there exists a unique z*(y, ω_i) such that z*(y, ω_i) = argmax{ −(1/2) zᵀHz + (h(ω_i) − y)ᵀz : z ∈ Z }. By Theorem 27.1 in [17] on convex conjugate functions,

    g(y, ω_i) = max_z { −(1/2) zᵀHz + (h(ω_i) − y)ᵀz : z ∈ Z }

is differentiable at y and

    ∂g(y, ω_i)/∂y = −z*(y, ω_i).

Hence we have

    ∇f(y) = (1/N) Σ_{i=1}^N (∂g/∂y)(y, ω_i) ρ̄(ω_i) = −(1/N) Σ_{i=1}^N z*(y, ω_i) ρ̄(ω_i)
          = (1/N) Σ_{i=1}^N Q_i(y) ρ̄(ω_i) = Q(y).   □

Let φ(x, μ) = min(b − Ax, μ). Using the construction in [13], we can give an element of the generalized Jacobian ∂_B φ(x, μ). Define

    Ã(x, μ) = ( ã_1(x, μ); ... ; ã_r(x, μ) ) ∈ R^{r×n}   and   Λ(x, μ) = ( Λ_1(x, μ); ... ; Λ_r(x, μ) ) ∈ R^{r×r},

where
    ã_i(x, μ) = −a_i  if (b − Ax)_i ≤ μ_i,  and 0 otherwise,
    Λ_i(x, μ) = e_i   if μ_i < (b − Ax)_i,  and 0 otherwise,

for i = 1, 2, ..., r. Here a_i is the i-th row of A and e_i is the i-th row of the identity matrix I ∈ R^{r×r}. Then (Ã(x, μ), Λ(x, μ)) ∈ ∂_B φ(x, μ).

Definition 2.1. Let S be a nonempty, closed and convex subset of R^m. The projection of u ∈ R^m onto the set S, denoted Π_S(u), is the unique solution of

    min_{s ∈ S} ||u − s||.

Let S be the polyhedron

    S = {s ∈ R^m | Bs ≤ q},

where B ∈ R^{m₁×m}. For an arbitrary vector u, let B̄_u denote the submatrix of B comprising the rows that correspond to the active inequalities Bs ≤ q at the vector Π_S(u). Define the polyhedral cone

    S(u) = { s | (Π_S(u) − u)ᵀs = 0, B̄_u s ≤ 0 }

and the lineality space of S(u),

    L(u) = S(u) ∩ (−S(u)).

We summarize properties of the projection function which are used in this paper.

Lemma 2.1.
(i) [16] Π_S is a contraction, i.e., it is Lipschitz with modulus 1.
(ii) [10] Π_S is everywhere directionally differentiable along any direction and Π′_S(u; d) = Π_{S(u)}(d). Furthermore, for any vector h with ||h|| sufficiently small,

    Π_S(u + h) = Π_S(u) + Π_{S(u)}(h).

(iii) [10] Π_S is F-differentiable at u if and only if B̄_u Π_{S(u)} is identically zero; in this case, ∇Π_S(u) = Π_{L(u)}. (Here B̄_u Π_{S(u)} denotes the composite map (B̄_u Π_{S(u)})(d) = B̄_u(Π_{S(u)}(d)).)

By Lemma 2.1 and Rademacher's theorem, Π_S is differentiable almost everywhere. Let D_{Π_S} be the set where Π_S is differentiable.
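Lemma 2.1(i) can be checked numerically. The fragment below uses a box for S (so the projection is a componentwise clip, a special case of the polyhedra considered here) and verifies the modulus-1 Lipschitz property on random pairs; the particular set and sample size are arbitrary choices for the sketch.

```python
import numpy as np

lo, hi = -np.ones(3), np.ones(3)
proj = lambda u: np.clip(u, lo, hi)   # projection onto the box S = [-1, 1]^3

# Lemma 2.1(i): ||Pi_S(u) - Pi_S(v)|| <= ||u - v|| for all u, v
rng = np.random.default_rng(0)
ok = all(
    np.linalg.norm(proj(u) - proj(v)) <= np.linalg.norm(u - v) + 1e-12
    for u, v in (5 * rng.normal(size=(2, 3)) for _ in range(1000))
)
```

The small tolerance only guards against floating-point rounding; the inequality itself holds exactly for any closed convex S.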
Theorem 2.1. Let B = WH^{−1/2}. Then

    S = {s ∈ R^m | Bs ≤ q} = {s | s = H^{1/2} z, z ∈ Z}

and
(i) Q_i is differentiable almost everywhere;
(ii) Q_i is F-differentiable at y if and only if B̄_u Π_{S(u)} is identically zero, where u = H^{−1/2}(h(ω_i) − y);
(iii) ∂_B Q_i(y) = H^{−1/2} { lim_{u_k → u, u_k ∈ D_{Π_S}} ∇Π_S(u_k) } H^{−1/2} = H^{−1/2} { lim_{u_k → u, u_k ∈ D_{Π_S}} Π_{L(u_k)} } H^{−1/2}, where u_k = H^{−1/2}(h(ω_i) − y_k);
(iv) assume that W has full row rank, and let

    U_i(y) = { H^{−1}                         if H^{−1/2} u_i ∈ int Z,
               H^{−1/2} Π_{L(u_i)} H^{−1/2}   otherwise },

where u_i = H^{−1/2}(h(ω_i) − y). Then U_i(y) ∈ ∂_B Q_i(y). Furthermore, F(λ) is globally Lipschitz in R^M and

    V_λ = ( P         0      Aᵀ        Tᵀ
            0         U(y)   0         −I
            Ã(x, μ)   0      Λ(x, μ)   0
            T         −I     0         0 )  ∈ ∂_B F(λ),           (2.1)

where
    U(y) = (1/N) Σ_{i=1}^N U_i(y) ρ̄(ω_i) ∈ ∂_B Q(y).              (2.2)

Proof. Let ȳ be a vector in R^m and let v = h(ω_i) − ȳ. Then z̄ = argmax{ −(1/2)zᵀHz + zᵀv : z ∈ Z } if and only if for any z ∈ Z,

    (z − z̄)ᵀ(v − Hz̄) ≤ 0.

Let s = H^{1/2} z. Then s̄ = H^{1/2} z̄ if and only if for any s ∈ S,

    (s − s̄)ᵀ(H^{−1/2} v − s̄) ≤ 0.

Let u = H^{−1/2} v. By Definition 2.1, s̄ is the projection of u onto S, i.e. s̄ = Π_S(u). Since H^{1/2} z̄ = s̄ = Π_S(u) = Π_S(H^{−1/2} v), we have

    Q_i(ȳ) = −z̄ = −H^{−1/2} Π_S(H^{−1/2}(h(ω_i) − ȳ)).

Furthermore, if u ∈ D_{Π_S}, then Q_i is differentiable at ȳ and

    ∇Q_i(ȳ) = H^{−1/2} ∇Π_S(H^{−1/2}(h(ω_i) − ȳ)) H^{−1/2} = H^{−1/2} Π_{L(u)} H^{−1/2}.

Hence (i)-(iii) follow from Lemma 2.1.

(iv) Since ∇f = Q is the sum of projection functions, ∇f is globally Lipschitz in R^m. Since φ(x, μ) is the componentwise minimum operator on a pair of linear functions, φ is globally Lipschitz in R^n × R^r. Therefore F is globally Lipschitz in R^M. By Rademacher's theorem, F is differentiable almost everywhere in R^M, hence we can define the generalized Jacobian ∂_B F(λ) for any λ ∈ R^M. Now we prove (2.1).

If z̄ = H^{−1}v ∈ int Z ≠ ∅, then u = H^{−1/2}v ∈ int S and B̄_u does not exist. Hence Q_i is differentiable at ȳ and L(u) = R^m. This implies ∇Π_S(u) = I and ∇Q_i(ȳ) = H^{−1}.

Now we consider the case H^{−1}v ∉ int Z. For any u ∈ R^m, Π_{L(u)} is the projection from R^m onto the null space L(u) of the matrix

    ( (Π_S(u) − u)ᵀ
      B̄_u ).

It is sufficient to prove that for any u ∈ R^m,

    Π_{L(u)} ∈ { lim_{u_k → u, u_k ∈ D_{Π_S}} Π_{L(u_k)} },

that is, there is a sequence {u_k} ⊆ D_{Π_S} such that

    Π_{L(u)} = lim_{k→∞} Π_{L(u_k)}.                               (2.3)

Let J₀ = {i₁, i₂, ..., i_{j₀}} be the index set such that (Bu)_{i_j} = q_{i_j}, j = 1, 2, ..., j₀. Since B has full row rank, there is an n×m matrix E such that BE(BE)ᵀ = I. Without loss of generality we may assume that BBᵀ = I.

Consider the case j₀ = 0. In this case, there exists a neighborhood N_u of u such that for any u + h ∈ N_u, B̄_u = B̄_{u+h}. Let s̄ ∈ S(u). Then there is a small positive number δ such that h̄ = δs̄ ∈ S(u) and u + h̄ ∈ N_u. Furthermore, for any s ∈ S(u),

    (Π_S(u + h̄) − (u + h̄))ᵀ s = (Π_S(u) + Π_{S(u)}(h̄) − (u + h̄))ᵀ s = (Π_{S(u)}(h̄) − h̄)ᵀ s = (h̄ − h̄)ᵀ s = 0.

On the other hand, for any s ∈ S(u + h̄),

    (Π_S(u) − u)ᵀ s = (Π_S(u + h̄) − Π_{S(u)}(h̄) − (u + h̄ − h̄))ᵀ s = (−Π_{S(u)}(h̄) + h̄)ᵀ s = (−h̄ + h̄)ᵀ s = 0.

Hence we have S(u) = S(u + h̄). By Lemma 2.1, we have

    Π_S(u + h̄) = Π_S(u) + Π_{S(u)}(h̄)   and   Π_S(u) = Π_S(u + h̄) + Π_{S(u+h̄)}(−h̄).

Hence
    h̄ = Π_{S(u)}(h̄) = −Π_{S(u+h̄)}(−h̄) = −Π_{S(u)}(−h̄).

Thus −h̄ = −δs̄ ∈ S(u). Therefore we must have B̄_u s̄ = 0. Since s̄ is an arbitrary element of S(u), we have B̄_u Π_{S(u)} ≡ 0. By Lemma 2.1, Π_S is differentiable at u, so that ∇Π_S(u) = Π_{L(u)}.

Now we consider the case j₀ ≥ 1. Take a sequence {u_k} satisfying

    u_k = argmin_ξ { ||ξ − u|| : (Bξ)_i = (q_k)_i, i ∈ J₀ },

where
    (q_k)_i = { q_i        if i ∉ J₀,
                q_i + ε_k  if i ∈ J₀ },

and ε_k > 0. The assumption that W has full row rank ensures that the set {ξ | (Bξ)_i = (q_k)_i, i ∈ J₀} is not empty. Clearly B̄_{u_k} = B̄_u and (Bu_k)_i ≠ q_i, i = 1, ..., m₁. Hence {u_k} ⊆ D_{Π_S} and u_k → u as ε_k → 0. Furthermore, since Π_S is globally Lipschitz, lim_{u_k→u} Π_S(u_k) = Π_S(u); we have (2.3) and H^{−1/2} Π_{L(u)} H^{−1/2} ∈ ∂_B Q_i(ȳ).

Now we prove (2.2). Since Q_i, i = 1, 2, ..., N, are piecewise smooth, each Q_i is almost everywhere differentiable. If all but at most one of the Q_i are differentiable at y, then (2.2) holds. Hence it suffices to prove the case where two functions Q_{i₁} and Q_{i₂} are nondifferentiable at y; the general case follows by induction. Let u₁ = H^{−1/2}(h(ω_{i₁}) − y) and u₂ = H^{−1/2}(h(ω_{i₂}) − y). Let B̃₁ ∈ R^{j₁×m} and B̃₂ ∈ R^{j₂×m} denote the submatrices of B comprising the rows where Bu₁ = q and Bu₂ = q, respectively. Since B has full row rank, any row of (B̃₁; B̃₂) is linearly independent of the others or equal to one of them. Hence for any ε_k > 0 there is a Δu_k such that (B̃₁; B̃₂)Δu_k = ε_k ẽ, ẽ = (1, 1, ..., 1) ∈ R^{j₁+j₂}. Hence we may choose a sequence {Δu_k} such that Δu_k → 0 as k → ∞ and (B(u₁ + Δu_k))_i ≠ q_i and (B(u₂ + Δu_k))_i ≠ q_i, i = 1, ..., m₁. This implies

    Π_{L(u₁+Δu_k)} + Π_{L(u₂+Δu_k)} = ∇Π_S(u₁ + Δu_k) + ∇Π_S(u₂ + Δu_k) = ∇(Π_S(u₁ + Δu_k) + Π_S(u₂ + Δu_k))

and
    H^{−1/2} Π_{L(u₁)} H^{−1/2} + H^{−1/2} Π_{L(u₂)} H^{−1/2} ∈ ∂_B(Q_{i₁}(y) + Q_{i₂}(y)).   □

Remark 2.1. Let

    B̃_u = ( (Π_S(u) − u)ᵀ
            B̄_u ).

Then Π_{L(u)} is the projection from R^m onto the null space N(B̃_u) of B̃_u. We can compute Π_{L(u)} by QR decomposition. Let B̃_u be an l×m matrix with rank(B̃_u) = r > 0. Let B̃_uᵀ = QR be a QR decomposition of B̃_uᵀ, where Q = (Q₁, Q₂) is an orthogonal matrix of order m×m (Q₁ ∈ R^{m×r}, Q₂ ∈ R^{m×(m−r)}) and R = (R̄ᵀ, 0)ᵀ is an upper triangular matrix of order m×l (R̄ ∈ R^{r×l}). Then we have

    Π_{L(u)} = Q₂Q₂ᵀ.

We can also obtain the projection from the singular value decomposition.

Definition 2.2 [13]. Let Θ : R^m → R^m be a locally Lipschitzian mapping. We say that Θ is semismooth at y if

    lim_{U ∈ ∂_B Θ(y + t d′), d′ → d, t ↓ 0} { U d′ }

exists for any d ∈ R^m.
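The QR construction of Π_{L(u)} = Q₂Q₂ᵀ in Remark 2.1 translates directly into code. The sketch below computes the orthogonal projector onto the null space of a given matrix B̃_u (here an arbitrary full-row-rank 2×3 example, not data from the paper) and checks the projector identities; `numpy.linalg.qr` with `mode='complete'` supplies the full orthogonal factor Q = (Q₁, Q₂).

```python
import numpy as np

def nullspace_projector(Bt):
    """Projector onto N(Bt) via the QR factorization Bt^T = QR (Remark 2.1).

    Assumes Bt has full row rank l, so the last m - l columns of the
    complete orthogonal factor Q span the null space of Bt.
    """
    l, m = Bt.shape
    Q, _ = np.linalg.qr(Bt.T, mode='complete')  # Q is m x m orthogonal
    Q2 = Q[:, l:]                               # columns spanning N(Bt)
    return Q2 @ Q2.T

Bt = np.array([[1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0]])                # rank 2, null space dim 1
Pi = nullspace_projector(Bt)
```

The result satisfies B̃_u Π = 0, Π² = Π and Πᵀ = Π, the defining properties of the orthogonal projector onto N(B̃_u).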
Proposition 2.2. If H^{−1}(h(ω_i) − y) ∉ Z, then at least one matrix in ∂_B Q_i(y) is singular.

Proof. By Lemma 2.1, Q_i is everywhere directionally differentiable along any direction and Π′_S(u; d) = Π_{S(u)}(d). Let u = H^{−1/2}(h(ω_i) − y). Then u ∉ S. Obviously, 0 ∈ S(u). Let d = Π_S(u) − u. Then d ≠ 0 and sᵀd = 0 for any s ∈ S(u). Hence

    Π′_S(u; d) = Π_{S(u)}(d) = 0.

Since Π_S is piecewise smooth, Π_S is semismooth. By Lemma 2.1 in [13], there exists a U ∈ ∂_B Π_S(u) such that Π′_S(u; d) = Ud = 0. This implies that U is singular and that H^{−1/2} U H^{−1/2} ∈ ∂_B Q_i(y) is singular.   □

3. Algorithm and Convergence

Using the generalized Jacobian given in Theorem 2.1, we can solve problem (1.3) via the nonsmooth equations (1.4) by two-stage methods which realize global convergence and superlinear local convergence [13]. In this paper, we consider Newton's method (the sequential quadratic programming method) (cf. [11, 14, 15]).

Let u = (x, y)ᵀ and d = (d_x, d_y)ᵀ. Denote the objective function in (1.3) by

    θ(u) = (1/2) xᵀPx + cᵀx + f(y)

and a quadratic function which approximates θ(u + d) by

    θ_k(u_k + d) = ∇θ(u_k)ᵀd + (1/2) d_xᵀ(P + α_k I) d_x + (1/2) d_yᵀ U(y_k) d_y,

where α_k is a scalar.

Algorithm 3.1. Choose σ, γ ∈ (0, 1), ε > 0 and x₀ ∈ X. Let y₀ = Tx₀. For k ≥ 0: let α_k > 0.

1. Solve the quadratic program

    minimize_d  θ_k(u_k + d)
    subject to  A(x_k + d_x) ≤ b,                                  (3.1)
                T d_x − d_y = 0.

Let d_k be the unique optimal solution of (3.1).

2. Let μ_k and p_k be the Lagrange multipliers at the solution of (3.1) corresponding to A(x_k + d_x) ≤ b and T d_x = d_y, respectively. Let λ̃_k = (u_k + d_k, μ_k, p_k)ᵀ. Calculate F(λ̃_k). If ||F(λ̃_k)|| ≤ ε, stop; otherwise go to step 3.

3. Let i_k be the minimum integer i ≥ 0 such that

    θ(u_k + γ^i d_k) ≤ θ(u_k) + (σ/2) γ^i ∇θ(u_k)ᵀ d_k.

Let u_{k+1} = u_k + γ^{i_k} d_k.

To consider the uniqueness of the solution of (1.3) and the superlinear convergence rate of Algorithm 3.1, we use the definition of B-regularity from [11]. Let θ̂(x) = θ(x, Tx) = (1/2)xᵀPx + cᵀx + f(Tx). Then the function θ̂ is B-regular at x* if for any V(Tx*) ∈ ∂_B Q(Tx*), the matrix P + TᵀV(Tx*)T is positive definite. Clearly, if P is positive definite, so that problem (1.3) is fully quadratic, then θ̂ is B-regular at any point x ∈ R^n.

The following convergence theorem is based on the papers [11, 14, 15]. Paper [14] proved the local superlinear convergence of an approximate Newton method (cf. Algorithm 3.1 without step 3 and with i_k = 0) for solving LC¹ optimization problems. Paper [15] was an application of [14] to the ELQP problem. Paper [11] was a globalization of such a Newton method for LC¹ minimization problems.

To consider the convergence of Algorithm 3.1, we set ε = 0 in the algorithm.

Theorem 3.1. (i) (Global convergence) Let ᾱ and c₀ be positive scalars. Suppose that α_k ∈ [0, ᾱ] is such that the smallest eigenvalue of P + TᵀU(y_k)T + α_k I is greater than c₀ for all large k. Then every accumulation point of the sequence {u_k} produced by Algorithm 3.1, if it exists, is an optimal solution of (1.3).

(ii) (Local superlinear convergence) Suppose that the sequence {u_k} produced by Algorithm 3.1 has an accumulation point u*, U(y_k) ∈ ∂_B Q(y_k) and θ̂ is B-regular at x*. Then

(a) u* is the unique optimal solution of (1.3);

(b) if {α_k} is a sequence of positive scalars converging to zero, then there exists an integer k₀ such that for all k ≥ k₀, i_k = 0, and the sequence {u_k} converges to u* at least Q-superlinearly, i.e.

    lim_{k→∞} ||u_{k+1} − u*|| / ||u_k − u*|| = 0.                 (3.2)

Proof. By Theorem 2.1, the function θ is Fréchet-differentiable in R^{n+m} and the gradient function ∇θ is globally Lipschitz in R^{n+m}. Hence (1.3) is an LC¹ minimization problem. The quadratic program (3.1) is equivalent to

    minimize_{d_x}  (Px_k + c + Tᵀ∇f(y_k))ᵀd_x + (1/2) d_xᵀ(P + α_k I + TᵀU(y_k)T) d_x
    subject to  A(x_k + d_x) ≤ b.

Hence d_k = (d_{x_k}, T d_{x_k})ᵀ is the unique optimal solution of (3.1). Furthermore, for any d_x ∈ R^n,

    d_xᵀ(P + α_k I + TᵀU(y_k)T) d_x ≥ c₀ ||d_x||².

By Theorem 2 in [11], (i) holds.

Let Γ(u) = ∇θ(u). By Theorem 2.1 and Lemma 2.1,

    Γ′(u; d) = ( P d_x,  (1/N) Σ_{i=1}^N H^{−1/2} Π_{S(u_i)}(H^{−1/2} d_y) ρ̄(ω_i) ),

where u_i = H^{−1/2}(h(ω_i) − y). Hence Γ′(u; ·) is Lipschitzian. By Lemma 2.1 in [13], for any u ∈ R^{n+m}, there is a V ∈ ∂_B Γ(u) such that

    Γ′(u; d) = V d.

Hence we can show that u* is the unique solution of (1.3) by the technical details of Theorem 2.2 in [15].

Since the projection function Π_S is piecewise smooth, Π_S is semismooth. Hence ∇θ is semismooth. By Theorem 3 in [11], we have (b).   □

Remark 3.1. Since Q is piecewise smooth, U(y) ∈ ∂_B Q(y) holds almost everywhere in R^m. If W has full row rank, then U(y) ∈ ∂_B Q(y) holds. However, in general, even if Z is a box, there can be a degenerate case where U(y) ∉ ∂_B Q(y).

4. Numerical Experiments

In this section, we give three examples. The first example is chosen so that the points where g has nonsmooth first derivatives can be determined analytically. The second example demonstrates the efficiency of Algorithm 3.1 for problem (1.3). The third example tests Algorithm 3.1 with the use of Monte-Carlo methods and lattice methods. The numerical experiments were carried out using Matlab on a DEC 5000 workstation.

Example 1. Consider problem (1.1) in which X = R², H = I ∈ R^{2×2}, T = −H,

    P = ( 2 0
          0 0 ),   W = ( 1  1
                         2 −1 ),   q = ( 3
                                         3 ).

The point x̃* = (3, 1)ᵀ is made the optimal solution of (1.3) by taking ỹ* = T x̃* and

    c = −P x̃* − (1/N) Tᵀ Σ_{i=1}^N ∇g(ỹ*, ω_i) ρ̄(ω_i) = ( −9.02902247175663
                                                            −0.98984189153187 ),

where the probability density is the normal density, h(ω) = ω, and the points {ω_i}_{i=1}^N are generated by the lattice rule in R² with N = 80,044.

Let v = ω − Tx. The value of the function

    g(v) = max_z { −(1/2) zᵀHz + vᵀz : z ∈ Z }

is attained at the point

    z*(v) = { v                        if v ∈ R₁ = {v | Wv ≤ q},
              (2, 1)ᵀ                  if v ∈ R₂ = {v | W̄v ≥ q̄},
              v − (c₁/||w₁||²) w₁      if v ∈ R₃ = {v | (Wv)₁ ≥ q₁ & (W̄v)₁ ≤ q̄₁},
              v − (c₂/||w₂||²) w₂      if v ∈ R₄ = {v | (Wv)₂ ≥ q₂ & (W̄v)₂ ≤ q̄₂} },

where c = Wv − q, w₁ and w₂ are the rows of W, and

    W̄ = ( 1 −1
           1  2 )   and   q̄ = ( 1
                                4 ).
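Because H = I in Example 1, z*(v) is simply the Euclidean projection of v onto Z, and the four regions above can be reproduced by enumerating the candidate active sets of the two constraints. The following sketch does this for the specific W and q of Example 1; it is only a check of the closed-form regions, not the algorithm of Section 3.

```python
import numpy as np

W = np.array([[1.0, 1.0],
              [2.0, -1.0]])
q = np.array([3.0, 3.0])

def z_star(v):
    """Projection of v onto Z = {z : Wz <= q}, i.e. the maximizer of
    -0.5||z||^2 + v^T z, by enumerating active sets of the 2 constraints."""
    cands = [v]                                          # no active constraint
    for i in range(2):                                   # one active constraint
        w = W[i]
        cands.append(v - max(w @ v - q[i], 0.0) / (w @ w) * w)
    cands.append(np.linalg.solve(W, q))                  # both active: the vertex
    feas = [z for z in cands if np.all(W @ z <= q + 1e-9)]
    # the true projection is the feasible candidate closest to v
    return min(feas, key=lambda z: np.linalg.norm(v - z))
```

For v = (5, 2) (region R₂) this returns the vertex (2, 1); for v = (3, 4) (region R₃) it returns (1, 2), matching the piecewise formula.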
Figure 1 plots g(v) and the two components of ∇g(v) for v ∈ D = x̃* + [−4, 4] × [−4, 4] with N = 4952.

[Figure 1: surface plots over D of g(v) and of the first and second components of z*(v); the point x̃* is marked with an asterisk.]
Clearly, g(v) is once continuously differentiable, but it has no second derivative. We choose 4 starting points, one in each of the 4 regions: (−1, −1) ∈ R₁, (5, 2) ∈ R₂, (3, 4) ∈ R₃ and (2, −2) ∈ R₄, and test Algorithm 3.1 on 4 problems (1.3) with the 4 approximate solutions x̃*: (3, 1) ∈ R₂, (3, 2) ∈ R₂ ∩ R₃, (1, 2) ∈ R₁ ∩ R₃ and (2, 1) ∈ R₁ ∩ R₂ ∩ R₃ ∩ R₄. The numerical results with the convergence criterion ||F(λ_k)|| ≤ 5 × 10⁻⁷ are shown in Table 1.

Example 2. Let n = 20, r = 8, m = 4, m₁ = 2 and N = 10,000. Matrices A ∈ R^{r×n}, b ∈ R^r, c ∈ R^n, T ∈ R^{m×n}, q ∈ R^{m₁}, W ∈ R^{m₁×m}, P ∈ R^{n×n}, H ∈ R^{m×m} and the data h(ω_i) ∈ R^m, ρ̄(ω_i), i = 1, ..., N, are randomly selected. Meanwhile, a solution of (1.3) with these matrices is generated. The numerical results with a random starting point and the convergence criterion ||F(λ_k)|| ≤ 10⁻⁸ are shown in Figure 2.

Example 3. Since the error incurred in passing from (1.2) to (1.3) comes only from numerical integration (|Φ(y) − f(y)|), and Φ is defined implicitly as the optimal value of an optimization problem, we consider choosing a numerical integration rule which offers savings in the number of function evaluations and also offers the possibility of error estimation.
Table 1: The iteration number k, ||x_k − x̃*||, |θ̂(x_k) − θ̂(x̃*)| and ||F(λ_k)||.

  x̃*      x₀        k    ||x_k − x̃*||    |θ̂(x_k) − θ̂(x̃*)|   ||F(λ_k)||
  (3,1)   (5,2)     6    3.0042e-5       2.5601e-6           2.0729e-8
          (3,4)     7    3.0005e-5       2.5601e-6           1.9755e-8
          (-1,-1)   8    3.0039e-5       2.5601e-6           1.7920e-8
          (2,-2)    6    3.0222e-5       2.5601e-6           2.1935e-7
  (3,2)   (5,2)     6    4.5650e-6       2.8733e-6           1.4179e-7
          (3,4)     7    4.8014e-6       2.8733e-6           1.2249e-7
          (-1,-1)   8    4.3170e-6       2.8733e-6           4.1988e-7
          (2,-2)    7    4.9945e-6       2.8732e-6           3.3767e-7
  (1,2)   (5,2)    12    3.8905e-5       5.1846e-6           3.6060e-7
          (3,4)    13    3.9696e-5       5.1846e-6           4.7172e-7
          (-1,-1)  15    3.9460e-5       5.1846e-6           2.2323e-7
          (2,-2)   14    3.8933e-5       5.1846e-6           3.3142e-7
  (2,1)   (5,2)     7    5.2190e-5       6.5892e-8           1.3667e-7
          (3,4)     8    5.1899e-5       6.5892e-8           1.8472e-7
          (-1,-1)   9    5.2194e-5       6.5892e-8           1.4085e-7
          (2,-2)    8    5.1994e-5       6.5892e-8           7.9980e-8
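The Monte-Carlo versus lattice comparison behind Example 3 can be sketched in a few lines. The integrand below (e^{t₁+t₂}, with known integral (e − 1)²) is a hypothetical stand-in for the sampled recourse integrand; the generating vector (1, 34) with N = 89 is a standard Fibonacci choice for 2-d rank-1 lattices, and the periodizing map u = t − sin(2πt)/(2π) with Jacobian factor (1 − cos 2πt) is the transformation used in the paper.

```python
import numpy as np

f = lambda t: np.exp(t[:, 0] + t[:, 1])     # smooth test integrand on [0,1]^2
exact = (np.e - 1.0) ** 2                   # its exact integral

N = 89                                      # a Fibonacci number
rng = np.random.default_rng(1)
mc = np.mean(f(rng.random((N, 2))))         # plain Monte-Carlo estimate

j = np.arange(N)[:, None]
pts = (j * np.array([1, 34]) / N) % 1.0     # rank-1 lattice points {j z / N}
lat = np.mean(f(pts))                       # plain lattice estimate

# periodizing transform u = t - sin(2 pi t)/(2 pi), Jacobian 1 - cos(2 pi t)
per = lambda t: t - np.sin(2 * np.pi * t) / (2 * np.pi)
jac = lambda t: np.prod(1 - np.cos(2 * np.pi * t), axis=1)
lat_per = np.mean(f(per(pts)) * jac(pts))   # lattice estimate of Phi_2-type form
```

The Monte-Carlo error decays like O(N^{−1/2}), while the lattice rule exploits the smoothness of the (periodized) integrand and is typically far more accurate at the same N.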
21. R.J.B. Wets, Stochastic programming: Solution techniques and ap-
proximation schemes, in: Mathematical Programming, The State of
the Art-Bonn 1982, eds. by A. Bachem, M. Gr�otschel and B. Korte
(Springer-Verlag, Berlin, 1983) 566-603.
22. C. Zhu and R.T. Rockafellar, Primal-dual projected gradient algo-
rithms for extended linear-quadratic programming, SIAM J. Optimiza-
tion (to appear).
22
10. A. Prekopa and R.J.-B. Wets, Preface of Stochastic Programming 84,
Mathematical Programming Study 28.
11. L. Qi, Convergence analysis of some algorithms for solving nonsmooth
equations, Mathematics of Operations Research 18(1993) 227-244.
12. L.Qi, Superlinearly convergent approximate Newton methods for LC
1
optimization problems, to appear in Mathematical Programming.
13. L. Qi and R.Womersley, An SQP algorithm for extended linear-quadratic
problems in stochastic programming, Applied Mathematics Preprint
AM92/23, School of Mathematics, the University of New South Wales,
Sydney, Australia.
14. S.M. Robinson, An implicit-function theorem for a class of nonsmooth
equations, Mathematics Operation Research 16(1991) 292-309.
15. R.T. Rockafellar, Convex Analysis, Princeton University Press, Prince-
ton, NJ, 1970.
16. R.T. Rockafellar, Computational schemes for solving large-scale prob-
lems in extended linear-quadratic programming, Mathematical Pro-
gramming 48(1990) 447-474.
17. R.T. Rockafellar and R.J.-B.Wets, A Lagrangian �nite-generation tech-
nique for solving linear-quadratic problems in stochastic programming,
Mathematical Programming Study 28(1986) 63-93.
18. R.T. Rockafellar and R.J.-B. Wets, Linear-quadratic problems with
stochastic penalties: the �nite generation algorithm, in V.I. Arkin, A.
Shiraev and R.J-B. Wets, eds., Stochastic Optimization, Lecture Notes
in Control and Information Sciences 81 (Springer-Verlag, Berlin, 1987)
545-560.
19. I.H. Sloan, Numerical integration in high dimensions-the lattice rule
approach, Numerical Integration (1992) 55-69.
20. P. Tseng, Applications of a splitting algorithm to decomposition in
convex programming and variational inequalities, SIAM J. Control and
Optimization 29(1991) 119-138.
21
jjx
N+20
� x
N
jj
N 200 400 600 800 1000 1200
Monte-Carlo
Lattice
References
1. J.R. Birge and R.J.-B. Wets, Designing approximation schemes for
stochastic optimization problems, in particular, for stochastic programs
with recourse, Mathematical Programming Study 27(1986) 54-102
2. R. Fletcher, Practical Methods of Optimization, Second Edition, John
Wiley & Sons, Ltd, 1987.
3. S. Joe and I.H. Sloan, Imbedded lattice rules for multidimensional in-
tegration, SIAM J. Numerical Analysis 29(1992) 1119-1135.
4. P. Kall, Stochastic programming - an introduction, Sixth International
Conference on Stochastic Programming, Italy, 1992.
5. Y.M. Kaniovski, A.J. King and R.J.-B. Wets, Probabilistic bounds (via
large deviations) for the solutions of stochastic programming problems,
preprint (1993).
6. L. Nazareth and R.J.-B. Wets, Algorithms for stochastic programs:
The case of nonstochastic tenders, Mathematical Programming Study
28(1986) 1-28.
7. L. Nazareth and R.J.-B. Wets, Nonlinear programming techniques ap-
plied to stochastic programs with recourse, in: Numerical Techniques
in Stochastic Programming, eds. by Y. Ermoliev and R. Wets, (Spring-
Verlag, Berlin, 1988) 95-119.
8. H. Niederreiter, Multidimensional numerical integration using pseudo
random numbers, Mathematical Programming Study 27(1986) 17-38.
9. J.S. Pang, Newton's method for B-di�erentiable equations, Mathemat-
ics of Operations Research 15 (1990) 311-341.
20
Let n = 20; r = 4;m = 4;m
1
= 2;m
2
= 2: Matrices A 2 R
r�n
; b 2
R
r
; c 2 R
n
; T 2 R
m�n
; q 2 R
m
1
;W 2 R
m
1
�m
; P 2 R
n�n
and H 2 R
m�m
are
randomly generaly. Let = [�; �]� [�; �] and let � be the normal density
as
�(!) =
1
p
2�jCj
1
2
expf�
1
2
(! � �)
T
C
�1
(! � �)g;
where � 2 R
2
is the mean value and C 2 R
2�2
is the coveriance matrix. Let
C = L
T
L be the Choleski factorization of C, � = jCj and ! = Lv+ �. Then
�(!) =
1
p
2��
2
exp(�
1
2
v
T
v):
We chose C = I; � = 0 and � = ��. Let ! = 2�u�� and u = t�
1
2�
sin2�t.
Then
(y) =
Z
�
��
Z
�
��
g(y; !)�(!)d!
= (2�)
2
Z
1
0
Z
1
0
g(y; �(2u� 1))�(�(2u � 1))du(=:
1
(y))
= (2�)
2
Z
1
0
Z
1
0
g(y; �(2t�
1
�
sin2�t� 1))
�(�(2t �
1
�
sin2�t� 1))(1 � cos2�t
1
)(1� cos2�t
2
)dt(=:
2
(y))
We use simple Monte Carlo method to value
1
(y). Select N seven-decimal
two-dimensional vectors at random from a uniform distribution in [0; 1] �
[0; 1].
We use simple lattice method to value
2
(y). Select N two-dimensional
vectors ffj
(1;2)
N
gg
N�1
j=0
. The braces indicate that each component of the vector
is to be replaced by its fractional part: that is f!g = ! � [!]. [!] denoting
the largest integer which does not exceed.
We comppare the use of the simple Monte-Carlo method and the simple
lattice method in Algorithm 3.2 by testing jjx
N+20
� x
N
jj with di�erent N .
The numerical results are shown in Table 1.
Acknowledgemets
The authors wish to thank S. Joe, T. Langtry and I.H. Sloan for dis-
cussions on multidimensional numerical integration. The authors are also
thankful to A.J. King and R.J.-B. Wets for their preprint.
19
x0=c
0
1
2
3
4
5 x106
0 5 10
Iterations
||fx|
|
10-9
10-6
10-3
100
103
106
0 5 10
Iterations||
x -
x* ||
0
0.5
1
1.5
2
2 4 6 8 10
Iterations
Rat
io ||
F(x
)||
0
2000
4000
6000
8000
10000
2 4 6 8 10
Iterations
||F(x
)||
mean value of the integrand sampled at points chosen from an appropriate
statistical distribution function. The methods are e�ective when integrand
function g(y; !)�(!) is smooth with respect to !. However, the methods
do not converge very fast with the rate of convergence being O(
p
N). The
lattice methods are based on number theory. The methods converge faster
and have sharper error bound than Monte-Carlo methods. However, the in-
tegrand function g(y; !)�(!) is assumed to be 1-periodic in each of its m
2
variables and is assumed to be half-open unit cube [0; 1)
m
2
. Monte-Carlo
methods have been applied to stochastic programming recently. See [5].
However, it seems to that the lattice methods have never been used in this
area. We use a trasformation function ! = q(t) suggested by Sloan to rewrite
� by
�(z) =
Z
1
0
:::
Z
1
0
g(y; q(t))�(q(t))q
0
(t)dt;
where g(y; q(t))�(q(t))q
0
(t) satisfys these assumptions for lattice methods.
18
Zg is unique. Hence p
�
=5f(y
�
) = �
1
N
P
N
i=1
z(!
i
; y
�
) is unique, so �
�
is the
unique solution of (1.4).
Since A has full row rank, the Lagrange multipliers �
�
at (x
�
; y
�
) corre-
sponding to Ax � b is unique and the linear independence condition of (1.3)
holds at the unique Kuhn-Tucker point �
�
= (x
�
; y
�
; �
�
; p
�
)
T
. Furthermore
B-regularity implies that the second-order su�ciency condition of (1.3) is sat-
is�ed at �
�
. Therefore all the conditions of Lemma 3.1 are satis�ed. Hence
we have
lim
k!1
jjF (~�
k
)jj
jjF (~�
k�1
)jj
= 0:
It implies that there is k
0
such that for all k � k
0
, (3.7) holds. Therefore,
the sequence f�
k
g
1
k
0
generated by Algorithm 3.2 is same as it is generated by
using (3:6)
0
. By Lemma 3.1, Algorithm 3.2 converges superlinearly.
2
4. Numerical Experiments
In this section, we randomly generate problems (1.3) and problems (1.2)
with two dimensional integral. The �rst example is given to demonstrate the
e�ciency of Algorithm 3.2 for problem (1.3). The second example is given to
to test Algorithm 3.2 with the use of Monte-Carlo methods and the lattice
methods.
The numerical experiments were obtained by using a DEC 5000 work station.
Example 1. Let n = 20; r = 4; N = 200;m = 4;m
1
= 2: Matrices
A 2 R
r�n
; b 2 R
r
; c 2 R
n
; T 2 R
m�n
; q 2 R
m
1
;W 2 R
m
1
�m
; h(!) 2 R
m
; P 2
R
n�n
and H 2 R
m�m
are randomly generaly. Meanwhile a solution of (1.3)
with these matrices is generated. We used the trust region method to solve
the quadratic programming (3.6) at each step (see [2] for example). The
numerical results with starting point x
0
= c are shown in Figure 1.
Since the error occurs only from numerical integration (jj�(y) � f(y)jj)
and � is de�ned implicitly as the optimal value of an optimization prob-
lem, we consider choosing a numerical integration rule which o�ers savings
in the number of function evaluations and also o�ers the possibility of error
estimation. Monte-Carlo methods and lattice methods are two popular nu-
merical integration rules. Monte-Carlo methods are based on estimating the
17
Theorem 2.2 in [13], (x
�
; y
�
) is unique.
2
Lemma 3.1. Suppose that w^* = (x^*, y^*, λ^*, p^*) ∈ R^n × R^m × R^r × R^m is a Kuhn-Tucker point of (1.3) and that w^* satisfies the second-order sufficiency conditions, the strict complementarity condition and the linear independence condition of (1.3). Then the approximate Newton method

  minimize_{d_x, d_y}  (Px^k + c)^T d_x + ½ d_x^T (P + εI) d_x + Q(y^k)^T d_y + ½ d_y^T U(y^k) d_y
  subject to  A(x^k + d_x) ≤ b,
              T d_x − d_y = 0,                                   (3.6)'

  x^{k+1} = x^k + d_x^k,   y^{k+1} = y^k + d_y^k,

is well-defined and w^k = (x^k, y^k, λ^k, p^k)^T converges superlinearly to w^*. If F(ν^k) ≠ 0, where F(ν^k) is calculated with λ^k and ν^k = (x^k, y^k, p^k), then

  lim_{k→∞} ||F(ν^k)|| / ||F(ν^{k−1})|| = 0.

Proof. By Proposition 2.1 and Theorem 2.1, ∇f is semismooth. Since the constraints of (1.3) are linear and

  ( P   0      )        ( Px^k + c )
  ( 0   U(y^k) )  ∈ ∂_B ( ∇f(y^k)  ),

all conditions of Theorem 3.1 in [12] hold. By that theorem, Lemma 3.1 holds. □
Theorem 3.3. Let ε = 0 and δ = 0 in Algorithm 3.2. Suppose that the algorithm does not stop in a certain iteration. Suppose that the problem (1.3) is B-regular and that A has full row rank. If the strict complementarity condition holds for (1.3) at the unique Kuhn-Tucker point, then there is k_0 such that for all k ≥ k_0, (3.7) holds and the sequence {ν^k : k ≥ k_0} converges superlinearly to the unique solution ν^* of (1.4).
Proof. By Theorems 3.1 and 3.2, a sequence {ν^k} generated by the algorithm will converge to a solution ν^* of (1.4). Since (x^*, y^*) is unique and H is symmetric positive definite, z(ω_i, y^*) = argmax_z { −½ z^T H z + (h(ω_i) − y^*)^T z : z ∈
If ||F(ν̃^k)|| ≤ δ, stop.
If k = 0, go to step 3.
If

  ||F(ν̃^k)|| / ||F(ν^{k−1})|| ≤ η_1,   (3.7)

let x^{k+1} = x^k + d_x^k, y^{k+1} = y^k + d_y^k and go to iteration k + 1; otherwise, go to step 3.

3. Let i_k be the minimum integer i ≥ 0 such that

  θ(x^k + β^i d_x^k, y^k + β^i d_y^k) ≤ θ(x^k, y^k) − (σ/2) β^i ((d_x^k)^T (P + εI) d_x^k + (d_y^k)^T U(y^k) d_y^k).

Let x^{k+1} = x^k + β^{i_k} d_x^k, y^{k+1} = y^k + β^{i_k} d_y^k.
Theorem 3.2. Let δ = 0 in Algorithm 3.2. If Algorithm 3.2 stops at the kth iteration, then (x̃^k, ỹ^k) is an optimal solution of the problem (1.3). Otherwise, if step 3 is used for all large k, then two infinite sequences {x^k} and {y^k} are generated by the algorithm, and the conclusions of Theorem 3.1 hold. Otherwise, two infinite sequences {x^k : k ∈ K_0} and {y^k : k ∈ K_0} are generated by the algorithm, where K_0 = {k : step 3 is not applied at iteration k}. If x^* is an accumulation point of {x^k : k ∈ K_0}, then y^* = Tx^* is an accumulation point of {y^k : k ∈ K_0} and (x^*, y^*) is a solution of (1.3). If the problem (1.3) is B-regular, then (x^*, y^*) is unique.

Proof. By the same argument as in the proof of Theorem 3.1, if the algorithm stops at the kth iteration, then (x̃^k, ỹ^k) is an optimal solution of (1.3). If step 3 is used for all large k, then we have the same situation as in Theorem 3.1. Therefore we assume that the algorithm does not stop at step 1 and that step 3 is not always used for large k. Then a sequence {ν^k : k ∈ K_0} is generated by the algorithm, where K_0 is defined in the statement of the theorem. Assume that ν^* is an accumulation point of {ν^k : k ∈ K_0}. Let K be a subsequence of K_0 such that {ν^k : k ∈ K} converges to ν^*. By (3.7), we have

  ||F(ν^*)|| = lim_{k ∈ K} ||F(ν^k)|| = 0.

Hence the Kuhn-Tucker conditions of (1.3) hold at (x^*, y^*). By Theorem 9.4.2 in [2], (x^*, y^*) is a solution of (1.3) (cf. the proof of Theorem 3.1). By
To consider the superlinear convergence rate, we need the definition of B-regularity [13]. Let v^* = (x^*, y^*)^T be an optimal solution of (1.3) and let

  C^* = ( P   0      )
        ( 0   U(y^*) ).

Let v = (x, y)^T and

  E(v) = ( Px + c )
         ( ∇f(y)  ).

Let Y = { y | y = Tx, x ∈ X } and

  X^* × Y^* = { x ∈ X, y ∈ Y : E(v^*)^T (v − v^*) = 0 }.

Then X^* × Y^* is called the critical face of X × Y in (1.3), and it is independent of the particular choice of x^* [16]. The problem (1.3) is B-regular if

  (v − v^*)^T C^* (v − v^*) > 0,  ∀ x ∈ X^* \ {x^*}, ∀ y ∈ Y^* \ {y^*}.

Clearly, if P + T^T U(y^*) T is positive definite, then the problem (1.3) is B-regular. In particular, if P is positive definite, so that the problem (1.3) is fully quadratic, then it is B-regular.

Using the technique established in [12, 13], we develop a two-stage algorithm for solving problem (1.3). The first-stage algorithm is globally convergent. The second-stage algorithm is superlinearly convergent under the B-regularity condition and other conditions.

Algorithm 3.2. Choose β, σ ∈ (0, 1), ε > 0, δ > 0, η_1 ∈ (0, 1) and ν^0 ∈ D satisfying y^0 = Tx^0. For k ≥ 0:

1. Solve the quadratic program

  minimize_{d_x, d_y}  (Px^k + c)^T d_x + ½ d_x^T (P + εI) d_x + Q(y^k)^T d_y + ½ d_y^T U(y^k) d_y
  subject to  A(x^k + d_x) ≤ b,
              T d_x − d_y = 0.                                   (3.6)

2. Let λ^k and p^k be the Lagrange multipliers at the solution of (3.6) corresponding to A(x^k + d_x) ≤ b and T d_x = d_y, respectively. Let ν̃^k = (x̃^k, ỹ^k, p^k)^T = (x^k + d_x^k, y^k + d_y^k, p^k)^T. Calculate F(ν̃^k) with λ^k (cf. (2.2)).
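Step 1 requires a QP solver in general. In the special case X = R^n (no inequality constraints) and Z = R^m (unconstrained recourse, so f is a smooth quadratic with Q(y) = −H^{−1}(h̄ − y) and U(y) = H^{−1}, h̄ being the mean of the sample points h(ω_i)), eliminating d_y = T d_x turns (3.6) into a single linear solve. The following sketch uses exactly that simplification with made-up data; it is an illustration, not the general algorithm.

```python
import numpy as np

def newton_step(x, P, c, T, Q_y, U_y, eps=1e-8):
    """One step of (3.6) with X = R^n: substituting d_y = T d_x gives
    (P + eps I + T^T U T) d_x = -(P x + c + T^T Q(y))."""
    M = P + eps * np.eye(len(x)) + T.T @ U_y @ T
    d_x = np.linalg.solve(M, -(P @ x + c + T.T @ Q_y))
    return d_x, T @ d_x

# Tiny illustrative instance (assumed data, not from the paper).
rng = np.random.default_rng(0)
n = m = 3
P = np.diag([1.0, 0.5, 0.0])       # symmetric positive semi-definite
c = rng.standard_normal(n)
T = np.eye(m)
H = 2.0 * np.eye(m)                # symmetric positive definite
h_bar = rng.standard_normal(m)     # mean of the sample points h(omega_i)

x = np.zeros(n)
for _ in range(5):
    y = T @ x                      # keep the invariant y^k = T x^k
    Q_y = -np.linalg.solve(H, h_bar - y)   # Q(y) = grad f(y) in this case
    U_y = np.linalg.inv(H)                 # U(y) = H^{-1} in this case
    d_x, d_y = newton_step(x, P, c, T, Q_y, U_y)
    x = x + d_x

grad = P @ x + c + T.T @ (-np.linalg.solve(H, h_bar - T @ x))
```

With ε > 0 tiny, the iteration contracts the error of this quadratic problem by a factor of order ε per step, so the gradient norm is essentially zero after a few iterations.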
Algorithm 3.1. Choose β, σ ∈ (0, 1), ε > 0, δ > 0 and ν^0 ∈ D satisfying y^0 = Tx^0. For k ≥ 0:

1. Solve the quadratic program

  minimize_{d_x, d_y}  (Px^k + c)^T d_x + ½ d_x^T (P + εI) d_x + Q(y^k)^T d_y + ½ d_y^T U(y^k) d_y
  subject to  A(x^k + d_x) ≤ b,
              T d_x − d_y = 0.                                   (3.5)

2. Let λ^k and p^k be the Lagrange multipliers at the solution of (3.5) corresponding to A(x^k + d_x) ≤ b and T d_x = d_y, respectively. Let ν̃^k = (x̃^k, ỹ^k, p^k)^T = (x^k + d_x^k, y^k + d_y^k, p^k)^T. Calculate F(ν̃^k) with λ^k (cf. (2.2)). If ||F(ν̃^k)|| ≤ δ, stop; otherwise go to step 3.

3. Let i_k be the minimum integer i ≥ 0 such that

  θ(x^k + β^i d_x^k, y^k + β^i d_y^k) ≤ θ(x^k, y^k) − (σ/2) β^i ((d_x^k)^T (P + εI) d_x^k + (d_y^k)^T U(y^k) d_y^k),

and let x^{k+1} = x^k + β^{i_k} d_x^k, y^{k+1} = y^k + β^{i_k} d_y^k.
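Step 3 is a standard Armijo backtracking search. A minimal sketch, with an illustrative quadratic objective and assumed parameter values β = σ = 0.5 (none of the data below comes from the paper):

```python
import numpy as np

def armijo_step(theta, x, y, dx, dy, decrease, beta=0.5, sigma=0.5, max_i=50):
    """Return the minimal i >= 0 with
    theta(x + beta^i dx, y + beta^i dy) <= theta(x, y) - (sigma/2) beta^i * decrease,
    where `decrease` plays the role of dx^T (P + eps I) dx + dy^T U(y) dy."""
    base = theta(x, y)
    for i in range(max_i):
        t = beta ** i
        if theta(x + t * dx, y + t * dy) <= base - 0.5 * sigma * t * decrease:
            return i
    raise RuntimeError("line search failed")

# Illustrative objective theta(x, y) = ||x||^2 + ||y||^2 (assumed).
theta = lambda x, y: x @ x + y @ y
x = np.array([2.0, 0.0]); y = np.array([0.0, 2.0])
dx = -x; dy = -y                       # descent directions
decrease = dx @ dx + dy @ dy
i = armijo_step(theta, x, y, dx, dy, decrease)
```

Here the full step i = 0 already satisfies the decrease condition.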
Theorem 3.1. Let δ = 0 in Algorithm 3.1. If Algorithm 3.1 stops at the k-th iteration, then (x̃^k, ỹ^k) is an optimal solution of the problem (1.3). Otherwise, two sequences {x^k} and {y^k} are generated by the algorithm. If x^* is an accumulation point of {x^k}, then y^* = Tx^* is an accumulation point of {y^k} and (x^*, y^*) is an optimal solution of problem (1.3).

Proof. If the algorithm stops at the kth iteration, then 0 = F(ν̃^k) ∈ L(ν̃^k). Therefore the Kuhn-Tucker conditions for (1.3) hold at (x̃^k, ỹ^k). By Proposition 2.1 and Theorem 2.1, the objective function ½ x^T P x + c^T x + f(y) is continuously differentiable in R^n × R^m. Hence (1.3) is a convex program in which the objective and the constraints are C^1 functions. By Theorem 9.4.2 in [2], (x̃^k, ỹ^k) is an optimal solution of problem (1.3). If the algorithm does not stop in a certain iteration, two infinite sequences are generated. It is a standard proof to show that an accumulation point (x^*, y^*) is an optimal solution of (1.3) (see [13]). □
In this section, we first consider how to implement the Newton-like method (1.6) for problem (1.3) and give a practical algorithm. Next, we give a global convergence theorem. Finally, we show that the convergence rate of the algorithm is superlinear.

The Newton-like method (1.6) can be rewritten as

  solve V_k d + F(ν^k) = 0 to get d^k;  let ν^{k+1} = ν^k + d^k.   (3.1)

Since V_k is symmetric positive semi-definite, (3.1) is equivalent to

  d^k = argmin_d { F(ν^k)^T d + ½ d^T V_k d },  ν^{k+1} = ν^k + d^k.   (3.2)

Let d = (d_x, d_y, d_p). Considering (3.2) in D̄, we have

  d^k = argmin_d { (Px^k + c)^T d_x + ½ d_x^T P d_x + Q(y^k)^T d_y + ½ d_y^T U(y^k) d_y
                   + (p^k + 2 d_p)^T (T d_x − d_y) + d_p^T (Tx^k − y^k) : x^k + d_x ∈ X },
  ν^{k+1} = ν^k + d^k.   (3.3)

If we take an initial point ν^0 ∈ D satisfying Tx^0 = y^0 and demand T d_x = d_y at each step in (3.3), then we have

  minimize_{d_x, d_y}  (Px^k + c)^T d_x + ½ d_x^T P d_x + Q(y^k)^T d_y + ½ d_y^T U(y^k) d_y
  subject to  A(x^k + d_x) ≤ b,
              T d_x − d_y = 0,

  x^{k+1} = x^k + d_x^k,   y^{k+1} = y^k + d_y^k.   (3.4)

Since P is only positive semi-definite, singularity can occur when we run (3.4). To ensure that the algorithm is well-defined, we modify (3.4) by using P + εI, where ε is a positive number, in the generalized Jacobian V_k. To obtain global convergence, we add a line search to (3.4). Denote the objective function in (1.3) by θ(x, y), i.e.

  θ(x, y) = ½ x^T P x + c^T x + f(y).
Obviously, 0 ∈ S(u). Let d = Π_S(u) − u. Then d ≠ 0 and s^T d = 0 for any s ∈ S(u). Hence Π'_S(u; d) = Π_{S(u)}(d) = 0.

Since Π_S is piecewise smooth, Π_S is semismooth. By Lemma 2.1 in [11], there exists U ∈ ∂_B Π_S(u) such that Π'_S(u; d) = U d = 0. This implies that U is singular and that H^{−1/2} U H^{−1/2} ∈ ∂_B Q_i(y) is singular. □
Proposition 2.3. For any ν ∈ D̄, there is a unique x̄ = A^T λ ∈ R^n such that

  F(ν) = ( Px + c + T^T p + A^T λ )
         ( ∇f(y) − p              )
         ( Tx − y                 ),   (2.2)

where λ ≥ 0 and λ^T (Ax − b) = 0. Furthermore, if A has full row rank, then such λ in (2.2) is unique.

Proof. By the definition, F(ν) = m(L(ν)) and

  ||m(L(ν))|| = min { ||τ|| : τ ∈ L(ν) }.

Let σ = −(Px + c + T^T p). Since X is a nonempty polyhedron, ∂ψ_X(x), x ∈ X, is a nonempty closed convex set, and so is L(ν). By separation theorems on convex sets [15], there is a unique F(ν) = m(L(ν)) ∈ L(ν) with smallest norm, and

  ||F(ν)||² = min_{η ∈ ∂ψ_X(x)} ||η − σ||² + ||∇f(y) − p||² + ||Tx − y||²
            = ||Π_{∂ψ_X(x)}(σ) − σ||² + ||∇f(y) − p||² + ||Tx − y||²
            = ||A^T λ − σ||² + ||∇f(y) − p||² + ||Tx − y||².

Therefore (2.2) holds. If A has full row rank, then λ is the unique solution of A^T λ = Π_{∂ψ_X(x)}(σ) = x̄. □
3. Algorithm and Convergence
where ε_k > 0. Then u^k ∉ S and, for any u^k, there is a neighborhood N_{u^k} of u^k such that B̄_{u^k} = B̄_{u^k + h} for any u^k + h ∈ N_{u^k}. By the discussion above, Π_S is differentiable at u^k and ∇Π_S(u^k) = Π_{L(u^k)}. Letting ε_k → 0 and u^k → u, we obtain

  lim_{u^k → u} Π_{L(u^k)} = Π_{L(u)} ∈ ∂_B Π_S(u).

Hence H^{−1/2} Π_{L(u)} H^{−1/2} ∈ ∂_B Q_i(ȳ).

Since ∇f = Q is globally Lipschitz in R^m, F is globally Lipschitz in D. Moreover, (2.1) holds. □
Remark 2.1. Let

  B̃_u = ( B̄_u              )
         ( (Π_S(u) − u)^T ).

Then Π_{L(u)} is the projection from R^m onto the null space N(B̃_u) of B̃_u. We can compute Π_{L(u)} by QR decomposition. Let B̃_u be an l × m matrix with rank(B̃_u) = r > 0. Let B̃_u^T = QR be a QR decomposition of B̃_u^T, where Q = [Q_1, Q_2] is an orthogonal matrix of order m × m (Q_1 ∈ R^{m×r}, Q_2 ∈ R^{m×(m−r)}) and R = [R̄; 0] is an upper triangular matrix of order m × l (R̄ ∈ R^{r×l}). Then we have

  Π_{L(u)} = Q_2 Q_2^T.

We can also compute the projection by singular value decomposition.
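The QR construction of Π_{L(u)} in Remark 2.1 can be sketched as follows; the matrix B below is an arbitrary illustrative stand-in for B̃_u:

```python
import numpy as np

def nullspace_projector(B):
    """Orthogonal projector onto N(B): take the full QR decomposition
    B^T = Q R with Q = [Q1, Q2]; the projector is Q2 Q2^T."""
    Q, R = np.linalg.qr(B.T, mode="complete")
    r = np.linalg.matrix_rank(B)
    Q2 = Q[:, r:]                  # columns spanning the null space of B
    return Q2 @ Q2.T

B = np.array([[1.0, 0.0,  1.0],
              [0.0, 1.0, -1.0]])  # l x m stand-in with rank 2
Pi = nullspace_projector(B)
```

The result is symmetric and idempotent, annihilates the rows of B, and fixes null-space vectors such as (−1, 1, 1).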
Definition 2.2 [11]. Let G : R^m → R^m be a locally Lipschitzian mapping. We say that G is semismooth at y if

  lim_{U ∈ ∂_B G(y + t d'), d' → d, t ↓ 0} { U d' }

exists for any d ∈ R^m.
Proposition 2.2. If H^{−1}(h(ω_i) − y) ∉ Z, then at least one matrix in ∂_B Q_i(y) is singular.

Proof. By Lemma 2.1, Π_S is everywhere directionally differentiable along any direction and Π'_S(u; d) = Π_{S(u)}(d). Let u = H^{−1/2}(h(ω_i) − y). Then u ∉ S.
Since H^{1/2} z̄ = s̄ = Π_S(u) = Π_S(H^{−1/2} v), where v = h(ω_i) − ȳ, we have

  Q_i(ȳ) = −z̄ = −H^{−1/2} Π_S(H^{−1/2}(h(ω_i) − ȳ)).

Furthermore, if u ∈ D_{Π_S}, then Q_i is differentiable at ȳ and

  ∇Q_i(ȳ) = H^{−1/2} ∇Π_S(H^{−1/2}(h(ω_i) − ȳ)) H^{−1/2} = H^{−1/2} Π_{L(u)} H^{−1/2}.

Hence (i)-(iii) follow from Lemma 2.1.

(iv) If z̄ = H^{−1} v ∈ int Z, then u = H^{−1/2} v ∈ int S and B̄_u ≡ 0. Hence Q_i is differentiable at ȳ and L(u) = R^m. This implies ∇Π_S(u) = I and ∇Q_i(ȳ) = H^{−1}.

If z̄ = H^{−1} v is on the boundary of Z, then H^{−1} ∈ ∂_B Q_i(ȳ), by passing to a subsequence {y^k : H^{−1}(h(ω_i) − y^k) ∈ int Z}.

Now we consider the case H^{−1} v ∉ Z. Since u = H^{−1/2} v ∉ S, there exists a neighborhood N_u of u such that u + h ∉ S for any h with ||h|| sufficiently small.

If there exists a neighborhood N_u of u such that B̄_u = B̄_{u+h} for any u + h ∈ N_u, then S(u) = S(u + h). By Lemma 2.1, we have

  Π_S(u + h) = Π_S(u) + Π_{S(u)}(h)

and

  Π_S(u) = Π_S(u + h) + Π_{S(u+h)}(−h).

Hence

  Π_{S(u)}(h) = −Π_{S(u+h)}(−h) = −Π_{S(u)}(−h).

Write s = Π_{S(u)}(h). Then both s and −s belong to S(u). Thus we must have B̄_u s = 0. Hence B̄_u Π_{S(u)} ≡ 0. By Lemma 2.1, Π_S is differentiable at u, so that Q_i is differentiable at ȳ and ∇Q_i(ȳ) = H^{−1/2} ∇Π_S(u) H^{−1/2} = H^{−1/2} Π_{L(u)} H^{−1/2}.

If there is no neighborhood of u on which B̄_u = B̄_{u+h}, then there is an index set J_0 = {i_1, i_2, ..., i_{j_0}} with j_0 ≥ 1 such that (Bu)_{i_j} = q_{i_j}, j = 1, 2, ..., j_0. Take a sequence {u^k} such that

  (Bu^k)_i = (Bu)_i        if i ∉ J_0,
  (Bu^k)_i = (Bu)_i + ε_k  if i ∈ J_0,
where u = H^{−1/2}(h(ω_i) − y);

(iii)

  ∂_B Q_i(y) = H^{−1/2} lim_{u^k → u, u^k ∈ D_{Π_S}} { ∇Π_S(u^k) } H^{−1/2}
             = H^{−1/2} lim_{u^k → u, u^k ∈ D_{Π_S}} { Π_{L(u^k)} } H^{−1/2},

where u^k = H^{−1/2}(h(ω_i) − y^k);

(iv) let

  U_i(y) = H^{−1}                      if H^{−1}(h(ω_i) − y) ∈ Z,
           H^{−1/2} Π_{L(u)} H^{−1/2}  otherwise.

Then U_i(y) ∈ ∂_B Q_i(y). Furthermore, in D,

  F(ν) = ( Px + c + T^T p )
         ( ∇f(y) − p      )
         ( Tx − y         )

is globally Lipschitz and

  V_ν = ( P   0      T^T )     ( P   0         T^T )
        ( 0   U(y)   −I  )  ∈  ( 0   ∂_B Q(y)  −I  )  = ∂_B F(ν),   (2.1)
        ( T   −I     0   )     ( T   −I        0   )

where U(y) = (1/N) Σ_{i=1}^N U_i(y) and ∂_B Q(y) = (1/N) Σ_{i=1}^N ∂_B Q_i(y).

Proof. Let ȳ be a vector in R^m and let v = h(ω_i) − ȳ. Then z̄ = argmax_z { −½ z^T H z + z^T v : z ∈ Z } if and only if for any z ∈ Z,

  (z − z̄)^T (v − H z̄) ≤ 0.

Let s = H^{1/2} z. Then s̄ = H^{1/2} z̄ if and only if for any s ∈ S,

  (s − s̄)^T (H^{−1/2} v − s̄) ≤ 0.

Let u = H^{−1/2} v. By Definition 2.1, s̄ is the projection of u onto S, i.e. s̄ = Π_S(u).

Let B̄_u denote the submatrix of B comprising the rows that correspond to the active constraints of the inequalities Bs ≤ q at the vector Π_S(u).
where B ∈ R^{m_1 × m}.

For an arbitrary vector u, let B̄_u denote the submatrix of B comprising the rows that correspond to the active constraints of the inequalities Bs ≤ q at the vector Π_S(u). Define the polyhedral cone

  S(u) = { s : (Π_S(u) − u)^T s = 0, B̄_u s ≤ 0 }

and the lineality space of S(u),

  L(u) = S(u) ∩ (−S(u)).

We summarize the properties of the projection function which are used in this paper.

Lemma 2.1.
(i) [14] Π_S is a contraction, i.e., it is Lipschitz with modulus 1.
(ii) [9] Π_S is everywhere directionally differentiable along any direction and Π'_S(u; d) = Π_{S(u)}(d). Furthermore, for any vector h with ||h|| sufficiently small,

  Π_S(u + h) = Π_S(u) + Π_{S(u)}(h).

(iii) [9] Π_S is F-differentiable at a vector u if and only if B̄_u Π_{S(u)} is identically equal to zero; in this case ∇Π_S(u) = Π_{L(u)}. (Here B̄_u Π_{S(u)} denotes the composite map (B̄_u Π_{S(u)})(d) = B̄_u (Π_{S(u)}(d)).)

By Lemma 2.1 and Rademacher's theorem, Π_S is differentiable almost everywhere. Let D_{Π_S} be the set where Π_S is differentiable.

Theorem 2.1. Let B = W H^{−1/2}. Then

  S = { s | Bs ≤ q, s ∈ R^m } = { s | s = H^{1/2} z, z ∈ Z }

and
(i) Q_i is differentiable almost everywhere;
(ii) Q_i is F-differentiable at y if and only if B̄_u Π_{S(u)} ≡ 0,
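Lemma 2.1(ii) can be checked by hand in a simple case. For S = R³₊ (that is, B = −I, q = 0) one has Π_S(u) = max(u, 0) componentwise, and at u = (1, −1, 0) the cone works out to S(u) = {s : s_2 = 0, s_3 ≥ 0}. A small numerical check under exactly these assumptions:

```python
import numpy as np

def proj_orthant(u):
    # Projection onto S = R^3_+ is a componentwise clip at zero.
    return np.maximum(u, 0.0)

def proj_cone_at(u, h):
    # Projection onto S(u) for u = (1, -1, 0): S(u) = { s : s_2 = 0, s_3 >= 0 },
    # so zero the second component and clip the third (worked out by hand).
    return np.array([h[0], 0.0, max(h[2], 0.0)])

u = np.array([1.0, -1.0, 0.0])
for h in [np.array([0.01, 0.02, 0.03]), np.array([-0.01, -0.02, -0.03])]:
    lhs = proj_orthant(u + h)
    rhs = proj_orthant(u) + proj_cone_at(u, h)
    assert np.allclose(lhs, rhs)
```

For ||h|| small enough that the active set does not change, the expansion Π_S(u + h) = Π_S(u) + Π_{S(u)}(h) holds exactly in this piecewise-linear case.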
In this section, we show that F is globally Lipschitz in D and give the generalized Jacobian ∂_B F(ν) for ν ∈ D.

Proposition 2.1. Let ω_i ∈ R^{m_2} be fixed, i = 1, 2, ..., N, let

  Q_i(y) = −argmax_z { −½ z^T H z + (h(ω_i) − y)^T z : z ∈ Z }

and

  Q(y) = (1/N) Σ_{i=1}^N Q_i(y).

Then Q(y) = ∇f(y).

Proof. Since H is positive definite, for any y ∈ R^m and ω_i ∈ R^{m_2} there exists a unique z(y, ω_i) such that z(y, ω_i) = argmax_z { −½ z^T H z + (h(ω_i) − y)^T z : z ∈ Z }. By Theorem 27.1 in [15] on convex conjugate functions,

  g(y, ω_i) = max_z { −½ z^T H z + (h(ω_i) − y)^T z : z ∈ Z }

is differentiable at y and

  ∂g/∂y (y, ω_i) = −z(y, ω_i).

Hence we have

  ∇f(y) = (1/N) Σ_{i=1}^N ∂g/∂y (y, ω_i) = −(1/N) Σ_{i=1}^N z(y, ω_i) = (1/N) Σ_{i=1}^N Q_i(y) = Q(y). □
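The identity ∂g/∂y = −z(y, ω_i) can be verified numerically in a case where the recourse problem has a closed-form maximizer: take H diagonal and Z a box, so the unconstrained maximizer is clipped componentwise. All data below are made up for illustration:

```python
import numpy as np

d = np.array([2.0, 1.0, 4.0])     # H = diag(d), positive definite
lo, hi = -1.0, 1.0                # Z = [-1, 1]^3 (box, assumed for the demo)
h = np.array([3.0, 0.2, -5.0])

def z_of(y):
    # Maximizer of -0.5 z^T H z + (h - y)^T z over the box: clipped Newton point.
    return np.clip((h - y) / d, lo, hi)

def g(y):
    z = z_of(y)
    return float(-0.5 * d @ (z * z) + (h - y) @ z)

y = np.array([0.3, -0.1, 0.7])    # generic point (no component exactly at a kink)
z = z_of(y)
eps = 1e-6
num = np.array([(g(y + eps * e) - g(y - eps * e)) / (2 * eps) for e in np.eye(3)])
```

At this y the central finite differences of g agree with −z to the accuracy of the difference scheme, as Theorem 27.1 of convex conjugacy predicts.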
Definition 2.1. Let S be a nonempty, closed and convex subset of R^m. The projection of a point u ∈ R^m onto S, denoted Π_S(u), is defined as the solution (which exists and is unique) of the mathematical program

  min_{s ∈ S} ||u − s||.

Let S be the polyhedron

  S = { s : Bs ≤ q, s ∈ R^m },
consider the nonsmooth equations

  F(ν) = 0,  ν ∈ D̄.   (1.4)

If ν^* is a solution of (1.4), then the Kuhn-Tucker conditions for (1.3) hold at (x^*, y^*). Since (1.3) is a convex program and the objective and the constraints of (1.3) are C^1 functions, by Theorem 9.4.2 in [2], (x^*, y^*) is a global solution of (1.3).

In Section 2, we show that F is globally Lipschitz in D. By Rademacher's theorem, F is then differentiable almost everywhere in D. Let D_F be the set where F is differentiable. We use the generalized Jacobian defined in [11],

  ∂_B F(ν) = { lim_{ν^k → ν, ν^k ∈ D_F} ∇F(ν^k) }.   (1.5)

We consider the generalized Newton method

  ν^{k+1} = ν^k − V_k^{−1} F(ν^k),   (1.6)

where ν^0 ∈ D, V_k ∈ ∂_B F(ν̂^k) and

  ν̂^k = ν^k                if ν^k ∈ D,
         (x^0, y^k, p^k)^T  otherwise.

In Section 2, we give the generalized Jacobian ∂_B F(ν) for ν ∈ D by using Pang's results on the projection function [9].
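To show the flavour of a ∂_B-based Newton iteration such as (1.6) without the full machinery, here is the same idea applied to a small, unrelated semismooth system F(x) = min(x, Mx + q) = 0, selecting one element of ∂_B F(x) per step (an illustrative toy, not the method of this paper):

```python
import numpy as np

M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-3.0, -3.0])

def F(x):
    return np.minimum(x, M @ x + q)

def V(x):
    """One element of the B-subdifferential of F at x: row i is e_i^T where
    the min picks x_i, and the i-th row of M where it picks (Mx+q)_i."""
    J = np.eye(2)
    active = M @ x + q < x
    J[active] = M[active]
    return J

x = np.array([2.0, 2.0])
for _ in range(10):
    if np.linalg.norm(F(x)) < 1e-12:
        break
    x = x - np.linalg.solve(V(x), F(x))
```

On this piecewise-linear system the iteration terminates finitely at the solution x = (1, 1), where Mx + q = 0 and x > 0.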
In Section 3, we consider how to implement the method (1.6) and give a practical algorithm. We present a global convergence theorem for the algorithm and show that its convergence rate is superlinear.

When we obtain a solution of (1.3), two basic questions arise: how good an estimate is the optimal value of (1.3) for the optimal value of (1.2), and how well do the solutions of (1.3) approximate the solutions of (1.2) [5]? In Section 4, while solving (1.3) by our algorithm, we consider which integration rule offers a sharper error bound for ||Φ(y) − f(y)|| while saving function evaluations. We give numerical experiments to demonstrate the efficiency of our algorithm and to compare the use of Monte-Carlo methods and lattice methods in the algorithm.
2. Generalized Jacobian
ELQP [13,16-18,21,22]. However, most of them are less efficient in the general case. Even when the problems are fully quadratic, only linear convergence rates were established. Recently, Qi and Womersley [13] presented a two-stage sequential quadratic programming method for the box-diagonal case and showed that the rate of convergence of their method is superlinear.

The methods given in [13, 16-18, 22] used the difference between the objective values of (1.3) and its dual problem as a convergence criterion. However, it is too expensive to calculate both the primal and the dual values at each step for a general problem. When Z is a box, the problem corresponds to the simple recourse problem, which is relatively easy. The general case is typically very hard [4,10].

In this paper, we present a new method. The method is efficient in the general case and achieves both global convergence and superlinear convergence. Furthermore, we do not need to calculate the dual problem.

Throughout this paper, we use the Euclidean norm.

Assume that there exists an optimal solution (x^*, y^*) of (1.3). According to Theorem 28.2 in [15], there exists an optimal Lagrange multiplier vector p^* ∈ R^m associated with the constraint Tx − y = 0. From the Kuhn-Tucker conditions for (1.3), we then have

  0 ∈ Px^* + c + ∂ψ_X(x^*) + T^T p^*,
  0 = ∇f(y^*) − p^*,
  0 = Tx^* − y^*,

where ψ_X is the indicator function of X. Since X is a convex set, ψ_X is an extended-valued convex function. By convex analysis [15], ∂ψ_X(x) is the normal cone to X at x, which is given by

  ∂ψ_X(x) = { A^T λ | λ ≥ 0, λ^T (Ax − b) = 0 }  if x ∈ X,
  ∂ψ_X(x) = ∅                                    if x ∉ X.

Let ν = (x, y, p) and let L : R^{2m+n} → R^{2m+n} be the multifunction given by

  L(ν) = ( Px + c + ∂ψ_X(x) + T^T p )
         ( ∇f(y) − p                )
         ( Tx − y                   ).

Let m(L(ν)) denote the element of L(ν) with the smallest norm and let F(ν) = m(L(ν)). Let D = int X × R^m × R^m and D̄ = X × R^m × R^m. We
computational difficulties occur primarily in the evaluation of Φ (m variables) and usually m ≪ n. See [6, 7].

Since it is impossible to demand the exact evaluation of the function Φ and its gradient, we consider optimal solutions obtained by solving approximate problems of the form

  minimize_{x ∈ R^n, y ∈ R^m}  ½ x^T P x + c^T x + f(y)
  subject to  Ax ≤ b,   (1.3)
              Tx − y = 0,

where

  f(y) = Σ_{i=1}^N α_i g(y, ω_i),
  g(y, ω_i) = max_{z ∈ R^m} { −½ z^T H z + z^T (h(ω_i) − y) : Wz ≤ q },

and the weights {α_i}_{i=1}^N and points {ω_i}_{i=1}^N are generated by a multidimensional numerical integration rule. We shall discuss in Section 4 how to approximate Φ by f, a numerical integration formula. We will use both Monte-Carlo methods and lattice methods. In both methods the same weights are chosen, namely α_i = 1/N, i = 1, 2, ..., N. Lattice methods have not been used in stochastic programming before.

Let

  X = { x | Ax ≤ b, x ∈ R^n }

and

  Z = { z | Wz ≤ q, z ∈ R^m }

be nonempty polyhedra. Assume that the interior of X is nonempty.

Since H is symmetric positive definite, f is a differentiable convex function defined on the whole space R^m.

Problem (1.3) can be considered as an instance of the Extended Linear-Quadratic Programming (ELQP) model introduced by Rockafellar and Wets [17, 18]. If both P and H are positive definite, the problem is called fully quadratic. If both P and H are diagonal, and both X and Z are box regions defined by simple lower and upper bounds on the variables, the problem is called box-diagonal. Several numerical methods were developed for solving
1. Introduction

Let P ∈ R^{n×n} be symmetric positive semi-definite and H ∈ R^{m×m} be symmetric positive definite. We consider the two-stage quadratic stochastic program with fixed recourse [17,18]:

  minimize_{x ∈ R^n}  ½ x^T P x + c^T x + Ψ(x)
  subject to  Ax ≤ b,   (1.1)

where

  Ψ(x) = ∫ ψ(x, ω) ρ(ω) dω,
  ψ(x, ω) = max_{z ∈ R^m} { −½ z^T H z + z^T (h(ω) − Tx) : Wz ≤ q }.

Here c ∈ R^n, A ∈ R^{r×n}, b ∈ R^r, T ∈ R^{m×n}, q ∈ R^{m_1} and W ∈ R^{m_1×m} are fixed, ω ∈ R^{m_2} is a random vector with support Ω ⊆ R^{m_2}, ρ is a probability density function on R^{m_2} and h(·) ∈ R^m is a random vector.

By introducing a new variable y, we obtain an equivalent form of (1.1) as follows:

  minimize_{x ∈ R^n, y ∈ R^m}  ½ x^T P x + c^T x + Φ(y)
  subject to  Ax ≤ b,   (1.2)
              Tx − y = 0,

where

  Φ(y) = ∫ g(y, ω) ρ(ω) dω,
  g(y, ω) = max_{z ∈ R^m} { −½ z^T H z + z^T (h(ω) − y) : Wz ≤ q }.

The function Φ is convex and smooth, since H is symmetric positive definite. However, Φ involves multi-dimensional integrals and quadratic programs. Problem (1.2) is useful because it is a convex program in which
Newton's Method for Quadratic Stochastic Programs with Recourse via Nonsmooth Equations¹

Xiaojun Chen, Liqun Qi and Robert S. Womersley
School of Mathematics
University of New South Wales
P.O. Box 1, Kensington NSW 2033, Australia
(June 1993)

Abstract. Quadratic stochastic programs (QSP) with recourse can be formulated as nonlinear convex programming problems. By attaching a Lagrange multiplier vector to the nonlinear convex program, we rewrite the problem as a system of nonsmooth equations. We consider a Newton-like method for solving the system and establish global convergence and local superlinear convergence of the method. Several methods for special types of QSP have been developed; the new method is applicable to general QSP. Numerical experiments are given to demonstrate the efficiency of the algorithm and to compare the use of Monte-Carlo rules and lattice rules for multiple integration in the algorithm.

Keywords: Newton's method, quadratic stochastic programs, nonsmooth equations.

Short title: Newton's method for stochastic programs

¹ This work is supported by the Australian Research Council.
Figure 1.

mean value of the integrand sampled at points chosen from an appropriate statistical distribution. The methods are effective when the integrand g(y, ω)ρ(ω) is smooth with respect to ω. However, the methods do not converge very fast, the rate of convergence being O(N^{−1/2}). Lattice methods are based on number theory. They converge faster and have sharper error bounds than Monte-Carlo methods. However, the integrand is assumed to be 1-periodic in each of its m_2 variables, and the integration region is understood to be the unit cube [0, 1)^{m_2}. Monte-Carlo methods have been applied to stochastic programming
recently; see [6]. However, it seems that lattice methods have not been used in this area before. We use a transformation ω = q(t) suggested by Sloan to rewrite Φ as

  Φ(y) = ∫_0^1 ··· ∫_0^1 g(y, q(t)) ρ(q(t)) q'(t) dt,

where g(y, q(t)) ρ(q(t)) q'(t) is 1-periodic in each of its variables.

Let Ω = R^{m_2} and let ρ be the normal density

  ρ(ω) = (2π)^{−m_2/2} |C|^{−1/2} exp{ −½ (ω − μ)^T C^{−1} (ω − μ) },

where μ ∈ R^{m_2} is the mean value and C ∈ R^{m_2×m_2} is the covariance matrix. Let C = L^T L be the Cholesky factorization of C, Δ = |C|^{1/2} and ω = Lξ + μ. Then

  ρ(ω) = (2π)^{−m_2/2} Δ^{−1} exp(−½ ξ^T ξ).

Without loss of generality, we choose the standard normal density, C = I and μ = 0. Let

  ω = tan v,   (4.1)
  v = πu − π/2,   (4.2)
  u = t − (1/(2π)) sin 2πt.   (4.3)

The sequence of transformations (4.1)-(4.3) is used to go from an integral on R^{m_2} to the integral of a 1-periodic function on the unit cube [0, 1)^{m_2}; each transformation is applied componentwise to the vector arguments. The use of a lattice rule requires the integrand to be 1-periodic, and this is achieved by (4.3); a simple Monte-Carlo method does not need it. Transformations (4.1)-(4.3) are used as follows:

  Φ(y) = ∫_{R^{m_2}} g(y, ω) ρ(ω) dω
       = ∫_{−π/2}^{π/2} ··· ∫_{−π/2}^{π/2} g(y, tan v) ρ(tan v) · (1 / (cos² v_1 ··· cos² v_{m_2})) dv
  = π^{m_2} ∫_0^1 ··· ∫_0^1 g(y, tan(πu − π/2)) ρ(tan(πu − π/2)) · (1 / (cos²(πu_1 − π/2) ··· cos²(πu_{m_2} − π/2))) du   (4.4)

  = π^{m_2} ∫_0^1 ··· ∫_0^1 g(y, tan(π(t − (1/(2π)) sin 2πt) − π/2)) ρ(tan(π(t − (1/(2π)) sin 2πt) − π/2))
    · ((1 − cos 2πt_1) ··· (1 − cos 2πt_{m_2})) / (cos²(π(t_1 − (1/(2π)) sin 2πt_1) − π/2) ··· cos²(π(t_{m_2} − (1/(2π)) sin 2πt_{m_2}) − π/2)) dt.   (4.5)

A simple Monte-Carlo method [2] is used to approximate (4.4) by selecting N points uniformly distributed in [0, 1]^{m_2} for use in (1.3).

A lattice rule [4, 21] is used to approximate (4.5) by

  (1 / (2^{m_2} ℓ)) Σ_{k_{m_2}=0}^{1} ··· Σ_{k_1=0}^{1} Σ_{j=0}^{ℓ−1} g_2(y, { j z / ℓ + (k_1, ..., k_{m_2}) / 2 }),

where g_2(y, ·) denotes the integrand in (4.5) and ℓ is an odd number. The braces indicate that each component of the vector is to be replaced by its fractional part: that is, {ω} = ω − [ω], where [ω] denotes the largest integer which does not exceed ω. A "good" lattice rule depends on a good choice of the vector z. Very recently, Joe and Sloan (private communication) have proposed a table of recommended choices of z. For use in this paper, we quote a part of the table; see Appendix A.

Example 2. Let n = 20, r = 8, m = 4, m_1 = 2. Matrices A ∈ R^{r×n}, b ∈ R^r, c ∈ R^n, T ∈ R^{m×n}, q ∈ R^{m_1}, W ∈ R^{m_1×m}, P ∈ R^{n×n} (with rank(P) = n − 1) and H ∈ R^{m×m} are randomly selected. We consider the problem (1.2) with integral dimension m_2 = 2 and m_2 = 3, respectively. We choose h(ω) = (ω_1, ω_2, 12.85, 12.85) for m_2 = 2 and h(ω) = (ω_1, ω_2, ω_3, 12.85) for m_2 = 3.

We use the same data to compare the use of the simple Monte-Carlo method and the lattice method in Algorithm 3.1. We test ||x̃^* − x_k^N||, |Φ̂(x̃^*) − Φ̂(x_k^N)| (where Φ̂ denotes the objective function of (1.2)), ||F(ν_k)||, computational time and iterations for different N, where x_k^N is an approximate solution of (1.3) obtained by Algorithm 3.1, k is the first iteration which satisfies the convergence criterion, and x̃^* is an approximate solution of (1.2) obtained by the lattice method with N = 40028 (for m_2 = 2) and N = 40024 (for m_2 = 3). The numerical results with convergence criterion ||F(ν_k)|| ≤ 10^{−7} are shown in Table 2 and Table 3.
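The qualitative gap between the two rules is easy to reproduce on a smooth 1-periodic test integrand (not the recourse integrand of this paper; the generator z = (1, 610) with N = 987 is a classical Fibonacci choice, not taken from Appendix A):

```python
import numpy as np

def f(t):
    # Smooth, 1-periodic in each variable; exact integral over [0,1)^2 is 1.
    return (1 - np.cos(2 * np.pi * t[..., 0])) * (1 - np.cos(2 * np.pi * t[..., 1]))

N = 987                          # Fibonacci number
z = np.array([1, 610])           # classical rank-1 lattice generator
j = np.arange(N)[:, None]
lattice_pts = (j * z / N) % 1.0  # points { j z / N }
lattice_est = f(lattice_pts).mean()

rng = np.random.default_rng(0)
mc_est = f(rng.random((N, 2))).mean()
```

For this low-frequency integrand the lattice rule is exact up to rounding (no nonzero Fourier mode of f lies in the dual lattice), while simple Monte-Carlo with the same number of points carries the usual O(N^{−1/2}) statistical error.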
Table 2: ||x̃^* − x_k^N|| and |Φ̂(x̃^*) − Φ̂(x_k^N)| (Φ̂ is the objective of (1.2))

  m_2 = 2          Monte-Carlo method                      Lattice method
  N        ||x̃^*−x_k^N||   |Φ̂(x̃^*)−Φ̂(x_k^N)|   ||x̃^*−x_k^N||   |Φ̂(x̃^*)−Φ̂(x_k^N)|
  4996     4.3317×10^-2    7.6913×10^-1          5.2780×10^-11   3.9790×10^-13
  10012    2.8168×10^-2    8.2472×10^-1          4.5340×10^-11   7.3896×10^-13
  20012    1.8876×10^-2    3.3786×10^-1          4.2611×10^-11   2.5011×10^-12
  40028    2.8166×10^-2    4.7179×10^-1          4.8320×10^-11   8.2423×10^-13

  m_2 = 3          Monte-Carlo method                      Lattice method
  N        ||x̃^*−x_k^N||   |Φ̂(x̃^*)−Φ̂(x_k^N)|   ||x̃^*−x_k^N||   |Φ̂(x̃^*)−Φ̂(x_k^N)|
  4952     3.0011×10^-2    1.0059                2.2733×10^-9    6.8212×10^-13
  9992     1.1469×10^-2    4.2687×10^-1          4.2307×10^-5    1.3511×10^-3
  20024    8.8562×10^-3    1.7231×10^-1          4.1973×10^-5    1.3445×10^-3
  40024    9.1314×10^-3    3.5135×10^-1          4.1938×10^-5    1.3434×10^-3

Table 3: Iterations k, ||F(ν_k^N)|| and computational time

  m_2 = 2    Monte-Carlo method               Lattice method
  N          k (||F(ν_k^N)||)     Time        k (||F(ν_k^N)||)     Time
  4996       6 (4.4466×10^-8)     4.5356×10^2  6 (2.2948×10^-9)    6.4168×10^2
  10012      6 (4.5627×10^-10)    9.3588×10^2  6 (1.9620×10^-9)    1.5057×10^3
  20012      6 (9.6096×10^-9)     1.8136×10^3  6 (1.8602×10^-9)    2.5447×10^3
  40028      6 (1.2797×10^-8)     3.5858×10^3  6 (2.1092×10^-9)    5.2971×10^3

  m_2 = 3    Monte-Carlo method               Lattice method
  N          k (||F(ν_k^N)||)     Time        k (||F(ν_k^N)||)     Time
  4952       7 (8.4431×10^-8)     6.7217×10^2  7 (6.7147×10^-8)    7.6836×10^2
  9992       7 (8.1951×10^-8)     1.2207×10^3  7 (5.5788×10^-8)    1.5347×10^3
  20024      7 (7.2925×10^-8)     2.5222×10^3  7 (6.3611×10^-8)    3.0608×10^3
  40024      7 (5.6547×10^-8)     4.8433×10^3  7 (6.2545×10^-8)    6.3041×10^3
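As a sanity check on the periodizing transformations used above, the one-dimensional chain ω = tan(πu − π/2), u = t − sin(2πt)/(2π) should map ∫_R φ(ω) dω for the standard normal density φ to an integral over [0, 1] whose value is 1; a midpoint rule on the periodized integrand recovers this to high accuracy:

```python
import math

def transformed_density(t):
    """Integrand of the periodized form of the integral of the standard
    normal density, after omega = tan(pi*u - pi/2), u = t - sin(2*pi*t)/(2*pi)."""
    u = t - math.sin(2 * math.pi * t) / (2 * math.pi)
    v = math.pi * u - math.pi / 2
    w = math.tan(v)
    phi = math.exp(-0.5 * w * w) / math.sqrt(2 * math.pi)
    jac = math.pi / math.cos(v) ** 2 * (1 - math.cos(2 * math.pi * t))
    return phi * jac

# Midpoint rule; for a smooth 1-periodic integrand this converges very fast.
N = 200
I = sum(transformed_density((j + 0.5) / N) for j in range(N)) / N
```

Near t = 0 and t = 1 the Gaussian factor decays far faster than the Jacobian grows, so the transformed integrand vanishes smoothly at the endpoints, which is exactly what the equal-weight lattice rule relies on.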
Acknowledgements

The authors acknowledge discussions with J.S. Pang and D. Ralph on the projection function and the algorithm. The authors wish to thank S. Joe, T. Langtry and I.H. Sloan for discussions on multidimensional numerical integration. The authors are also grateful to A.J. King and R.J.-B. Wets for their preprint.
References

1. J.R. Birge and R.J.-B. Wets, "Designing approximation schemes for stochastic optimization problems, in particular, for stochastic programs with recourse," Mathematical Programming Study 27 (1986) 54-102.
2. P. Davis and P. Rabinowitz, Methods of Numerical Integration (Academic Press, New York, 1984).
3. R. Fletcher, Practical Methods of Optimization, second edition (John Wiley & Sons, Chichester, 1987).
4. S. Joe and I.H. Sloan, "Imbedded lattice rules for multidimensional integration," SIAM Journal on Numerical Analysis 29 (1992) 1119-1135.
5. P. Kall, "Stochastic programming - an introduction," in: the Sixth International Conference on Stochastic Programming, Italy, 1992.
6. Y.M. Kaniovski, A.J. King and R.J.-B. Wets, "Probabilistic bounds (via large deviations) for the solutions of stochastic programming problems," preprint (1993).
7. L. Nazareth and R.J.-B. Wets, "Algorithms for stochastic programs: the case of nonstochastic tenders," Mathematical Programming Study 28 (1986) 1-28.
8. L. Nazareth and R.J.-B. Wets, "Nonlinear programming techniques applied to stochastic programs with recourse," in: Y. Ermoliev and R.J.-B. Wets, eds., Numerical Techniques for Stochastic Optimization (Springer-Verlag, Berlin, 1988) pp. 95-119.
9. H. Niederreiter, "Multidimensional numerical integration using pseudorandom numbers," Mathematical Programming Study 27 (1986) 17-38.
10. J.S. Pang, "Newton's method for B-differentiable equations," Mathematics of Operations Research 15 (1990) 311-341.
11. J.S. Pang and L. Qi, "A globally convergent Newton method for convex SC1 minimization problems," Applied Mathematics Report AMR 93/3, School of Mathematics, University of New South Wales, Sydney, Australia (1993).
12. A. Prekopa and R.J.-B. Wets, "Preface of Stochastic Programming 84," Mathematical Programming Study 28 (1986).
13. L. Qi, "Convergence analysis of some algorithms for solving nonsmooth equations," Mathematics of Operations Research 18 (1993) 227-244.
14. L. Qi, "Superlinearly convergent approximate Newton methods for LC1 optimization problems," to appear in Mathematical Programming.
15. L. Qi and R.S. Womersley, "An SQP algorithm for extended linear-quadratic problems in stochastic programming," Applied Mathematics Preprint AM 92/23, School of Mathematics, University of New South Wales, Sydney, Australia (1992).
16. S.M. Robinson, "An implicit-function theorem for a class of nonsmooth functions," Mathematics of Operations Research 16 (1991) 292-309.
17. R.T. Rockafellar, Convex Analysis (Princeton University Press, Princeton, NJ, 1970).
18. R.T. Rockafellar, "Computational schemes for solving large-scale problems in extended linear-quadratic programming," Mathematical Programming 48 (1990) 447-474.
19. R.T. Rockafellar and R.J.-B. Wets, "A Lagrangian finite-generation technique for solving linear-quadratic problems in stochastic programming," Mathematical Programming Study 28 (1986) 63-93.
20. R.T. Rockafellar and R.J.-B. Wets, "Linear-quadratic problems with stochastic penalties: the finite generation algorithm," in: V.I. Arkin, A. Shiraev and R.J.-B. Wets, eds., Stochastic Optimization (Lecture Notes in Control and Information Sciences 81, Springer-Verlag, Berlin, 1987) pp. 545-560.
21. I.H. Sloan, "Numerical integration in high dimensions - the lattice rule approach," in: T.O. Espelid and A. Genz, eds., Numerical Integration (Kluwer, Dordrecht, 1992) pp. 55-69.
22. P. Tseng, "Applications of a splitting algorithm to decomposition in convex programming and variational inequalities," SIAM Journal on Control and Optimization 29 (1991) 119-138.
23. R.J.-B. Wets, "Stochastic programming: solution techniques and approximation schemes," in: A. Bachem, M. Grötschel and B. Korte, eds., Mathematical Programming: The State of the Art - Bonn 1982 (Springer-Verlag, Berlin, 1983) pp. 566-603.
24. C. Zhu and R.T. Rockafellar, "Primal-dual projected gradient algorithms for extended linear-quadratic programming," to appear in SIAM Journal on Optimization.
Appendix A: Tables of the vector z

Here we give tables of the vector z obtained by Joe and Sloan. These tables were used in the numerical experiments.

Table 4: Recommended choices of the vector z for m_2 = 2

  ℓ        2^{m_2}ℓ   z
  1249     4996       (512, 1)
  2503     10012      (672, 1)
  5003     20012      (1, 1850)
  10007    40028      (1, 3822)

Table 5: Recommended choices of the vector z for m_2 = 3

  ℓ        2^{m_2}ℓ   z
  619      4952       (233, 436, 1)
  1249     9992       (1010, 136, 1)
  2503     20024      (1, 1868, 1025)
  5003     40024      (1, 2271, 1476)