
Math. Program., Ser. A (2012) 131:131–161. DOI 10.1007/s10107-010-0347-9

FULL LENGTH PAPER

Multicriteria optimization with a multiobjective golden section line search

Douglas A. G. Vieira · Ricardo H. C. Takahashi · Rodney R. Saldanha

Received: 6 November 2006 / Accepted: 3 March 2010 / Published online: 17 April 2010
© Springer and Mathematical Programming Society 2010

Abstract This work presents an algorithm for multiobjective optimization that is structured as: (i) a descent direction is calculated, within the cone of descent and feasible directions, and (ii) a multiobjective line search is conducted over such direction, with a new multiobjective golden section segment partitioning scheme that directly finds line-constrained efficient points that dominate the current one. This multiobjective line search procedure exploits the structure of the line-constrained efficient set, presenting a faster compression rate of the search segment than the single-objective golden section line search. The proposed multiobjective optimization algorithm converges to points that satisfy the Kuhn–Tucker first-order necessary conditions for efficiency (the Pareto-critical points). Numerical results on two antenna design problems support the conclusion that the proposed method can robustly solve difficult nonlinear multiobjective problems defined in terms of computationally expensive black-box objective functions.

Keywords Multiobjective optimization · Feasible directions · Line search · Golden section

Mathematics Subject Classification (2000) 90C29 · 90C30 · 65K05

D. A. G. Vieira · R. R. Saldanha
Departamento de Engenharia Elétrica, Universidade Federal de Minas Gerais, Minas Gerais, Brazil

R. H. C. Takahashi (B)
Departamento de Matemática, Universidade Federal de Minas Gerais, Minas Gerais, Brazil
e-mail: [email protected]


1 Introduction

At the end of the nineteenth century, the study of economic equilibrium phenomena suggested the idea of the simultaneous maximization of several functions, which led to the concept of efficient solutions. In the words of Vilfredo Pareto [24]:

“We will say that the members of a collectivity enjoy maximum ophelimity¹ in a certain position when it is impossible to find a way of moving from that position very slightly in such a manner that the ophelimity enjoyed by each of the individuals of that collectivity increases. That is to say, any small displacement in departing from that position necessarily has the effect of increasing the ophelimity which certain individuals enjoy, and decreasing that which others enjoy, of being agreeable to some, and disagreeable to others.”

This idea, transposed to the context-free general setting of finding the efficient solutions of minimization or maximization problems with conflicting objective functions, has given rise to multiobjective optimization [4]. A further generalization, introduced by Yu [32], based on the observation of an equivalence between partial orders and certain convex cones, has given rise to vector optimization [10]. In contemporary terms, a multiobjective optimization problem is a vector optimization problem in which the partial order is induced by the cone of the non-negative orthant. This paper deals with an instance of the problem of finding the efficient solutions of multiobjective optimization problems (the Pareto-optimal solutions).

This general problem can be instantiated in a number of different formulations, depending on the purpose of the analysis in each case. A group of approaches that can be considered “classical” rely on some scalarization procedure, i.e., they define an equivalent single-objective problem leading to an efficient solution [3,4,9,27,31]. These techniques are usually employed in two situations:

(i) A single efficient solution should be generated, in order to be implemented. In this case, the scalarization procedure should provide either the possibility of defining some “goal” for guiding the optimization procedure, or some indication of the “relative importances” of the objectives, in order to generate a meaningful solution. As an instance of the first case there is the goal programming procedure, and of the second case, the weighted sum procedure [4].

(ii) A solution set should be generated, in order to provide a description of the Pareto-set. The main concerns, for such applications, are: the ability of the algorithm to generate any point within the Pareto-set (the weighted sum procedure, for instance, is unable to generate samples in some regions of the Pareto-set when the problem is non-convex); and the ease of manipulating the algorithm parameters in order to generate an approximately regular sample of the Pareto-set. The epsilon-constraint method, for instance, is likely to conduct a large number of infeasible searches when the number of objectives is greater than two, while the goal programming method is likely to generate points that are more concentrated in some regions than in others when the goal vector is moved uniformly.

1 V. Pareto employed the term ophelimity instead of utility, in order to avoid the common-sense meaning of this last word, which could be in conflict with its economic meaning.

In any case, with the exception of the approach of the weighted sum of the objectives, the scalarized problems lead to scalar optimization problems that are more complicated than the individual optimization of the original objectives. For instance, the constraint structure of the resulting single-objective optimization problem usually becomes more complicated than in the original multiobjective optimization problem [3,4,9,27,31].

In recent years, it has been recognized that the availability of a sample set estimate that describes the whole Pareto-set, as obtained in situation (ii) above, can be important for the development of computer-aided design and computer-aided decision systems. However, it has also become established that the application of scalarization techniques for the task of generating such a sample set is inconvenient for two reasons. First, the design of the sequence of scalar optimization problems with different parameterizations, such that the Pareto-set is covered with a well-distributed sample density, is not a trivial task (see a discussion in [11]). Second, the scalarization procedures are intrinsically more computationally costly than would be necessary, since they do not take advantage of the peculiar structure of the multiobjective problems.

An important class of algorithms for dealing with such a problem is the set of evolutionary multiobjective optimization techniques. Such algorithms, which perform randomized searches, are characterized by the simultaneous processing of an entire set of tentative solutions (a “population”). This parallel search allows the usage of the information about the partial order relation between such tentative solutions, for enhancing the search for the non-dominated solutions, and also the usage of the relative distances between solutions, for enhancing the solution distribution along the Pareto-set. The first approach that fits this description was presented in [14,15]. Two algorithms that are currently part of the state of the art are described in [7,33]. A comprehensive discussion on the topic can be found in [5,6]. Recent studies in the field include, for instance, the choice of strategies for archiving the Pareto-set estimates found by the algorithm [29].

More recently, some hybrid approaches have been proposed. The reference [2] has presented an evolutionary algorithm that employs local searches using gradient information. The descent cone is reformulated such that, for every direction in image space, an associated descent direction in parameter space can be computed, allowing the determination of directions for line searches that reach points that locally dominate a current solution. The paper [30] has employed a scalarization procedure over surrogate approximated functions, for the purpose of correcting solutions within an evolutionary multiobjective algorithm, in this way enhancing both the solution precision and the computational performance of the algorithm.

Another randomized search approach has been proposed recently, based on the idea of constructing stochastic differential equations which have the efficient solution sets as attractors [28]. A set of solutions is obtained in a single run of the algorithm. A similar idea has been developed in [8], with the construction of a discrete dynamical system for which the efficient solution set is an attractor. Set-oriented numerical methods are employed, coupled with a branch-and-bound procedure, in order to generate a tight box-covering of the Pareto-set. The reference [19] presents a procedure for generating polyhedral inner and outer approximations of the Pareto-set. This algorithm, which relies on a sequence of scalarized sub-problems, has an auto-adaptive mechanism that automatically balances the distribution of the Pareto-set samples. A related approach, based on the numeric resolution of a differential equation using continuation methods, is presented in [25]. Yet another approach, based on homotopy methods, has been proposed for the special case of equality-constrained problems [17].

There is yet another situation, however, in which Pareto-optimal points should be generated:

(iii) A non-efficient solution is already available, and a non-dominated solution that dominates the current solution should be generated.

This situation may appear, for instance, as a formal statement of the problem of enhancing the characteristics of an existing technological apparatus. Other contexts for situation (iii) arise as sub-problems within algorithms for efficient solution generation, in the case of the algorithm's outer loop being able to find only approximate non-dominated solutions that should be “corrected” in order to find the final solutions. This may be the case of evolutionary algorithms, as in [2,30].

The first versions of algorithms for situation (iii) were developed within the framework of scalarization methods; see for instance [1]. A key observation about situation (iii) is: since there is no need to find a specific point of the Pareto-set, the multiplicity of the possible solutions can be used in order to tailor more efficient search procedures that find any non-dominated solution that dominates the initial point. This peculiar structure of the multiobjective optimization problems is not used by the scalarization methods, which perform searches for finding specific points of the Pareto-set.

The paper [13] seems to be the first one that employs such problem structure. It proposed a line search method for multiobjective problems that does not rely on a scalarization procedure. That work follows the reasoning: (i) a direction in which there are feasible solutions that dominate the current point is chosen; (ii) a line search based on Armijo's rule is conducted in this direction, in order to find some point that dominates the current one. That work shows that such a procedure converges to a Pareto-critical point (a point that satisfies first-order conditions for Pareto-optimality, see [4,9]). Although no comparison was performed in [13], it is shown here that such a procedure allows a significant reduction in the computational cost of generating non-dominated solutions, in comparison with traditional scalarization methods. A generalization of those ideas, presenting a Newton's method for the computation of the search direction, has been proposed recently in [12].

This paper further exploits the same principle, in order to enhance the computational gains that are achieved. A golden section interval partitioning procedure is proposed here, instead of Armijo's rule, for performing the multiobjective line search. The final point that is delivered by the proposed procedure is Pareto-critical for the one-dimensional multiobjective problem obtained when constraining the original problem to the line of search: this is a stronger property than the one that is assured by the Armijo's rule employed in [13], which guarantees only that the final point obtained by the line search dominates the current one.

In the field of single-objective nonlinear programming, the construction of line-search-based methods involves the choice of a line search procedure. Armijo's rule and the golden section procedure are among the main available alternatives, and the choice between them involves a trade-off between the fast convergence of the Armijo-rule procedure to a point that is better than the current one (while also providing that it is situated reasonably “far from” the current one) and the precise determination of the line-constrained minimum of the objective function provided by the golden section procedure.

The single-objective golden section line search procedure is known to be optimal, among the derivative-free and function-approximation-free line search procedures, in the sense that it requires the minimal number of function evaluations for finding the line-constrained function minimum within a pre-defined ε-tolerance [23]. The golden section multiobjective line search procedure proposed here not only inherits this optimality of the segment compression rate from the single-objective golden section procedure: it leads to a faster interval compression rate than in the single-objective case, because the multiobjective problem allows two segment parts to be discarded simultaneously in some steps. Also, as the Pareto-set constrained to a line is a segment of non-zero measure (except in very rare situations), the proposed procedure can be adjusted to find such Pareto-critical points in a number of steps that is not only finite, but usually small. For these reasons, the trade-off mentioned above between the fast convergence of Armijo's rule and the precision of the golden section procedure becomes more favorable to the golden section method in the multiobjective case than in the single-objective case.

In addition to the new golden section multiobjective line search procedure, this paper also proposes a modified procedure for finding a direction in which to perform the search. Employing the same basic procedure of [13] for guaranteeing that the direction is descending, such direction is also constrained to lie inside the cone of convex combinations of the negative gradients of the objective functions and of the active constraints. This has the advantage of directing the search toward the central portion of the cone of locally dominating solutions.

The proposed procedures are joined into a method that is somewhat simple, relying only on (i) the computation of function and constraint gradients, followed by (ii) the solution of an auxiliary linear optimization problem, for finding the search direction, and by (iii) a golden-section multiobjective line search. The problem is solved with the original structure of constraints and objectives. This method has the same basic structure as the one proposed in [13]. The convergence of this method to a first-order Pareto-critical point can be established in the same way as in that reference.

The paper is structured as follows. Firstly, the multiobjective golden section line search is presented. Afterward, a procedure for selecting a descent direction for the line search is established. Using these procedures, an algorithm with monotonic convergence to the Pareto-critical solution set is built. Some results for real-world engineering design problems are finally presented to highlight the usability of the proposed approach.

Some notations employed here are: X̄ denotes the complement of the binary variable X ∈ {0, 1}; X · Y denotes the and operation between the binary variables X, Y ∈ {0, 1}; (·)′ denotes the transpose of the matrix argument. The line segments bounded by the points a ∈ R^n and b ∈ R^n are denoted as:

[a, b] ≜ {x | x = λa + (1 − λ)b, λ ∈ [0, 1]}
(a, b) ≜ {x | x = λa + (1 − λ)b, λ ∈ (0, 1)}
(a, b] ≜ {x | x = λa + (1 − λ)b, λ ∈ (0, 1]}
[a, b) ≜ {x | x = λa + (1 − λ)b, λ ∈ [0, 1)}

2 Preliminary definitions

Let B(·, ·) denote a binary relation in R^m defined by the set B ⊂ R^m × R^m. This means that for any x1 ∈ R^m and x2 ∈ R^m, B(x1, x2) is true if and only if (x1, x2) ∈ B. A binary relation that satisfies the following properties is called a partial order:

(i) Transitivity: B(x1, x2) and B(x2, x3) imply B(x1, x3);
(ii) Reflexivity: B(x, x) for any x ∈ R^m;
(iii) Antisymmetry: B(x1, x2) and B(x2, x1) imply x1 = x2.

A partial order B(·, ·) is linear if, in addition, it satisfies:

(iv) B(x1, x2) and t ≥ 0 imply B(t x1, t x2);
(v) B(x1, x2) and B(x3, x4) imply B(x1 + x3, x2 + x4).

The following characterization of a linear partial order is useful:

Proposition 1 Suppose that B(·, ·) is a linear partial order in R^m. Then the set

C0 ≜ {x ∈ R^m | B(x, 0)}

is a convex and pointed cone. Conversely, if C ⊆ R^m is a convex and pointed cone, then the relation B(·, ·) defined by

B(x1, x2) ⇔ (x1 − x2) ∈ C

is a linear partial order in R^m.

Proof See [18]. □

A point x1 ∈ R^m is a predecessor of x2 ∈ R^m if x1 ≠ x2 and B(x1, x2) is true. The concept of minimum, which is ordinarily defined for sets of scalars, can be generalized to sets of vectors by using partial orders. A vector x* ∈ Ω ⊂ R^m is a minimal element of Ω if ∄ x ∈ Ω, x ≠ x*, such that B(x, x*), i.e., x* has no predecessor in Ω w.r.t. the partial order B(·, ·). Differently from the scalar case, a set Ω ⊂ R^m can have multiple different minimal elements w.r.t. a given partial order B(·, ·).

The optimization with multiple criteria deals with the problem of minimization of a vector of objective functions, and can be stated using any partial order for defining the instances of the objective vector which constitute minimal elements of the image of the feasible set. If the partial order is linear, the problem is called a vector optimization problem (VOP). In the particular case of a vector optimization problem in which the partial order is associated with the cone of the non-negative combinations of the basis vectors of the coordinate system (the columns of the identity matrix I), the problem becomes a Multiobjective Optimization Problem (MOP), which is studied in this paper.


In order to establish a compact notation for dealing with MOPs, the relational operators <, ≤ and ≠ are defined for vectors w, v ∈ R^m as:

w < v ⇔ wi < vi, ∀ i = 1, . . . , m
w ≤ v ⇔ wi ≤ vi, ∀ i = 1, . . . , m        (1)
w ≠ v ⇔ ∃ j such that wj ≠ vj

It should be noticed that the operator ≤ defined in this way is the linear partial order induced by the cone of positive combinations of the columns of I. The operator ≺ is defined, for the same vectors, as:

w ≺ v ⇔ (wi ≤ vi ∀ i = 1, . . . , m) and (∃ j ∈ {1, . . . , m} such that wj < vj)        (2)

The relation w ≺ v, which means that w is a predecessor of v in the partial order ≤, is read as w dominates v. The relation v ≻ w means the same as w ≺ v.
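In code, the dominance relation (2) reduces to componentwise comparisons; the following minimal Python/NumPy sketch (an illustration, not taken from the paper) follows the definition exactly:

```python
import numpy as np

def dominates(w, v):
    """w "dominates" v in the sense of (2): w_i <= v_i for every i,
    with strict inequality in at least one component."""
    w, v = np.asarray(w, float), np.asarray(v, float)
    return bool(np.all(w <= v) and np.any(w < v))

# (1, 2) dominates (1, 3); (1, 3) and (2, 2) are mutually non-dominated:
assert dominates([1, 2], [1, 3])
assert not dominates([1, 3], [2, 2]) and not dominates([2, 2], [1, 3])
```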

Consider the multiobjective optimization problem (MOP) defined by the minimization (w.r.t. the partial order ≤) of a vector of objective functions F(x) = (F1(x), F2(x), . . . , Fm(x)):

min F(x)
subject to x ∈ Ω        (3)

where Fi(x) : R^n ↦ R are differentiable functions, for i = 1, . . . , m, and Ω ⊂ R^n is the feasible set, defined by

Ω ≜ {x ∈ R^n | g(x) ≤ 0},        (4)

with g(·) : R^n ↦ R^p a vector of differentiable functions. Associated with the minimization of F(·), the efficient solution set Ω* is defined as:

Ω* ≜ {x* ∈ Ω | ∄ x ∈ Ω such that F(x) ≺ F(x*)}        (5)

The multiobjective optimization problem is defined as the problem of finding vectors x* ∈ Ω*. This set of solutions is also called the Pareto-set of the problem.

3 Problem statement

This paper is concerned with the problem of finding vectors that satisfy certain conditions for belonging to Ω*. The following matrices are defined:

H(x) = [∇F1(x) ∇F2(x) . . . ∇Fm(x)]
G(x) = [∇gJ(1)(x) ∇gJ(2)(x) . . . ∇gJ(r)(x)]        (6)
W(x) = [H(x) G(x)]

in which J denotes the set of indices of the active constraints, with r elements. Then gi(x) = 0 ⇔ i ∈ J.


The linearized feasible cone at a point x, denoted by G(x), is defined as:

G(x) ≜ {ω ∈ R^n | G′(x) · ω ≤ 0}        (7)

Given x ∈ Ω, a vector ω ∈ R^n is a tangent direction of Ω at x if there exists a sequence [xk]k ⊂ Ω and a scalar η > 0 such that:

lim_{k→∞} xk = x,  and  lim_{k→∞} η (xk − x)/‖xk − x‖ = ω        (8)

The set of all tangent directions is called the contingent cone of Ω at x, and is denoted by T(Ω, x). In this paper, the following constraint qualification (see reference [16]) is assumed to hold:

T(Ω, x) = G(x)        (9)

Theorem 1 Consider the multiobjective optimization problem defined by (3) and (4), and assume that the constraint qualification (9) holds. Under such assumption, a necessary condition for x* ∈ Ω* is that there exist vectors λ ∈ R^m and μ ∈ R^r, with λ ≻ 0 and μ ≥ 0, such that:

H(x*) · λ + G(x*) · μ = 0        (10)

This theorem is a matrix formulation of the Kuhn–Tucker necessary conditions for efficiency (KTE), which also become sufficient in the case of convex problems (see, for instance, [4]). The points x* which satisfy the conditions of Theorem 1 are said to be first-order Pareto-critical points. This paper is concerned with the search for such points.
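Condition (10) can be checked numerically as a linear feasibility problem: find λ ≥ 0 and μ ≥ 0 with Σi λi = 1 (a normalization that rules out λ = 0) such that Hλ + Gμ = 0. A minimal sketch, assuming H and G are the n × m and n × r gradient matrices of (6), using scipy.optimize.linprog (an illustration, not part of the paper):

```python
import numpy as np
from scipy.optimize import linprog

def is_pareto_critical(H, G):
    """Feasibility of (10): exist lambda >= 0, mu >= 0 with sum(lambda) = 1
    and H @ lam + G @ mu = 0. Feasible => first-order Pareto-critical."""
    n, m = H.shape
    r = G.shape[1]
    A = np.hstack([H, G])                    # unknowns z = (lambda, mu)
    A_eq = np.vstack([A, np.concatenate([np.ones(m), np.zeros(r)])])
    b_eq = np.concatenate([np.zeros(n), [1.0]])
    res = linprog(np.zeros(m + r), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (m + r), method="highs")
    return res.success
```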

4 Golden section multiobjective line search

Consider, in the first place, a version of problem (3) in which the feasible set Ω is contained in a line. This can be expressed by including the constraint

x = xk + αv        (11)

in the set of problem constraints. In this case, the parameter α becomes the line search optimization variable. Before presenting a formal solution for this problem, a graphical interpretation is sketched.

4.1 Multiobjective line search graphical interpretation

A two-variable bi-objective example is employed for illustrating the issues a line search procedure must tackle. Figure 1 shows, in R^2, the contour curves of two quadratic functions with positive-definite Hessian matrices. xk is the current point and v is the search direction vector. P1 and P3 are the constrained minima of F1 and F2, respectively, along the line xk + αv. The values of the objectives F1 and F2 over this line are shown in Fig. 2. The Pareto-front (the image of the Pareto-set in the space of objectives) of this line-constrained problem is shown in Fig. 3.


Fig. 1 A bi-objective problem where the point xk and the direction v are defined. The aim of this problem is to optimize α. (In-figure labels: xk, P1, P2, P3, PO front, search direction.)

Fig. 2 The values (F1(x), F2(x)), x ∈ (xk + αv)

Using the information presented in Fig. 3, the following considerations can be established. Firstly, all the points in the segment (xk, P1] dominate xk. Also, any point x ∈ [xk, P1) is dominated by at least one point in the segment [P1, P2].


Fig. 3 The image of the line x ∈ (xk + αv), in the objective space. The Pareto front constrained to this line is represented by the points on the path from P1 to P3

This means that, even though all points x ∈ (xk, P1] dominate xk, it is more interesting to find a solution x ∈ [P1, P2]. The solutions x ∈ (P2, P3] neither dominate nor are dominated by xk, although they belong to the one-dimensional constrained Pareto-set of the problem of minimizing (F1(x), F2(x)) subject to x ∈ (xk + αv).

It should be noticed that the Armijo-rule procedure, as presented in [13], would lead to any point in the segment (xk, P2]. The golden section line search procedure, as proposed here, is intended to deliver a point in the segment [P1, P2].

4.2 Some definitions

Given the vectors v and xk, consider the line segment parameterized in α, defined by:

{x ∈ Ω | x = xk + αv, 0 ≤ α ≤ 1}        (12)

With this parametrization, the vector function F(x) : R^n ↦ R^m is replaced by f(α) : R ↦ R^m:

f(α) = F(xk + αv)        (13)

The feasible set Ω constrained by (12) corresponds to the segment R defined by:

R ≜ {α | α ∈ [0, 1]}        (14)


The line search will be conducted within R. The multiobjective optimization problem constrained to this line becomes the problem of finding R* such that:

R* ≜ {α* ∈ R | ∄ α ∈ R such that f(α) ≺ f(α*)}        (15)

Define the subsets Rd ⊆ R and R*d ⊆ R* that contain the feasible solutions that dominate the point α = 0, respectively in the set R and in the set R*:

Rd ≜ {α ∈ R | f(α) ≺ f(0)}
R*d ≜ {α ∈ R* | f(α) ≺ f(0)}        (16)

Define, also, the closures of these sets, represented respectively by R̄d and R̄*d. The individual minima of the objective functions constrained to R are:

ρi = arg min_α fi(α)
subject to: α ∈ R        (17)

Assume, provisionally, that the functions fi are strictly unimodal over R. This means that fi has monotonic behavior for both α < ρi and α > ρi:

fi(α1) < fi(α2) ∀ α1, α2 ∈ (ρi, 1], α1 < α2
fi(α1) > fi(α2) ∀ α1, α2 ∈ [0, ρi), α1 < α2        (18)

Lemma 1 below characterizes the structure of the efficient solution set R*:

Lemma 1 Consider the multiobjective optimization problem defined by (15). Consider also the points ρi given by (17), and assume (provisionally) that the functions fi are unimodal over the line R, as stated in (18). Under these conditions, the efficient solution set R* is the smallest closed segment of R that contains the set {ρ1, ρ2, . . . , ρm}.

Proof Define ρa as the smallest value in {ρ1, ρ2, . . . , ρm}, and ρb as the largest one. The unimodality of the functions fi guarantees their monotonic increasing behavior in α ∈ (ρb, 1). This leads to the conclusion: f(ρb) ≺ f(α) ∀ α ∈ (ρb, 1]. The monotonic decreasing behavior of all functions fi in the segment α ∈ (0, ρa) similarly leads to f(ρa) ≺ f(α) ∀ α ∈ [0, ρa). Therefore, R* ⊂ [ρa, ρb]. Consider now any α ∈ [ρa, ρb]. All points over the segment [ρa, α) have a value of the function fb greater than fb(α), due to the monotonic decreasing behavior of fb in [ρa, ρb), which means that α is not dominated by any point in that interval. By the same reasoning, employing the monotonic increasing behavior of fa in the segment (α, ρb], it can be shown that α is not dominated by any point in this interval either. Therefore, R* = [ρa, ρb]. □
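Lemma 1 is easy to verify numerically; with the illustrative unimodal objectives f1(α) = (α − 0.3)² and f2(α) = (α − 0.7)² (not from the paper), the line-constrained efficient set is exactly [0.3, 0.7]:

```python
from scipy.optimize import minimize_scalar

# Illustrative unimodal objectives on R = [0, 1], minimized at 0.3 and 0.7:
rho = [minimize_scalar(lambda a, r=r: (a - r) ** 2,
                       bounds=(0, 1), method="bounded").x
       for r in (0.3, 0.7)]
# Lemma 1: R* is the smallest closed segment containing {rho_i} -> [0.3, 0.7].
print(min(rho), max(rho))    # ~0.3, ~0.7
```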

It should be noticed that quasi-convex functions, for instance, satisfy the assumption of unimodality over any line. Lemma 2 constitutes the basis for building a line search procedure for finding a point in the set R*:


Lemma 2 Consider α1, α2 ∈ R, such that 0 < α1 < α2 < 1. Then:

(i) f(α1) ≺ f(α2) ⇒ R* ∩ [α2, 1] = ∅
    f(α2) ≺ f(α1) ⇒ R* ∩ [0, α1] = ∅
(ii) f(α1) ⊀ f(α2) and f(α2) ⊀ f(α1) ⇒ R* ∩ [α1, α2] ≠ ∅

Proof
(i) This comes directly from Lemma 1.
(ii) Defining ρa and ρb as in the proof of Lemma 1, it comes that R* = [ρa, ρb]. Consider α1 > ρb. This means that f(α1) ≺ f(α2), due to the monotonically increasing behavior of all functions fi in the interval (ρb, 1). In such a case, [α1, α2] ∩ [ρa, ρb] = ∅. Adding the same reasoning for the case of α2 < ρa, the conclusion is that f(α1) ⊀ f(α2) and f(α2) ⊀ f(α1) imply that [α1, α2] ∩ [ρa, ρb] ≠ ∅ must occur, q.e.d. □

Lemma 3 presents a useful characterization of the set Rd. This characterization identifies its closure, the set R̄d.

Lemma 3 Let αd be the supremum of the set of the scalars α ∈ (0, 1] such that f(α) ≺ f(0). Then αd ≥ ρa and R̄d = [0, αd].

Proof Any point α ∈ (0, ρa] is such that f(α) ≺ f(0), due to the monotonic decreasing behavior of all functions fi in the interval (0, ρa). Therefore:

(i) (0, ρa] ⊂ Rd,

and the scalar αd must be such that αd ≥ ρa.

Define now a point of transition ᾱ as a point for which the following holds:

ᾱ ∈ [ρa, 1]
∃ ε > 0 such that: f(ᾱ − δ) ≺ f(0) ∀ δ ∈ (0, ε)
                   f(ᾱ + δ) ⊀ f(0) ∀ δ ∈ (0, ε)

Notice that at least one such ᾱ exists if αd < 1, since such αd satisfies these relations. Associated with any point of transition ᾱ, there is a set of indices I ⊂ {1, . . . , m} such that the functions fi, with i ∈ I, satisfy fi(ᾱ − δ) < fi(0) and fi(ᾱ + δ) > fi(0), for all δ ∈ (0, ε). The other functions fj, with j ∈ {1, . . . , m} and j ∉ I, are such that fj(ᾱ − δ) < fj(0) and fj(ᾱ + δ) ≤ fj(0). Therefore, the functions fi with i ∈ I are increasing at the point α = ᾱ. Let p ∈ I be such that ρp = max_{i∈I}(ρi). Since any function fi is monotonically increasing for α ∈ (ρi, 1) and monotonically decreasing for α ∈ (0, ρi), it can be concluded that, if ᾱ exists, it is such that ᾱ ∈ (ρp, 1). Therefore, fp(α) > fp(ᾱ) ≥ fp(0) for all α ∈ (ᾱ, 1]. This leads to the conclusion that there is no α ∈ (ᾱ, 1] such that f(α) ≺ f(0). Therefore, if ᾱ exists:

(ii) αd = ᾱ;
(iii) the same argument leads to the impossibility of the existence of another point of transition smaller than αd.

Notice that if αd = 1, the results above imply that there is no point of transition in the interval (0, αd]. Joining (i), (ii) and (iii), it becomes established that R̄d = [0, αd]. □

These definitions are illustrated, transposed back to the R^n space, in Figs. 1, 2 and 3. The line segment R represents the direction in which the search should be held. The segment R*, which is the segment [P1, P3] in this case, contains the efficient solutions of the multiobjective problem constrained to the segment R. The segment (xk, P2] contains all the points that dominate xk; thus, it is Rd. The segment R*d, in this case, becomes [P1, P2].

4.3 The multiobjective line search algorithm

Consider the points α = 0 and α = 1, such that R = [0, 1]. Consider also the points α = ρa and α = ρb, such that R* = [ρa, ρb], and the point α = αd, such that R̄d = [0, αd]. The combination of Lemmas 1 and 3 leads to the fact that these five points must lie in R either according to the ordering [0, ρa, αd, ρb, 1] (as in Figs. 1, 2 and 3) or according to the ordering [0, ρa, ρb, αd, 1]. Note that in the first case R̄*d = [ρa, αd] holds, while in the second case R̄*d = [ρa, ρb] holds.

The possible relative positions of two other points αA ∈ (0, 1) and αB ∈ (αA, 1) with respect to these five points are shown in Fig. 4. It is assumed here that αA, αB ∉ {ρa, ρb, αd}. The information available for use in a line search procedure is the set of evaluations of the vector function f(·) at the points α = 0, α = αA and α = αB. For the purpose of saving space, denote f0 = f(0), fA = f(αA) and fB = f(αB). The vector function C(·, ·, ·) : R^{m×3} ↦ {0, 1}^6 of the vector comparison results is defined as:

C(f0, fA, fB) = [C1(f0, fA, fB) . . . C6(f0, fA, fB)]
             = [(f0 ≻ fA) (f0 ≻ fB) (fA ≻ fB) (f0 ≺ fA) (f0 ≺ fB) (fA ≺ fB)]        (19)

with each Ci a binary number, where 1 means “true” and 0 means “false” for the result of each comparison. For instance, f0 ≻ fA makes C1 = 1; otherwise (f0 ⊁ fA), C1 = 0.

The application of Lemmas 2 and 3 to the instances listed in Fig. 4 leads to the truth-table for the vector function C(f0, fA, fB) that is shown in Table 1, where X means “sometimes true and sometimes false”. For instance, in situation II, sometimes fA ≻ fB and sometimes not. This table can be used for determining the possible relative dispositions of the points αA and αB in relation to the points ρa, ρb and αd, whose locations are not known. By matching the binary vector C with one or more rows of this table, the possible relative positions of the points are determined.

There are three possible operations for contracting the current trust-segment (i.e., the segment in which it is known that there exists some point belonging to R*) in a line search based on the function evaluations at the points α = 0, α = αA and α = αB. Let the current trust-segment be [α1, α2] ⊂ [0, 1]. These operations are named D1, D2 and D3, as indicated in Table 2.

Fig. 4 The 20 possible positions of the points αA and αB in relation to [0, ρa, αd, ρb, 1] (upper figure) and [0, ρa, ρb, αd, 1] (lower figure). Note that in the first case R̄*d = [ρa, αd], while in the second case R̄*d = [ρa, ρb] (which means that R̄*d is the second segment in both cases). Note also that there are only 14 distinct cases, since there are 6 cases that are equivalent in the upper and lower figures

In order to find some point inside the segment R̄*d, a maximal contraction of the trust-segment should be performed at each step of the line-search algorithm, while keeping a non-null intersection between the current trust-segment and R̄*d. In Table 3, each contraction operation is associated with the set of cases in which it can be applied without leading to the possibility of loss of the solution. The cases that are not listed as indicated for an operation should be interpreted as “forbidden cases” for that operation; that is, the application of the operation in such cases could lead to the loss of the solution, i.e., to discarding R̄*d.

If, at each moment, the information about the correct relative position of the points αA, αB, ρa, ρb and αd were available, the operations listed in Table 3 would indicate the maximal contraction policy for each case.


Table 1 The truth-table for comparison between f(0), f(αA) and f(αB), for the cases listed in Fig. 4. In the table, 1 means “true”, 0 means “false” and X means “sometimes true and sometimes false”. Note that f0 = f(0), fA = f(αA) and fB = f(αB)

Case    C1        C2        C3        C4        C5        C6
        f0 ≻ fA   f0 ≻ fB   fA ≻ fB   f0 ≺ fA   f0 ≺ fB   fA ≺ fB
I       1         1         1         0         0         0
II      1         1         X         0         0         0
III     1         0         X         0         0         0
IV      1         0         X         0         X         X
V       1         1         0         0         0         0
VI      1         0         0         0         0         0
VII     1         0         0         0         X         X
VIII    0         0         0         0         0         0
IX      0         0         0         0         X         X
X       0         0         0         X         X         1
XI      1         1         1         0         0         0
XII     1         1         X         0         0         0
XIII    1         1         X         0         0         X
XIV     1         0         X         0         X         X
XV      1         1         0         0         0         0
XVI     1         1         0         0         0         X
XVII    1         0         0         0         X         X
XVIII   1         1         0         0         0         1
XIX     1         0         0         0         X         1
XX      0         0         0         X         X         1

Table 2 The contraction operations on the trust segment [α1, α2] ⊂ [0, 1]

Contraction operation
D1   Discard [0, αA] ∩ [α1, α2]
D2   Discard [αA, αB] ∩ [α1, α2]
D3   Discard [αB, 1] ∩ [α1, α2]

Table 3 The operations that would perform the maximal contraction of the trust segment in each case, if the information for detecting the current case were available

Contraction operation   Cases
D1                      Cases I–VII and XI–XVII
D2                      All cases except III, IV and XIII, XIV
D3                      All cases except I and XI

However, as can be seen in Table 1, there are situations in which the information of the binary vector C (concerning the comparisons between fA, fB and f0) leads to undecidability between some cases of possible relative point positions. For instance, when C = [1 1 1 0 0 0], the relative positions can be either I, II, XI, XII or XIII. The contraction operation to be performed, therefore, must be compatible with all possible relative positions that can underlie the value of the binary vector C that has been obtained from the evaluation and comparison of fA, fB and f0.


Table 4 The decision table that turns on each contraction operation, given the comparison vector C (overbars denote logical complement)

Condition                               Operation
C1 · C̄6                                 D1
C̄1                                      D2
¬(C1 · C2 · C3 · C̄4 · C̄5 · C̄6)          D3

Table 5 Association of cases I–XX to the trust-segment contraction operations D1–D3, as obtained by the application of the decision table of Table 4

Case    Always on   Sometimes on
I       D1
II      D1          D3
III     D1, D3
IV      D3          D1
V       D1, D3
VI      D1, D3
VII     D3          D1
VIII    D2, D3
IX      D2, D3
X       D2, D3
XI      D1
XII     D1          D3
XIII    D3          D1
XIV     D3          D1
XV      D1, D3
XVI     D3          D1
XVII    D3          D1
XVIII   D3
XIX     D3
XX      D2, D3

This means that an operation that is incompatible with a case that is possible for the current C cannot be applied. For the case of C = [1 1 1 0 0 0], the operation D2 cannot be applied, since it is “forbidden” for case XIII, although it is “allowed” for the cases I, II, XI and XII.

A decision table that “turns on” each contraction operation is shown in Table 4. This decision table is built in order to maximize the matching of Table 3, using the vector C(f0, fA, fB) through the truth-table of Table 1, without “turning on” an operation in any “forbidden case”. This decision table has been synthesized in order to deal with the undecidability in C, choosing the maximal possible contraction that cannot cause the loss of the solution.

The association of cases I–XX to the operations D1–D3, obtained by the application of the decision table of Table 4, is shown in Table 5. Table 5 shows that, under the decisions prescribed by Table 4: (i) in all cases, at least one contraction operation is performed on the trust segment; (ii) in 8 cases, two contraction operations are always simultaneously applied; (iii) in 8 cases, two contraction operations are sometimes simultaneously applied, although only one contraction operation is applied at other times; (iv) in only 4 cases is there always only one contraction operation applied. The procedure describing these operations, E(·, ·, ·), is shown in Algorithm 1.

Algorithm 1 Segment elimination algorithm
Input: The set of objective function values f0, fA and fB.
Output: The binary numbers D1, D2 and D3.
1: function [D1, D2, D3] = E(f0, fA, fB)
2:   C1 ← (f0 ≻ fA)
3:   C2 ← (f0 ≻ fB)
4:   C3 ← (fA ≻ fB)
5:   C4 ← (f0 ≺ fA)
6:   C5 ← (f0 ≺ fB)
7:   C6 ← (fA ≺ fB)
8:   D1 ← (C1 · C̄6)
9:   D2 ← C̄1
10:  D3 ← ¬(C1 · C2 · C3 · C̄4 · C̄5 · C̄6)
11: end function
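Algorithm 1 can be transliterated directly into Python; the sketch below reuses the dominates predicate from the earlier sketch. The negation placement follows the reconstruction above (the extraction of the printed page loses the overbars): D3 is disabled exactly for the signature C = [1 1 1 0 0 0], the only one compatible with cases I and XI, for which D3 is forbidden.

```python
def segment_elimination(f0, fA, fB):
    """Algorithm 1: which contraction operations D1-D3 are safe, given the
    objective vectors at alpha = 0, alpha_A, alpha_B (cf. (19) and Table 4)."""
    C1 = dominates(fA, f0)   # f0 "succeeds" fA
    C2 = dominates(fB, f0)
    C3 = dominates(fB, fA)
    C4 = dominates(f0, fA)
    C5 = dominates(f0, fB)
    C6 = dominates(fA, fB)
    D1 = C1 and not C6
    D2 = not C1
    # D3 off only for the signature [1 1 1 0 0 0] of cases I and XI:
    D3 = not (C1 and C2 and C3 and not C4 and not C5 and not C6)
    return D1, D2, D3
```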

This section finishes with the Golden Section Multiobjective Line Search, Algorithm 2, which is intended to find a point α* ∈ R̄*d. Of course, a Fibonacci search scheme could be employed instead of the golden section one. The constant γ = 0.618, which is the inverse of the golden ratio, is employed here for contracting the trust segment.

For the purpose of making reference to Algorithm 2, define the function L(·, ·):

α* = L(f(·), ε)        (20)

in which f(·) is the function to be minimized and ε is the tolerance on the result precision.

Theorem 2 The algorithm output α* = L(f(·), ε) is such that [α* − ε/2, α* + ε/2] ∩ R* ≠ ∅, for a problem defined by (15). The algorithm stops in a number N of iterations that is bounded by N ≤ log ε / log 0.618.

Proof The proof of this theorem is a straightforward result of Lemmas 1, 2 and 3, the fact that at least one segment (usually more than one) is discarded at each iteration, and the fact that the compression rate is 0.618 when only one segment is discarded. □

Notice that the algorithm contracts the segment faster than the single-objective golden section method: in the steps in which two segments are discarded (see Table 5), only the middle segment [αA, αB] survives, so the contraction factor becomes 2γ − 1 = 0.236 at the cost of two function evaluations, instead of the factor 0.618² = 0.382 that two function evaluations buy in the conventional single-objective golden section algorithm.


Algorithm 2 Golden Section multiobjective line search
Input: f(·) : R ↦ R^m, ε > 0
Output: α*.
1: function α* = L(f(·), ε)
2:   γ ← 0.618
3:   α1 ← 0
4:   α2 ← 1
5:   αA ← α2 + γ(α1 − α2)
6:   αB ← α1 + γ(α2 − α1)
7:   f0 ← f(0)
8:   fA ← f(αA)
9:   fB ← f(αB)
10:  while |αA − αB| > ε do
11:    [D1, D2, D3] ← E(f0, fA, fB)
12:    if (D2 · D3) then
13:      α2 ← αA
14:      αA ← α2 + γ(α1 − α2)
15:      αB ← α1 + γ(α2 − α1)
16:      fA ← f(αA)
17:      fB ← f(αB)
18:    else if (D1 · D3) then
19:      α2 ← αB
20:      α1 ← αA
21:      αA ← α2 + γ(α1 − α2)
22:      αB ← α1 + γ(α2 − α1)
23:      fA ← f(αA)
24:      fB ← f(αB)
25:    else if D1 then
26:      α1 ← αA
27:      αA ← αB
28:      αB ← α1 + γ(α2 − α1)
29:      fA ← fB
30:      fB ← f(αB)
31:    else if D3 then
32:      α2 ← αB
33:      αB ← αA
34:      αA ← α2 + γ(α1 − α2)
35:      fB ← fA
36:      fA ← f(αA)
37:    end if
38:  end while
39:  α* ← (αA + αB)/2
40: end function
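Algorithm 2 then fits in a few lines; the Python sketch below (illustrative, reusing segment_elimination from above) returns the midpoint of the final trust-segment:

```python
def mo_golden_section(f, eps=1e-3, gamma=0.618):
    """Algorithm 2: golden section multiobjective line search over [0, 1];
    f maps a scalar alpha to the vector f(alpha) of objective values."""
    a1, a2 = 0.0, 1.0
    aA = a2 + gamma * (a1 - a2)            # lower interior point, 0.382
    aB = a1 + gamma * (a2 - a1)            # upper interior point, 0.618
    f0, fA, fB = f(0.0), f(aA), f(aB)
    while abs(aA - aB) > eps:
        D1, D2, D3 = segment_elimination(f0, fA, fB)
        if D2 and D3:                      # keep [a1, aA]
            a2 = aA
            aA = a2 + gamma * (a1 - a2)
            aB = a1 + gamma * (a2 - a1)
            fA, fB = f(aA), f(aB)
        elif D1 and D3:                    # keep [aA, aB]
            a1, a2 = aA, aB
            aA = a2 + gamma * (a1 - a2)
            aB = a1 + gamma * (a2 - a1)
            fA, fB = f(aA), f(aB)
        elif D1:                           # keep [aA, a2]; reuse aB as new aA
            a1, aA, fA = aA, aB, fB
            aB = a1 + gamma * (a2 - a1)
            fB = f(aB)
        else:                              # D3 alone: keep [a1, aB]
            a2, aB, fB = aB, aA, fA
            aA = a2 + gamma * (a1 - a2)
            fA = f(aA)
    return 0.5 * (aA + aB)
```

Note how the two single-evaluation branches reuse one interior point, exactly as lines 25–37 of Algorithm 2 do.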

NOTE 1 All the developments presented in this section have assumed that a feasible end point (corresponding to α = 1) is readily available. If this is not the case, the issue of feasibility must be treated by the line search algorithm. This can be performed, assuming the convexity of the feasible set Ω, by simply adding the following contraction operations:

• αA infeasible: operations D2 and D3.
• αB infeasible: operation D3.


NOTE 2 It is straightforward to conclude that, in the case of multimodal functions, the proposed line search algorithm may terminate at a point that is not locally Pareto-critical w.r.t. R. However, the algorithm output point still dominates the current point, and the algorithm still terminates within a finite number of steps. This is sufficient for establishing the conditions for the convergence of the multiobjective optimization algorithm, presented in the next sections, to a first-order Pareto-critical point.

5 The descent directions

In reference [13], a descent direction has been stated as the solution of the linear optimization problem:

v* = arg min_{v,β} β
subject to: (W′v)i ≤ β, i = {1, . . . , m + r}
            ‖v‖∞ ≤ 1        (21)

in which the matrix W is defined by equation (6).

In this paper, a slight improvement of such descent direction is obtained by imposing that v = Wθ, with θ ∈ R^{m+r} a vector of non-positive components. The vector θ* is calculated by the auxiliary linear optimization problem:

θ* = arg min_{θ,β} β
subject to: (W′Wθ)i ≤ β, i = {1, . . . , m + r}
            ‖Wθ‖∞ ≤ 1
            θi ≤ 0        (22)

The descent direction v is given by:

v = Wθ*        (23)

The advantage of formulation (22) is that the descent direction v becomes constrained to be a convex combination of the columns of the matrix composed of the negative gradient directions of the objective functions and of the negative gradient directions of the active constraint functions. Such a direction is likely to present a stronger descent behavior than a descent direction that does not fulfill this condition.
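Problem (22) is a small linear program in (θ, β), with ‖Wθ‖∞ ≤ 1 rewritten as the pair of linear constraints ±(Wθ)i ≤ 1. A minimal sketch using scipy.optimize.linprog follows (an illustration, not the authors' code; the criticality test on β is an added convenience):

```python
import numpy as np
from scipy.optimize import linprog

def descent_direction(W):
    """Solve (22): min beta s.t. (W'W theta)_i <= beta, ||W theta||_inf <= 1,
    theta <= 0; return v = W theta* as in (23)."""
    n, k = W.shape                          # k = m + r gradient columns
    WtW = W.T @ W
    c = np.zeros(k + 1); c[-1] = 1.0        # decision vector z = (theta, beta)
    A_ub = np.vstack([
        np.hstack([WtW, -np.ones((k, 1))]),     # W'W theta - beta <= 0
        np.hstack([W, np.zeros((n, 1))]),       #  (W theta)_i <= 1
        np.hstack([-W, np.zeros((n, 1))]),      # -(W theta)_i <= 1
    ])
    b_ub = np.concatenate([np.zeros(k), np.ones(2 * n)])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, 0)] * k + [(None, None)], method="highs")
    theta, beta = res.x[:k], res.x[-1]
    if beta >= -1e-12:                      # no strict descent direction
        return np.zeros(n)
    return W @ theta
```

If the optimal β is (numerically) zero, no direction achieves (W′v)i < 0 for all i, which is the situation of a first-order Pareto-critical point; returning the zero vector signals this to the caller.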

Since every solution of (22) is also a solution of (21), the only additional requirement for using (22) instead of (21) concerns the existence of solutions of (22) under the same conditions that lead to the existence of solutions of (21). This result is stated in the following lemma.

Lemma 4 The inequality

(W′v)i ≤ 0, i = {1, . . . , m + r}

has a solution v if and only if the inequality

(W′Wθ)i ≤ 0, i = {1, . . . , m + r}

has a solution θ with θi ≤ 0.

Proof The proof is a direct application of Farkas' lemma. □

Concerning the generation of a descent direction v, it should be noticed that:

• Any direction v satisfying (W′v)i < 0, i = {1, . . . , m + r}, is a descent direction. In a single-objective unconstrained problem, this condition would be satisfied by any vector v whose projection over the function gradient is negative.
• The additional condition v = Wθ, θi ≤ 0, is analogous to a “steepest descent” condition. In a single-objective unconstrained problem, this condition would be satisfied only by v in the direction opposite to the gradient.
• Instead of using the directions given by the optimization problems (21) or (22), it is also possible to perform a randomized search, using a random v that satisfies both (W′v)i < 0, i = {1, . . . , m + r}, and v = Wθ, θi ≤ 0.
• It should be noticed that such a random search should not be performed without the “steepest descent” condition v = Wθ, θi ≤ 0, because the convergence rate could become degraded. This degradation does not occur in [13] because the minimization of β in the condition (W′v)i ≤ β, i = {1, . . . , m + r}, nearly induces a “steepest descent” property in the resulting v.
• A steered search can be performed if the automatic choice of the descent direction, performed via (21) or (22), or via a randomized procedure as discussed above, is replaced by an interactive choice involving a query to a decision-maker. For instance, the decision-maker could be asked to choose θ such that v = Wθ, θi ≤ 0, and (W′v)i < 0, i = {1, . . . , m + r}, with an instruction to assign larger values of |θi| to the objectives which should have priority.

6 The dominating cone line search method

With the descent direction v given by (23), and with the multiobjective line search algorithm L(f(·), ε), a line-search-based procedure for determining first-order Pareto-critical points can be stated. Another version of the algorithm can be built using the descent direction given by (21). An iteration of such a procedure is depicted in Algorithm 3.

Define the function D(·, ·, ·) such that:

xk+1 = D(F(·), xk, ε)        (24)

in which F(·) : R^n ↦ R^m is the vector function, xk ∈ R^n is the output of the k-th iteration that becomes the initial point of the (k + 1)-th iteration, ε is the tolerance of the line search procedure, and xk+1 ∈ R^n is the output of the (k + 1)-th iteration.

This algorithm obeys the same convergence properties as the one presented in [13], as stated in the following theorem.


Algorithm 3 Iteration of the dominating cone line search method (DCLS)
Input: F(·) : R^n ↦ R^m, xk ∈ Ω, and ε ∈ R+
Output: xk+1
1: function xk+1 = D(F(·), xk, ε)
2:   Compute W using (6)
3:   Compute v using either (23) or (21)
4:   f(α) ← F(xk + αv)
5:   α*k ← L(f(·), ε)
6:   xk+1 ← xk + α*k v
7: end function

Theorem 3 Every accumulation point of the sequence [xk]∞k=0 generated by xk+1 = D(F(·), xk, ε), with the search direction provided by (21), is a Pareto-critical point. If the function F(·) has bounded level sets, in the sense that {x ∈ R^n | F(x) ≤ F(x0)} is bounded, then the sequence [xk]∞k=0 stays bounded and has at least one accumulation point.

Proof For a descent direction v given by (21), there exists a β(x, v) > 0 such that

F(x + αv) ≤ F(x) + βα H′(x)v        (25)

holds for all (x + αv) ∈ R*d, provided that x ∉ R*, with the following strict equality valid for a point (x + αv) in the relative boundary of R*d (denoted by ∂r(R*d)):

∃ α | (x + αv) ∈ ∂r(R*d),
Fi(x + αv) = Fi(x) + βα (H′(x))i v,
Fj(x + αv) ≤ Fj(x) + βα (H′(x))j v        (26)

for some i ∈ {1, . . . , m} and all j ∈ {1, . . . , m}. As long as (x + αv) ∈ R*d is valid at the end point of the golden section multiobjective line search, defining β according to (26) allows one to state that (25) is a necessary condition for the acceptance of the step length α in the golden section multiobjective line search rule. In our case, the value of α which performs the transition from xk to xk+1 after q golden section multiobjective line search steps is bounded by:

α ≥ 0.382^q / 2

in which q ∈ N is the index of the golden section line search iteration. After replacing the Armijo acceptance condition F(x + αv) < F(x) + βα H′(x)v by (25), and stating a q that does not reach the acceptance condition for 0.382^q/2 > α instead of 1/2^q > α, a straightforward adaptation of the convergence proof of [13] can be performed for this theorem. □

NOTE 3 This proof holds for the version of the algorithm with the descent direction given by (21). An essentially similar proof can be stated for the case of a descent direction given by (23).


NOTE 4 Algorithm 3 has been presented in an unconstrained version, for simplicity. The adaptation to the constrained case is straightforward.

7 Results

7.1 Simple problem

Firstly, a simple problem is used for the purpose of illustrating the algorithm's performance. The comparison is performed with two other methods: a simple weighted-sum scalarization and the steepest descent method proposed in [13].

Consider the bi-objective problem:

min_x F(x)        (27)

with x ∈ R^2 and

F(x) = [F1(x), F2(x)]′
F1(x) = (x − c1)′ Q1 (x − c1)        (28)
F2(x) = 1 − exp(−(x − c2)′ Q2 (x − c2))

Q1 = [1 0; 0 2],   Q2 = [5 1; 1 3],   c1 = [5; 0],   c2 = [5; 3]

Note that function F1(·) is convex, and function F2(·) is quasi-convex. This problem was solved by the proposed dominating cone line search method (DCLS) with descent direction (23), by the method proposed in [13] with β = 0.5 and β = 1.0 (denoted respectively by FS0.5 and FS1.0), and also by the weighting method of scalarization, with the scalarized function Fw(·) : R^2 ↦ R defined by:

Fw(x) = w F1(x) + (1 − w) F2(x), 0 < w < 1        (29)

and solved by the BFGS quasi-Newton method² using a golden section line search [23], denoted here by W-BFGS. A total of 200 executions of each method was computed, with random initial points (every initial point was used for each method) and a random weighting w in the interval 0 < w < 1 for W-BFGS. The average number of gradient evaluations and function evaluations needed for finding each solution, in these tests, is presented in Table 6.

2 The acronym BFGS comes from the initials of the authors: Broyden–Fletcher–Goldfarb–Shanno.


Table 6 Mean number of gradient evaluations and function evaluations needed for finding one efficient solution of problem (27)–(28) using DCLS, FS0.5, FS1.0 and W-BFGS. The average is calculated over 200 runs, with random initial points (the same points for all methods) and random weight w in W-BFGS

Method   Gradient evaluations   Function evaluations in line search   Total function evaluations
DCLS     19.52                  468.48                                507.52
W-BFGS   28.33                  963.22                                1019.88
FS0.5    51.94                  470.76                                574.64
FS1.0    42.83                  470.76                                556.42


These results show that all steepest descent algorithms with multiobjective line searches are much faster than W-BFGS. The DCLS algorithm has taken only 2/3 of the gradient evaluations and 1/2 of the function evaluations that were spent by W-BFGS. The two versions of FS have also presented similar merit figures, compared with W-BFGS. This phenomenon is mainly due to the multiobjective line search procedures (either golden section or Armijo's rule), which present intrinsically higher convergence rates than the single-objective ones, as discussed previously in this paper.

For the comparison of DCLS with FS, it should be noticed first that the FS versions with β = 0.5 and 1.0 have presented very similar behaviors in both the number of gradient evaluations and the number of function evaluations spent in line search operations. Comparing DCLS with FS, all algorithms have spent a similar number of function evaluations in the line search operations, but DCLS has performed less than 1/2 of the number of gradient evaluations performed by FS. This means that: (i) DCLS has performed a number of line searches that is about 1/2 of the number of line searches performed by FS; and (ii) in each line search, DCLS has spent twice the number of function evaluations spent by FS for the stop condition to be reached. As each gradient evaluation spends n additional function evaluations (within a finite-difference gradient computation), the overall number of function evaluations spent by DCLS is more than 10% smaller than the overall number spent by FS in this problem. These results indicate that the enhanced precision associated with the golden section line search (employed by DCLS), compared with the Armijo-rule line search (employed by FS), has allowed a smaller number of line searches to be performed up to the final algorithm convergence. This enhanced precision, however, is associated with a higher function evaluation cost, and the total account, in the case of this problem, although slightly favourable to DCLS, indicates an equilibrium between this algorithm and FS.

7.2 Higher dimension problem

These results suggest a further investigation of the effect of the problem dimension on those merit figures, since it is expected that higher dimensions will require more line searches up to the algorithm's final convergence, but not more function evaluations per line search.


Fig. 5 The average overall number of function evaluations spent by DCLS and by FS0.5 up to the convergence to a Pareto-critical point versus the dimension n of problem (30). (Curves: golden section and Armijo's rule; axes: problem dimension vs. number of function calls.)

A series of tests has been conducted for functions of the form:

min_x F(x)        (30)

with x ∈ R^n, n = {10, 20, 30, . . . , 100}, and

F(x) = [F1(x), F2(x)]′
F1(x) = (x − c1)′ Q1 (x − c1)
F2(x) = (x − c2)′ Q2 (x − c2)
c1 = [1  1+δ  1+2δ  . . .  10]′        (31)
c2 = [10  10−δ  10−2δ  . . .  1]′
Q1 = diag(c2)
Q2 = diag(c1)
δ = 9/(n − 1)
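The family (30)–(31) can be generated for any dimension n under the same conventions (a sketch, reusing the NumPy import from before):

```python
def make_problem(n):
    """Objective pair of problem (30)-(31) for dimension n."""
    delta = 9.0 / (n - 1)
    c1 = 1.0 + delta * np.arange(n)      # [1, 1 + delta, ..., 10]
    c2 = 10.0 - delta * np.arange(n)     # [10, 10 - delta, ..., 1]
    Q1, Q2 = np.diag(c2), np.diag(c1)
    def F(x):
        return np.array([(x - c1) @ Q1 @ (x - c1),
                         (x - c2) @ Q2 @ (x - c2)])
    return F
```

For instance, F = make_problem(50) builds the n = 50 instance.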

The tests have been performed for DCLS and for FS0.5. The resulting average overall number of function evaluations for reaching a Pareto-critical point, starting from random initial points (the same for both methods), for each method and each problem dimension, is represented in Fig. 5. The golden section line search of DCLS provides more precise and more costly line searches, compared with the Armijo-rule line search of FS0.5. The results presented in Fig. 5 show that, as the problem dimension increases, the balance between these two conflicting effects is favorable to DCLS, compared to FS0.5, from the viewpoint of the overall number of function evaluations required for convergence to a Pareto-critical point. The advantage of DCLS grows more than linearly with the problem dimension.


Fig. 6 The average overall number of function evaluations spent up to the convergence to a Pareto-critical point versus the dimension n of problem (30), using the original DCLS algorithm with descent direction (23) and using the DCLS algorithm modified with the descent direction (21). (Axes: problem dimension vs. number of function calls.)

Fig. 7 A 6-element Yagi-Uda antenna configuration

7.3 Evaluating the search direction

Another experiment has been conducted in order to evaluate the effect of employing the descent direction proposed in this paper, given by equation (23), versus employing the descent direction proposed in [13], given by equation (21). The experiment employs the same set of functions (30) and (31) used in the former experiment. Now, only the multiobjective golden section line search is employed, and the only difference between the algorithms is in the descent direction. The results are presented in Fig. 6.

It can be noticed that the number of function calls of the DCLS algorithm when the descent direction (23) is employed becomes 10 to 20% lower than the number of function calls when the descent direction (21) is employed. This experiment supports the conjecture that direction (23) is better than direction (21), although, as expected, the difference is not that remarkable.

7.4 Yagi-Uda antenna design

For the purpose of validating the DCLS algorithm on a hard, high-dimensional problem, it was applied to the design of a 6-element wire Yagi-Uda antenna, illustrated in Fig. 7. The formulation of this problem, which was described in detail in [26], is as follows.


The element centered at the origin is the reflector, followed by the center-fed driven element and the 4 directors. The distances d between consecutive elements (5 different distances) and the lengths L of each element are the parameters to be optimized (11 optimization parameters in total). The cross-section radius a is the same for all elements and is set equal to 0.003377 wavelengths at 859 MHz. The computational simulation of the antenna behavior follows the formulation described in [26].

The objective functions are set upon the antenna specifications, aiming at the maximization of the directivity, the front-to-back ratio, and the impedance matching, over three different frequencies through the antenna bandwidth, resulting in nine different objectives. The design specifications upon the antenna radiation pattern are the highest possible directivity and front-to-back ratio. The impedance matching is attained by requesting an input resistance close to 50 Ω (or a voltage standing wave ratio close to 1). Such requirements are imposed for three different frequencies (the lower, middle, and upper frequencies) over a 3.5% bandwidth centered at 859 MHz (828.935–889.065 MHz). All dimensions are given in wavelengths (λ) at 859 MHz. It should be noted that the objective functions, in this problem, are not guaranteed to be quasi-convex and differentiable.
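A schematic of how such a nine-objective vector may be assembled, with every criterion cast as minimization; simulate and nine_objectives are hypothetical names, the sign convention is an assumption, and the actual solver of [26] must be plugged in:

```python
import numpy as np

def simulate(x, f):
    """Hypothetical stand-in for the antenna solver of [26]: should return
    (directivity_dB, front_to_back_dB, vswr) for geometry x at frequency f."""
    raise NotImplementedError("plug a method-of-moments solver in here")

def nine_objectives(x, freqs=(828.935e6, 859.0e6, 889.065e6)):
    """Three criteria at the three frequencies, all cast as minimization."""
    objs = []
    for f in freqs:
        do, fb, vswr = simulate(x, f)
        objs += [-do, -fb, vswr]  # maximize Do and FB, drive VSWR toward 1
    return np.array(objs)
```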

The results of one typical run of the DCLS algorithm on this problem are presented in Tables 8 and 9 and in Fig. 8. DCLS was initialized, in this case, at the center point of the box constraints of the problem, shown in Table 7, and the optimization has taken 39 algorithm iterations, with 706 function evaluations.

The monotonic behavior of the nine objective functions is illustrated in Fig. 8. Only the first 40 iterations are shown (after this point, the objectives become almost constant).

An entire set of efficient solutions has been generated by DCLS, by starting the algorithm at several randomly generated initial points. The complete results of the application of DCLS to this problem have been published by the authors in [21], with more details and a careful physical analysis. To the best of the authors' knowledge, there is no report in the literature of any Yagi-Uda antenna with such performance. For instance, antennas with similar voltage standing wave ratio (VSWR) and input impedances (Zin) usually attain a directivity of less than 10 dB and a front-to-back ratio of less than 16 dB; see [26] and references therein.

The objective gradients have been calculated using the finite difference method. The average number of function evaluations for designing one antenna was of the order of 2,000. However, a much smaller number could be reached if a more relaxed stop condition were applied.

This amount of computational effort is very competitive for this class of problems. For instance, similar results have been presented by the NASA Ames Research Center using a dedicated Genetic Algorithm, which required 600,000 evaluations per designed antenna [22].

7.5 Shape optimization of broad-band reflector antennas

Another real-world engineering problem solved using the proposed algorithm is the shape optimization of broad-band reflector antennas [20]. As the main objective, a specified region Ω must be illuminated uniformly. To accomplish this, a set of n_s sample points, named P = {p_1, . . . , p_{n_s}}, are spread over Ω, where the gain G(p), ∀p ∈ P, is evaluated in relation to an isotropic radiator. This problem was also evaluated at three different frequencies, to give broadband characteristics to the antenna (Table 10).


Fig. 8 Evolution of antenna directivity (Do), antenna front-to-back ratio (FB) and antenna voltage standing wave ratio (VSWR), at the frequencies 828.935 MHz (f1), 859 MHz (f2) and 889.065 MHz (f3)

[Figure: three panels plotting Do, FB and VSWR against the iteration number (0 to 40), with one curve per frequency f1, f2, f3.]



Table 7 Initial antenna geometry (the starting point of DCLS). The dimensions are given in wavelengths (λ) at 859 MHz

L_p                 d_{p−1,p}
0.50000000000000    —
0.50000000000000    0.32375000000000
0.32375000000000    0.32375000000000
0.32375000000000    0.32375000000000
0.32375000000000    0.32375000000000
0.32375000000000    0.32375000000000

Table 8 Geometry of one antenna obtained by DCLS, after 39 algorithm iterations and 706 function evaluations. The dimensions are given in wavelengths (λ) at 859 MHz

L_p                 d_{p−1,p}
0.50610055719041    —
0.44931282950414    0.25261747043376
0.38538718601364    0.25749579953324
0.39882084142996    0.37863536374455
0.38238790972033    0.38214346551819
0.38405677554186    0.33115644157639

Table 9 Electrical characteristics of the antenna of Table 8

Freq. (MHz)   Do (dB)      FB (dB)      VSWR
828.935       9.715465     15.450613    1.335141
859           10.356968    18.961185    1.099307
889.065       10.918225    16.028036    1.798306

The variables shown in the table are: antenna directivity (Do), front-to-back ratio (FB), and voltage standing wave ratio (VSWR)

Table 10 Gain distribution (mean μ and standard deviation σ at the matching frequencies) considering the illumination of the Brazilian territory

Objective    Initial   Optimized
μ_L (dBi)    17.46     31.77
μ_C (dBi)    17.07     31.82
μ_U (dBi)    16.58     31.82
σ_L (dB)     12.36     1.93
σ_C (dB)     11.96     2.00
σ_U (dB)     11.64     2.10

The result when a standard parabolic antenna is used is presented in Fig. 9(a), and the optimized one in Fig. 9(b), which shows a pattern of illumination much closer to the shape of the Brazilian territory. Further details can be found in a paper by the authors [20].


[Figure: two CO-POL radiation-pattern contour plots (dBi), AZ (º) versus EL (º), panels (a) and (b).]

Fig. 9 Optimal radiation patterns (dBi) for the coverage of the Brazilian territory. In (a), the radiation pattern when the classical parabolic format is used. The algorithm has asymptotically converged to (b), which is a better pattern for the coverage of the Brazilian territory

7.6 Remarks on the simulation results

The second and the third examples presented here are real-world black-box engineering problems, one with 11 variables and 9 objectives and the other with 6 objectives and 38 variables. The gradients in both problems were extracted using the finite difference method. The exact natures of these problems are unknown; however, it is believed that they are multimodal and differentiable problems. The results obtained and published using the algorithm presented in this paper have outperformed all the results for similar problems that appear in the current literature, in terms of the number of function evaluations and design quality. Often, in engineering problems, a sub-optimal solution is available, such as the classical paraboloid solution in the third example. Therefore, an algorithm capable of a fast local search, with the aim of enhancing such a solution, is very useful for this type of design.

8 Final remarks

A new algorithm for generating first-order Pareto-critical solutions in multiobjective optimization problems (called DCLS) has been proposed here, as a reformulation of a former algorithm presented in [13].

DCLS has two basic functional blocks: (i) the computation of a direction in which there are points that dominate the current one; and (ii) the computation of a step-size that leads to a line-constrained efficient point. The iterative application of these steps ultimately leads to a Pareto-critical point, under an assumption of bounded level sets of the objective functions in the feasible set. The algorithm terminates when step (i) cannot be executed, which means that the first-order Kuhn–Tucker conditions for efficiency hold at the current point. This basic structure is the same as that employed in [13].

The main differences of DCLS in relation to the algorithm presented in [13] are: (i) the line search is performed here with a new multiobjective golden section procedure, which takes advantage of the structure of the Pareto set constrained to a line in order to further reduce the required number of function evaluations, when compared with the single-variable golden section procedure; and (ii) the descent direction is constrained, here, to belong to the set of convex combinations of the negative function gradients and negative active constraint gradients, which enhances the descent property of the chosen direction; a minimal sketch of this kind of direction computation is given below.
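To fix ideas about item (ii), here is a minimal sketch, in Python/SciPy, of a direction taken from the convex hull of the negative objective gradients at an unconstrained point; it illustrates the general construction, not the exact equation (23), and it ignores active constraint gradients:

```python
import numpy as np
from scipy.optimize import minimize

def min_norm_direction(J):
    """Direction from a convex combination of the negative gradients.

    J is the m-by-n Jacobian, one objective gradient per row. The convex
    weights lam minimizing ||J' lam|| yield d = -J' lam, a descent
    direction for every objective whenever the minimal norm is nonzero;
    it vanishes (up to tolerance) at a Pareto-critical point.
    """
    m = J.shape[0]
    res = minimize(lambda lam: float(np.sum((J.T @ lam) ** 2)),
                   np.full(m, 1.0 / m), method="SLSQP",
                   bounds=[(0.0, 1.0)] * m,
                   constraints=[{"type": "eq",
                                 "fun": lambda lam: lam.sum() - 1.0}])
    return -J.T @ res.x
```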

The performance of DCLS has been tested in the design of electromagnetic devices. The difficulties in solving this type of problem have been, in fact, the primary motivation to develop the algorithm presented here. Both problems present high sensitivity, which makes the convergence rate of other algorithms very slow, as discussed in [21] and [20]. The tests presented here suggest that DCLS can be the basis for building powerful engineering design tools. The proposed algorithm is currently being tested on other real-world problems.

Acknowledgments The authors acknowledge the support by the Brazilian agencies CAPES, CNPq and FAPEMIG.

References

1. Benson, H.P.: Existence of efficient solutions for vector maximization problems. J. Optim. Theory Appl. 26(4), 569–580 (1978)
2. Bosman, P.A.N., de Jong, E.D.: Exploiting gradient information in numerical multi-objective evolutionary optimization. In: Proceedings of the 2005 Genetic and Evolutionary Computation Conference (GECCO'05), pp. 755–762. ACM, Washington, June (2005)
3. Chankong, V., Haimes, Y.Y.: On the characterization of noninferior solutions of the vector optimization problem. Automatica 18(6), 697–707 (1982)
4. Chankong, V., Haimes, Y.Y.: Multiobjective Decision Making: Theory and Methodology. Elsevier, Amsterdam (1983)
5. Coello Coello, C.A., Van Veldhuizen, D.A., Lamont, G.B.: Evolutionary Algorithms for Solving Multi-objective Problems. Kluwer Academic Publishers, Dordrecht (2001)
6. Deb, K.: Multi-objective Optimization Using Evolutionary Algorithms. Wiley, London (2001)
7. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
8. Dellnitz, M., Schütze, O., Hestermeyer, T.: Covering Pareto sets by multilevel subdivision techniques. J. Optim. Theory Appl. 124(1), 113–136 (2005)
9. Ehrgott, M.: Multicriteria Optimization. Lecture Notes in Economics and Mathematical Systems, vol. 491. Springer, Berlin (2000)
10. Engau, A., Wiecek, M.M.: Cone characterizations of approximate solutions in real vector optimization. J. Optim. Theory Appl. 134(3), 499–513 (2007)
11. Fliege, J.: Gap-free computation of Pareto-points by quadratic scalarizations. Math. Methods Oper. Res. 59, 69–89 (2004)
12. Fliege, J., Grana-Drummond, L.M., Svaiter, B.F.: Newton's method for multiobjective optimization. SIAM J. Optim. 20(2), 602–626 (2009)
13. Fliege, J., Svaiter, B.F.: Steepest descent methods for multicriteria optimization. Math. Methods Oper. Res. 51, 479–494 (2000)
14. Fonseca, C.M., Fleming, P.J.: Genetic algorithms for multiobjective optimization: formulation, discussion and generalization. In: Proceedings of the 5th International Conference on Genetic Algorithms, pp. 416–427. San Mateo (1993)
15. Fonseca, C.M., Fleming, P.J.: An overview of evolutionary algorithms in multiobjective optimization. Evol. Comput. 7(3), 205–230 (1995)
16. Gould, F.J., Tolle, J.W.: A necessary and sufficient qualification for constrained optimization. SIAM J. Appl. Math. 20(2), 164–172 (1971)
17. Hillermeier, C.: Generalized homotopy approach to multiobjective optimization. J. Optim. Theory Appl. 110(3), 557–583 (2001)
18. Jeyakumar, V., Luc, D.T.: Nonsmooth Vector Functions and Continuous Optimization. Springer, Berlin (2008)
19. Klamroth, K., Tind, J., Wiecek, M.M.: Unbiased approximation in multicriteria optimization. Math. Methods Oper. Res. 56, 413–437 (2002)
20. Lisboa, A.C., Vieira, D.A.G., Vasconcelos, J.A., Saldanha, R.R., Takahashi, R.H.C.: Multi-objective shape optimization of broad-band reflector antennas using the cone of efficient directions algorithm. IEEE Trans. Magn., pp. 1223–1226 (2006)
21. Lisboa, A.C., Vieira, D.A.G., Vasconcelos, J.A., Saldanha, R.R., Takahashi, R.H.C.: Monotonically improving Yagi-Uda conflicting specifications using the dominating cone line search method. IEEE Trans. Magn. 45(3), 1494–1497 (2009)
22. Lohn, J.D., Kraus, W.F., Colombano, S.P.: Evolutionary optimization of Yagi-Uda antennas. In: Proceedings of the Fourth International Conference on Evolvable Systems, pp. 236–243 (2001)
23. Luenberger, D.G.: Linear and Nonlinear Programming. Addison-Wesley, Reading (1984)
24. Pareto, V.: Manual of Political Economy. Augustus M. Kelley, New York (1906). (1971 translation of the 1927 Italian edition)
25. Pereyra, V.: Fast computation of equispaced Pareto manifolds and Pareto fronts for multiobjective optimization problems. Math. Comput. Simulat. 79(6), 1935–1947 (2009)
26. Ramos, R.M., Saldanha, R.R., Takahashi, R.H.C., Moreira, F.J.S.: The real-biased multiobjective genetic algorithm and its application to the design of wire antennas. IEEE Trans. Magn. 39(3), 1329–1332 (2003)
27. Romero, C.: A survey of generalized goal programming (1970–1982). Eur. J. Oper. Res. 25, 183–191 (1986)
28. Schaffler, S., Schultz, R., Weinzierl, K.: Stochastic method for the solution of unconstrained vector optimization problems. J. Optim. Theory Appl. 114(1), 209–222 (2002)
29. Schütze, O., Laumanns, M., Coello Coello, C.A., Dellnitz, M., Talbi, E.G.: Convergence of stochastic search algorithms to finite size Pareto set approximations. J. Global Optim. 41, 559–577 (2008)
30. Wanner, E.F., Guimaraes, F.G., Takahashi, R.H.C., Fleming, P.J.: Local search with quadratic approximations into memetic algorithms for optimization with multiple criteria. Evol. Comput. 16(2), 185–224 (2008)
31. Yano, H., Sakawa, M.: A unified approach for characterizing Pareto optimal solutions of multiobjective optimization problems: the hyperplane method. Eur. J. Oper. Res. 39, 61–70 (1989)
32. Yu, P.L.: Cone convexity, cone extreme points, and nondominated solutions in decision problems with multiobjectives. J. Optim. Theory Appl. 14, 319–377 (1974)
33. Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength Pareto evolutionary algorithm. Technical Report 103, Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH) Zurich (2001)
