Using error bounds to compare aggregated generalized transportation models
Ann Oper Res (2006) 146:119–134
DOI 10.1007/s10479-006-0051-6
Igor S. Litvinchev · Socorro Rangel
Published online: 24 June 2006
© Springer Science + Business Media, LLC 2006
Abstract A comparative study of aggregation error bounds for the generalized transportation
problem is presented. A priori and a posteriori error bounds were derived and a computational
study was performed to (a) test the correlation between the a priori, the a posteriori, and the
actual error and (b) quantify the difference of the error bounds from the actual error. Based on
the results we conclude that calculating the a priori error bound can be considered as a useful
strategy to select the appropriate aggregation level. The a posteriori error bound provides a
good quantitative measure of the actual error.
Keywords: Clustering · Network models · Approximation algorithms
Introduction
An important issue in optimization modelling is the trade-off between the level of detail to
employ and the ease of using and solving the model. Aggregation/disaggregation techniques
have been developed to facilitate the analysis of large models and provide the decision maker a
set of tools to compare models that have a high degree of detail with smaller aggregated models
that have less detail. Aggregated models are simplified, yet approximate, representations of
the original complex model and may be used in various stages of the decision making process.
Frequently, a number of aggregated models can be considered for the same original model,
and it is then desirable to select an aggregated model providing the tightest (in some sense)
approximation.
In optimization-based modelling, the difference between the actual optimal objective
function value and the aggregated optimal objective function value can be used as a measure
I. S. Litvinchev
Computing Center, Russian Academy of Sciences, Vavilov 40, Moscow 119991, Russia
S. Rangel (corresponding author)
Department of Computer Science and Statistics, UNESP, Rua Cristovao Colombo, 2265, S.J. Rio Preto, SP 15054-000, Brazil
e-mail: [email protected]
of the error introduced by aggregation. Many upper bounds on this error have been developed
for the aggregated optimization models and are referred to as error bounds (Leisten, 1997;
Litvinchev and Rangel, 1999; Litvinchev and Tsurkov, 2003, and references therein). Two
types of error bounds are considered: a priori error bounds and a posteriori error bounds.
A priori error bounds may be calculated after the model has been aggregated but before
the smaller aggregated model has been optimized. A posteriori error bounds are calculated
after the aggregated model has been optimized. Error bounds allow the modeler to compare
various types of aggregation and, if the error bounds are not adequate, modify the aggregation
strategy in an attempt to tighten the error bounds (Rogers et al., 1991).
There are two main issues associated with the error bounds. The first concerns how far an error bound is from the actual error: the closer the error bound is to the actual error, the better. The second issue is how to compare different aggregation strategies based on the error bounds. It is not obvious that the aggregation yielding the tightest error bound has the smallest actual error (Rogers et al., 1991). Moreover, different bound calculation techniques may result in a different relationship (correlation) between the error bound and the actual error.
The first issue above is relatively well explored, at least for the case of a posteriori error
bounds for linear programming (see Leisten, 1997; Litvinchev and Tsurkov, 2003; Rogers
et al., 1991, and the references therein). The second area is much less investigated. To our
knowledge the first comprehensive study in this direction has been made recently in Norman,
Rogers, and Levy (1999) for the classical transportation problem (TP) where they explored,
among others, the following question:
• If a model is aggregated using different methods to create several different smaller models and an error bound (a priori and/or a posteriori) is calculated for each aggregated model, does the model with the tightest error bound(s) give the closest approximation of the objective function value?
Many applications of aggregation/disaggregation theory in optimization involve network
problems such as location models (Francis, Lowe, and Tamir, 2002), the generalized assign-
ment model (Hallefjord, Jornsten, and Varbrand, 1993) and, in particular, the TP. For the
latter both a priori and a posteriori error bounds were developed and investigated (Rogers
et al., 1991; Zipkin, 1980a). In network flow problems the nodes traditionally correspond
to constraints, and arcs to variables. For most aggregation schemes, aggregation of nodes
(constraints) implies or necessitates a corresponding aggregation of arcs (variables). Usually,
only variables and constraints of the same type (e.g., sources or destinations) are aggregated.
The TP minimizes the cost of transporting supplies from a set of sources to a set of destinations, and it is assumed that the transportation flow is conserved on every arc
between the source and the destination. In the generalized transportation problem (GTP) the
amount of flow that leaves the source can differ from the amount of flow that arrives to the
destination. A certain multiplier is associated with each arc to represent a “gain” or a “loss”
of a unit flow between the source and the destination. There are two common interpreta-
tions of the arc multipliers (Ahuja, Magnanti, and Orlin, 1993). For the first interpretation
the multipliers are employed to modify the amount of flow of some particular item. There-
fore it is possible to model situations involving physical or administrative transformations,
for example, evaporation, seepage, or monetary growth due to the interest rates. For the
second interpretation, the multipliers are used to transform one type of item into another
(manufacturing, machine loading and scheduling, currency exchanges, etc.). For a more de-
tailed discussion on the GTP and its applications see Ahuja, Magnanti, and Orlin (1993) and
Glover, Klingman, and Phillips (1990). In contrast to the TP, much less work has been done
in aggregation for the GTP. Only a posteriori error bounds for aggregation in the GTP have
been developed in Evans (1979).
In this paper we present a study of the correspondence between the error bounds and
the actual error for the aggregated GTP. The paper is organized as follows. In Section 1,
we state the aggregated GTP and consider its properties. In the next section, a family of a
priori error bounds for a class of the GTP is derived, and new a posteriori error bounds are
proposed. A numerical study carried out to compare the quality of the error bounds and to
determine if there are relationships between the error bounds and the actual error is presented
in Section 3. Based on the results of this study we can conclude that there is a significant
difference among the various methods of calculating error bounds. Moreover, the improved a
priori and a posteriori error bounds have a statistically significant correlation with the actual
error.
1. The original and the aggregated models
The original GTP before aggregation (OP) is considered in the following form (Evans, 1979):

$$(\mathrm{OP})\quad z^* = \min \sum_{i\in S}\sum_{j\in T} c_{ij} x_{ij}$$
$$\text{s.t.}\quad \sum_{j\in T} d_{ij} x_{ij} \le a_i,\quad (i\in S),$$
$$\sum_{i\in S} x_{ij} = b_j,\quad (j\in T),$$
$$x_{ij}\ge 0,\quad (i\in S,\ j\in T).$$
Here $S$ and $T$ are the sets of all sources and all destinations (customers), respectively; $x_{ij}$ is the amount of flow from source node $i$ to destination node $j$; $c_{ij}$ is the transportation cost; $d_{ij}$ is the flow gain; $a_i$ is the node supply, and $b_j$ is the node demand. It is assumed that all the coefficients in (OP) are positive.
The roles of sources and destinations in (OP) can be reversed (Evans, 1979) by a simple transformation of variables: $y_{ij} = d_{ij} x_{ij}$. This transformation removes the arc multipliers from the first group of constraints in (OP) and places them in the second group. Taking into account this correspondence between the sources and the destinations, we consider here only aggregation of the destinations (customers) in (OP). The aggregation of the sources can be treated similarly.
The aggregation is defined by a partition of the set $T$ of destinations into a set $T^A$ of clusters $T_k$, $k = 1,\dots,K$, such that $T_k \cap T_p = \emptyset$ for $k \ne p$ and $\bigcup_{k=1}^{K} T_k = T$. Note that the columns of (OP) are partitioned/clustered accordingly. The fixed-weight variable aggregated linear programming problem (Zipkin, 1980b) is derived by replacing the columns in each cluster by their weighted average. The objective coefficients are combined similarly. For each $k$, consider non-negative weights $g_{ij}$, $i\in S$, $j\in T_k$, fulfilling the normalizing conditions:
$$\sum_{j\in T_k} g_{ij} = 1. \qquad (1)$$
The aggregated problem can be obtained by fixing the normalized weights $g_{ij}$ and substituting the linear transformation (disaggregation):
$$x_{ij} = g_{ij} X_{ik},\quad (j\in T_k) \qquad (2)$$
into (OP) (Litvinchev and Rangel, 1999). Thus the aggregated problem is:
$$\bar z = \min \sum_{i\in S}\sum_{k\in T^A} X_{ik} \sum_{j\in T_k} c_{ij} g_{ij} \qquad (3)$$
$$\text{s.t.}\quad \sum_{k\in T^A} X_{ik} \sum_{j\in T_k} d_{ij} g_{ij} \le a_i,\quad (i\in S), \qquad (4)$$
$$\sum_{i\in S} g_{ij} X_{ik} = b_j,\quad (j\in T_k,\ k\in T^A), \qquad (5)$$
$$X_{ik} \ge 0,\quad (i\in S,\ k\in T^A). \qquad (6)$$
To write (3)–(6) we have used the observation that $\sum_{j\in T} = \sum_{k\in T^A}\sum_{j\in T_k}$.
From (1) and (2) it follows that for any normalized weights we have $X_{ik} = \sum_{j\in T_k} x_{ij}$, and hence $X_{ik}$ is the supply from source $i$ to the aggregated customer $T_k$. However, (3)–(6) still has $|T|$ restrictions (5) corresponding to destinations, and hence cannot be considered as a GTP with sources $S$ and destinations $T^A$. The number of constraints (5) can be reduced as follows.
Let the weights be chosen as $\bar g_{ij} = b_j/\sum_{j\in T_k} b_j$, $j\in T_k$. Then for each $k\in T^A$ in (5) we obtain $|T_k|$ conditions of the same form $\sum_{i\in S} X_{ik} = \sum_{j\in T_k} b_j$. The redundant constraints can be removed without changing the primal solution to form the problem which is usually referred to as the aggregated GTP (Evans, 1979):
$$(\mathrm{AP})\quad \bar z = \min \sum_{i\in S}\sum_{k\in T^A} \bar c_{ik} X_{ik}$$
$$\text{s.t.}\quad \sum_{k\in T^A} \bar d_{ik} X_{ik} \le a_i,\quad (i\in S),$$
$$\sum_{i\in S} X_{ik} = \bar b_k,\quad (k\in T^A),$$
$$X_{ik}\ge 0,\quad (i\in S,\ k\in T^A),$$
where $\bar c_{ik} = \sum_{j\in T_k} c_{ij}\,\bar g_{ij}$, $\bar d_{ik} = \sum_{j\in T_k} d_{ij}\,\bar g_{ij}$, $\bar b_k = \sum_{j\in T_k} b_j$.
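To make the construction concrete, the aggregation of the data $(\bar c_{ik}, \bar d_{ik}, \bar b_k)$ with the demand-proportional weights $\bar g_{ij}$ can be sketched as follows (illustrative Python, not part of the paper's computational study; all names are ours):

```python
# Sketch: build the aggregated GTP data from a clustering of destinations,
# using the demand-proportional weights g_ij = b_j / sum_{j in T_k} b_j.

def aggregate_gtp(c, d, b, clusters):
    """c[i][j], d[i][j]: cost and gain on arc (i, j); b[j]: demands.
    clusters: list of lists of destination indices partitioning T.
    Returns (c_bar, d_bar, b_bar) indexed by [i][k] / [k]."""
    m = len(c)                                   # number of sources
    c_bar = [[0.0] * len(clusters) for _ in range(m)]
    d_bar = [[0.0] * len(clusters) for _ in range(m)]
    b_bar = []
    for k, Tk in enumerate(clusters):
        bk = sum(b[j] for j in Tk)               # aggregated demand b_bar_k
        b_bar.append(bk)
        for i in range(m):
            # weighted averages with g_ij = b_j / bk
            c_bar[i][k] = sum(c[i][j] * b[j] for j in Tk) / bk
            d_bar[i][k] = sum(d[i][j] * b[j] for j in Tk) / bk
    return c_bar, d_bar, b_bar
```

For example, clustering three destinations into {0, 1} and {2} averages the cost and gain columns of each cluster with demand weights.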
We will denote by $(\overline{\mathrm{AP}})$ the aggregated problem (3)–(6) obtained for the weights $\bar g_{ij}$ before removing the redundant constraints. The primal solutions $X_{ik}$ of (AP) and $(\overline{\mathrm{AP}})$ are the same, and the fixed-weight disaggregated solution $\bar x_{ij} = \bar g_{ij}\bar X_{ik}$, $j\in T_k$, is feasible to the original problem (OP). The dual problem to $(\overline{\mathrm{AP}})$ always has multiple optimal solutions, even if the optimal dual solution to (AP) is unique, as stated in Proposition 1:

Proposition 1. Let $\{\bar u_i, \bar v_k\}$ be an optimal dual solution to (AP). Denote
$$D(\bar v) = \Big\{ v_j,\ j\in T \ \Big|\ \sum_{j\in T_k} v_j b_j = \bar v_k \bar b_k,\ k = 1,\dots,K \Big\}. \qquad (7)$$
Then any pair $\{\bar u, v\}$ with $v\in D(\bar v)$ is an optimal dual solution to $(\overline{\mathrm{AP}})$ and
$$\bar z = -\sum_{i\in S} a_i \bar u_i + \sum_{k\in T^A}\sum_{j\in T_k} v_j b_j. \qquad (8)$$

Proof: Consider the dual to $(\overline{\mathrm{AP}})$ or, which is the same, the dual to (3)–(6) for $g_{ij} = \bar g_{ij}$:
$$(\overline{\mathrm{DAP}})\quad -\sum_{i\in S} a_i u_i + \sum_{k\in T^A}\sum_{j\in T_k} v_j b_j \to \max$$
$$\text{s.t.}\quad \bar c_{ik} + \bar d_{ik} u_i - \sum_{j\in T_k} v_j \bar g_{ij} \ge 0,\quad (i\in S,\ k\in T^A),$$
$$u_i \ge 0,\quad (i\in S).$$
The dual to (AP) is:
$$(\mathrm{DAP})\quad -\sum_{i\in S} a_i u_i + \sum_{k\in T^A} v_k \sum_{j\in T_k} b_j \to \max$$
$$\text{s.t.}\quad \bar c_{ik} + \bar d_{ik} u_i - v_k \ge 0,\quad (i\in S,\ k\in T^A),$$
$$u_i \ge 0,\quad (i\in S).$$
These two dual problems must have equal optimal values since the primal problems $(\overline{\mathrm{AP}})$ and (AP) have the same optimal solutions. Let $\{\bar u_i, \bar v_k\}$ be an optimal solution to (DAP). Construct $\{u_i, v_j\}$ as follows: $u_i = \bar u_i$, and $v_j$ is such that $\sum_{j\in T_k} v_j \bar g_{ij} = \bar v_k$. Comparing the restrictions of (DAP) and $(\overline{\mathrm{DAP}})$ we see that this $\{u_i, v_j\}$ is feasible to $(\overline{\mathrm{DAP}})$. Moreover, since $\bar g_{ij} = b_j/\sum_{j\in T_k} b_j$, it is not hard to verify that the objective value of $(\overline{\mathrm{DAP}})$ for $\{u_i, v_j\}$ is equal to the optimal objective of (DAP). Hence this $\{u_i, v_j\}$ is optimal to $(\overline{\mathrm{DAP}})$. □
Note that the nonuniqueness of the optimal duals in $(\overline{\mathrm{AP}})$ follows from the way the aggregation was performed, no matter whether (AP) has a unique optimal dual solution or not. The nonuniqueness of the duals will be used further on to derive a priori error bounds for the GTP.
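The nonuniqueness stated in Proposition 1 is easy to verify numerically. The toy data below are our own illustration (not from the paper): two different disaggregated duals for a single cluster both satisfy the defining condition of $D(\bar v)$ and hence yield the same objective contribution in (8).

```python
# Toy illustration (assumed data): the set D(v_bar) of Proposition 1 contains
# many disaggregated duals; both choices below satisfy
# sum_{j in T_k} v_j b_j = vbar_k * bbar_k.

b = [1.0, 3.0]          # demands of a single cluster T_k = {0, 1}
vbar_k = 5.0            # optimal dual of the aggregated demand constraint
bbar_k = sum(b)         # aggregated demand b_bar_k = 4

v_const = [vbar_k, vbar_k]   # the simplest member: v_j = vbar_k
v_other = [2.0, 6.0]         # another member: 2*1 + 6*3 = 20 = 5*4

for v in (v_const, v_other):
    assert abs(sum(vj * bj for vj, bj in zip(v, b)) - vbar_k * bbar_k) < 1e-12
```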
2. The error bounds
If the fixed weights $\bar g_{ij}$ are used to form the aggregated problem (AP), the optimal value $\bar z$ of the objective function for (AP) is also an upper bound for the optimal objective $z^*$ of the original problem, so that $\bar z - z^* \ge 0$. This follows from the fact that the disaggregated solution $\bar x_{ij} = \bar g_{ij}\bar X_{ik}$, $j\in T_k$, is feasible to the original problem.
To get an upper bound on the value $\bar z - z^*$ we will use the following result, established in Litvinchev and Rangel (1999) for fixed-weight aggregation in convex programming: if $W$ is a closed, convex, and bounded set containing an optimal solution to the original problem (such a $W$ is called a localization), then $\bar z - z^* \le \bar z - \min_{x\in W} L(x,\tau)$. Here $L(\cdot,\cdot)$ is the standard Lagrange function associated with the original problem and $\tau$ is a vector of Lagrange multipliers. Assume that the condition $x\ge 0$ is always included in the definition of $W$.
For the GTP, $\tau = \{u, v\}$, and the Lagrangian has the form:
$$L(x, u, v) = -\sum_{i\in S} u_i a_i + \sum_{j\in T} v_j b_j + \sum_{i\in S}\sum_{j\in T} (c_{ij} + u_i d_{ij} - v_j)\, x_{ij},\quad (u\ge 0),$$
and hence for any fixed dual multipliers $u\ge 0$ and $v$ we have:
$$\bar z - z^* \le \Big( \bar z + \sum_{i\in S} u_i a_i - \sum_{j\in T} v_j b_j \Big) + \max_{x\in W} \sum_{i\in S}\sum_{j\in T} (v_j - c_{ij} - u_i d_{ij})\, x_{ij} \equiv \varepsilon(u, v, W). \qquad (9)$$
In particular, if $\{u_i, v_j\}$ is an optimal solution of the dual to $(\overline{\mathrm{AP}})$, then:
$$-\sum_{i\in S} u_i a_i + \sum_{j\in T} v_j b_j = \bar z \quad\text{and}\quad \varepsilon(u, v, W) = \max_{x\in W} \sum_{i\in S}\sum_{j\in T} (v_j - c_{ij} - u_i d_{ij})\, x_{ij}. \qquad (10)$$
2.1. A priori error bounds
To get an a priori error bound for $\bar z - z^*$ by (10) we need to estimate the objective coefficients of the maximization problem in (10) without solving the aggregated problem. For the TP this can be done by choosing $u_i = \bar u_i$, $v_j = \bar v_k$, $j\in T_k$, that is, $v\in D(\bar v)$. To see this, note that for $d_{ij} = 1$ the term in the brackets in (10) is $\bar v_k - c_{ij} - \bar u_i$, and by the restrictions of (DAP) this is less than or equal to $\bar c_{ik} - c_{ij}$, $j\in T_k$, which gives:
$$\bar z - z^* \le \max_{x\in W} \sum_{i\in S}\sum_{j\in T} \big( \bar c_{ik(j)} - c_{ij} \big)\, x_{ij},$$
where $k(j)$ is the cluster to which $j$ belongs, $k(j)\in T^A$. If the localization $W$ is defined before an aggregated problem has been solved (we will refer to such a $W$ as an a priori localization), then the inequality above provides the a priori error bound for the TP. However, if $d_{ij} \ne 1$, we cannot estimate the coefficients in (10) under the same choice of the duals $u, v$. For this reason it was conjectured in Evans (1979) and Rogers et al. (1991) that “. . . a priori bounds cannot be obtained for the GTP as is possible for simple transportation problems”. To cope with this problem, we will use the nonuniqueness of the optimal dual multipliers for $(\overline{\mathrm{AP}})$, as stated in Proposition 1.
Proposition 2. Let $d_{ij} = p_i t_j$, where $p_i > 0$, $t_j > 0$. Then there exists $v\in D(\bar v)$ such that for the pair $\bar u, v$ we have:
$$v_j - c_{ij} - \bar u_i d_{ij} \le (\bar c_{ik} d_{ij} - c_{ij}\bar d_{ik})/\bar d_{ik},\quad (i\in S,\ j\in T_k).$$

Proof: From the restrictions of (DAP) we have $\bar u_i \ge (\bar v_k - \bar c_{ik})/\bar d_{ik}$ and hence:
$$v_j - c_{ij} - \bar u_i d_{ij} \le \big[ (\bar c_{ik} d_{ij} - c_{ij}\bar d_{ik}) - (\bar v_k d_{ij} - v_j \bar d_{ik}) \big]/\bar d_{ik}.$$
By the definition of $\bar d_{ik}$ and the correspondence between $\bar v_k$ and $v_j$, $j\in T_k$, we have:
$$\sum_{j\in T_k} b_j (\bar v_k d_{ij} - v_j \bar d_{ik}) = 0\quad (\forall i).$$
Since $b_j > 0$, the case $(\bar v_k d_{ij} - v_j \bar d_{ik}) > 0$ ($< 0$) for all $j\in T_k$ is impossible. Now choose $v_j$ to fulfill the conditions $\bar v_k d_{ij} - v_j \bar d_{ik} = 0$ for all $j\in T_k$. Resolving the latter system of equations with respect to $v_j$ for $d_{ij} = p_i t_j$ and $\bar d_{ik} = (\sum_{j\in T_k} b_j d_{ij})/(\sum_{j\in T_k} b_j)$ we obtain:
$$v_j = \bar v_k t_j \frac{\sum_{j\in T_k} b_j}{\sum_{j\in T_k} b_j t_j},\quad (j\in T_k).$$
It is simple to check that for this $v_j$ we have $\sum_{j\in T_k} v_j b_j = \bar v_k \sum_{j\in T_k} b_j$. Hence $v\in D(\bar v)$ and the claim follows. □
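The construction in the proof can be checked numerically. The sketch below uses assumed toy data (one cluster, two sources) and verifies that the chosen $v_j$ zeroes every term $\bar v_k d_{ij} - v_j \bar d_{ik}$ and lies in $D(\bar v)$:

```python
# Numerical check (toy data, ours) of the proof of Proposition 2:
# with d_ij = p_i * t_j, the choice v_j = vbar_k * t_j * sum(b) / sum(b_j t_j)
# zeroes vbar_k * d_ij - v_j * dbar_ik for every i, j, and keeps v in D(vbar).

p = [0.5, 2.0]                 # per-source multipliers p_i
t = [1.0, 3.0, 4.0]            # per-destination multipliers t_j (one cluster)
b = [2.0, 1.0, 1.0]            # demands b_j
vbar_k = 7.0                   # assumed aggregated dual value

sum_b = sum(b)
sum_bt = sum(bj * tj for bj, tj in zip(b, t))
v = [vbar_k * tj * sum_b / sum_bt for tj in t]

for pi in p:
    # dbar_ik = sum_j b_j d_ij / sum_j b_j = p_i * sum_bt / sum_b
    dbar_ik = pi * sum_bt / sum_b
    for j, tj in enumerate(t):
        d_ij = pi * tj
        assert abs(vbar_k * d_ij - v[j] * dbar_ik) < 1e-9

# v belongs to D(vbar): sum_j v_j b_j = vbar_k * bbar_k
assert abs(sum(vj * bj for vj, bj in zip(v, b)) - vbar_k * sum_b) < 1e-9
```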
Proposition 2 together with (10) immediately yields the following expression for the a priori error bound:

Corollary. For $d_{ij} = p_i t_j$ we have:
$$\bar z - z^* \le \max_{x\in W} \sum_{i\in S}\sum_{j\in T} \delta^a_{ij} x_{ij} \equiv \varepsilon^a(W),$$
where
$$\delta^a_{ij} \equiv \Big( \frac{\bar c_{ik(j)}}{\bar d_{ik(j)}} - \frac{c_{ij}}{d_{ij}} \Big)\, d_{ij}$$
and $k(j)$ is the cluster to which $j$ belongs, $k(j)\in T^A$.
Remark 1. Under the assumption $d_{ij} = p_i t_j$ we have $\sum_{j\in T} t_j x_{ij} \le a_i/p_i$, $i\in S$, in the “source” restrictions of the original problem. This means that the “gain” (“loss”) along the arc $(i, j)$ is the same for all sources $i$ connected with a destination $j$. We can treat this case as intermediate between the TP, where the gains are the same for all arcs, and the general form of the GTP, where the gains vary from arc to arc. Interpreting the GTP as a machine loading problem (see, e.g., Ahuja, Magnanti, and Orlin (1993) and Glover, Klingman, and Phillips (1990)), producing one unit of product $j$ on machine $i$ consumes $d_{ij}$ hours of the machine's time. If $p_i$ is interpreted as a production “velocity” of machine $i$, and $t_j$ is a “length” of product $j$ measured in suitable units, then the assumption $d_{ij} = p_i t_j$ holds.
A priori localizations can be defined, for example, by manipulating the constraints of the original problem. Let
$$W_b = \{ x_{ij} \mid 0 \le x_{ij} \le r_{ij},\ i\in S,\ j\in T \},\quad r_{ij} = \min\{ b_j, a_i/d_{ij} \},$$
$$W_s = \Big\{ x_{ij} \ \Big|\ \sum_{j\in T} d_{ij} x_{ij} \le a_i,\ i\in S,\ x_{ij}\ge 0\ \forall i, j \Big\},$$
$$W_d = \Big\{ x_{ij} \ \Big|\ \sum_{i\in S} x_{ij} = b_j,\ j\in T,\ x_{ij}\ge 0\ \forall i, j \Big\},$$
where the lower indices b, s, and d show that the restrictions defining the respective localization are associated with bounds, sources, and destinations. Obviously $W_s$ and $W_d$ are localizations, as are $W_{bs} = W_b \cap W_s$ and $W_{bd} = W_b \cap W_d$. For these four localizations the optimization problem to calculate $\varepsilon^a(W)$ can be solved analytically. Due to the decomposable structure of the localizations, the computation of $\varepsilon^a(W_s)$, $\varepsilon^a(W_d)$ results in a number of independent single-constrained continuous knapsack problems, which yields
$$\varepsilon^a(W_s) = \sum_{i\in S} a_i \max_{j\in T} \big[ \delta^a_{ij} \big]^+ / d_{ij},\qquad \varepsilon^a(W_d) = \sum_{j\in T} b_j \max_{i\in S} \delta^a_{ij},$$
where $[\alpha]^+ = \max\{0, \alpha\}$. Similarly, the calculation of $\varepsilon^a(W_{bs})$, $\varepsilon^a(W_{bd})$ reduces to the solution of independent single-constrained knapsack problems with upper bounds on the variables. The respective solutions can also be obtained analytically (see, e.g., Litvinchev and Rangel, 1999).
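The two closed forms above translate directly into code (an illustrative sketch; the function names and toy data are ours, not the authors'):

```python
# Sketch of the closed-form a priori bounds:
# eps_a(W_s) = sum_i a_i * max_j [delta_ij]^+ / d_ij
# eps_a(W_d) = sum_j b_j * max_i delta_ij

def eps_a_Ws(delta, d, a):
    # one continuous knapsack per source i: put all capacity a_i on the
    # best positive ratio delta_ij / d_ij (zero if none is positive)
    return sum(a[i] * max(max(delta[i][j], 0.0) / d[i][j]
                          for j in range(len(d[i])))
               for i in range(len(a)))

def eps_a_Wd(delta, b):
    # one problem per destination j: the equality constraint forces b_j
    # to be allocated to the largest coefficient (sign unrestricted)
    return sum(b[j] * max(delta[i][j] for i in range(len(delta)))
               for j in range(len(b)))
```

For instance, with one source, `delta = [[1.0, -2.0]]`, `d = [[2.0, 1.0]]`, and `a = [4.0]`, the source bound equals 4 · max{1/2, 0} = 2.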
By construction we have $\min\{\varepsilon^a(W_b), \varepsilon^a(W_s)\} \ge \varepsilon^a(W_{bs})$, and similarly for $W_d$. Then $\varepsilon^a_{Best}$ is defined as
$$\varepsilon^a_{Best} = \min\{ \varepsilon^a(W_{bs}), \varepsilon^a(W_{bd}) \}.$$
It is interesting to compare the obtained a priori error bounds with the results known for the TP (Zipkin, 1980a). Let $d_{ij} = 1$, $i\in S$, $j\in T$, and let all “$\le$” in the linear restrictions of (OP) be substituted by “$=$”. Then $\delta^a_{ij} = \bar c_{ik(j)} - c_{ij}$, and to get the estimation stated in Proposition 2 we should put $v_j = \bar v_k$, $j\in T_k$. The expressions for our four a priori error bounds remain valid for the classical transportation problem if we change $[\delta^a_{ij}]^+$ to $\delta^a_{ij}$. Then $\varepsilon^a(W_s)$ and $\varepsilon^a(W_d)$ coincide with the a priori error bounds obtained in Zipkin (1980a) for customer aggregation in the TP. Our error bound $\varepsilon^a_{Best}$ is at least as good as $\min\{\varepsilon^a(W_s), \varepsilon^a(W_d)\}$.
2.2. Tightening the a priori error bounds
From the definition of $\varepsilon^a(W)$ it follows that the tighter the localization $W$, the better (smaller) the error bound $\varepsilon^a(W)$. Defining $W$ by the original constraints, we should retain as many constraints as possible. On the other hand, the localization $W$ should be simple enough to guarantee that the computational burden associated with calculating the error bound is not too expensive. A compromise is to utilize “easy” restrictions directly, while dualizing “complicated” restrictions by the Lagrangian.
Suppose that the localization $W_{all}$ is defined by all constraints of (OP), and that the destination constraints are treated as “easy”, while the source constraints are considered “complicated”. Then by Lagrangian duality we have:
$$\varepsilon^a(W_{all}) \equiv \max_{x\in W_{all}} \sum_{i\in S}\sum_{j\in T} \delta^a_{ij} x_{ij} = \min_{u\ge 0} \Big\{ \sum_{i\in S} u_i a_i + \max_{x\in W_d} \sum_{i\in S}\sum_{j\in T} \big( \delta^a_{ij} - u_i d_{ij} \big)\, x_{ij} \Big\} \equiv \min_{u\ge 0} \varepsilon^a(u, W_d) \le \varepsilon^a(0, W_d) \equiv \varepsilon^a(W_d),$$
where $\varepsilon^a(u, W_d)$ denotes the expression in the curly brackets.
After the error bound $\varepsilon^a(0, W_d) = \varepsilon^a(W_d)$ has been calculated, it is natural to look for another value of $u\ge 0$ to diminish the current error bound. If $s$ is the search direction, we get $u = \rho s$, where the stepsize $\rho \ge 0$ and the components of the search direction must be nonnegative.
Let
$$\varepsilon^a(W) = \max_{x\in W} \sum_{i\in S}\sum_{j\in T} \delta^a_{ij} x_{ij},$$
and denote by $x(W)$ its optimal solution. Suppose that $\varepsilon^a(W_d)$ has been calculated and let $x(W_d)$ be unique. Then by the marginal value theorem, $\varepsilon^a(u, W_d)$ is differentiable at $u = 0$, and $\nabla_{u_i} \varepsilon^a(0, W_d) = a_i - \sum_{j\in T} d_{ij} x_{ij}(W_d)$. Now take one step of the projected gradient technique for the problem $\min_{u\ge 0} \varepsilon^a(u, W_d)$ starting from $u = 0$. This gives
$$u_i(\rho) = \rho \max\Big\{ 0,\ \sum_{j\in T} d_{ij} x_{ij}(W_d) - a_i \Big\},\quad (i\in S,\ \rho \ge 0).$$
The stepsize $\rho$ associated with the steepest descent can be defined by a one-dimensional search: under the uniqueness of $x(W_d)$ the choice of $u_i(\rho)$ guarantees a strict decrease of the a priori error bound, that is, either $\varepsilon^a(u, W_d) < \varepsilon^a(0, W_d) = \varepsilon^a(W_d)$ or $\varepsilon^a(W_d) = \varepsilon^a(W_{all})$. The other a priori error bounds can be strengthened similarly. We will refer to the a priori error bound $\varepsilon^a(W)$, tightened by searching the duals in the direction of the projected anti-gradient, as $\varepsilon^a_\nabla(W)$.
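The projected-gradient update above can be sketched as follows (illustrative code, not the authors' implementation; here `x` stands for the localization solution $x(W_d)$):

```python
# One projected-gradient step for the duals, starting from u = 0:
# u_i(rho) = rho * max(0, sum_j d_ij x_ij - a_i).
# Only the multipliers of "source" constraints that are violated by the
# localization solution x become positive.

def dual_step(x, d, a, rho):
    return [rho * max(0.0,
                      sum(d[i][j] * x[i][j] for j in range(len(x[i]))) - a[i])
            for i in range(len(a))]
```

With a single source, `x = [[2.0, 1.0]]`, `d = [[1.0, 2.0]]`, the load is 4; for `a = [3.0]` the excess is 1, so `rho = 0.5` gives the multiplier 0.5, while `a = [5.0]` leaves it at zero.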
2.3. A posteriori error bounds
Suppose now that (AP) has been solved and the values $\bar u$ and $\bar v$ are known. The expression for $\varepsilon(u, v, W)$ in (10) still provides a valid error bound, and we only need to construct $u, v$ optimal to $(\overline{\mathrm{AP}})$. By Proposition 1 we may put $u = \bar u$ and $v_j = \bar v_k$, $j\in T_k$.
We can calculate the a posteriori error bounds $\varepsilon^p(u, v, W)$ analytically for the same localizations used in Section 2.1, since:
$$\bar z - z^* \le \max_{x\in W} \sum_{i\in S}\sum_{j\in T} \delta^p_{ij} x_{ij} \equiv \varepsilon^p(u, v, W),$$
where $\delta^p_{ij} \equiv v_j - c_{ij} - u_i d_{ij}$. Hence we only need to substitute $\delta^p_{ij}$ for $\delta^a_{ij}$ in the expressions for the a priori error bounds obtained earlier. Note that the a posteriori error bound is valid for the GTP in its general form, without any assumptions about the specifics of the arc multipliers.
For the choice of $u, v$ described above, the error bounds $\varepsilon^p(u, v, W_s)$ and $\varepsilon^p(u, v, W_d)$ coincide with the a posteriori error bounds obtained in Evans (1979). The new error bound $\varepsilon^p_{Best} = \min\{ \varepsilon^p(u, v, W_{bs}), \varepsilon^p(u, v, W_{bd}) \}$, obtained for the same choice of $u, v$, is at least as good as $\varepsilon^p(u, v, W_s)$ and $\varepsilon^p(u, v, W_d)$, and the numerical study below demonstrates that in many cases it is significantly better.
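As a sketch (ours, with assumed toy data), the a posteriori analogue of $\varepsilon^a(W_d)$ is obtained by the same closed form with $\delta^p_{ij} = v_j - c_{ij} - u_i d_{ij}$ in place of $\delta^a_{ij}$:

```python
# A posteriori bound over the localization W_d (illustrative):
# eps_p(W_d) = sum_j b_j * max_i (v_j - c_ij - u_i d_ij)

def eps_p_Wd(c, d, b, u, v):
    m = len(c)
    return sum(b[j] * max(v[j] - c[i][j] - u[i] * d[i][j] for i in range(m))
               for j in range(len(b)))
```

With one source, `c = [[1.0, 2.0]]`, `d = [[1.0, 1.0]]`, `b = [1.0, 2.0]`, `u = [0.0]`, `v = [3.0, 2.0]`, the coefficients are 2 and 0, giving the bound 2.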
After the a posteriori error bound has been calculated for some fixed $u\ge 0$, $v$, it is natural to look for another dual solution to decrease the error bound. Let the duals $\{u, v\}$ be changed in the direction $\{s, q\}$ with stepsize $\rho$, such that $u + \rho s \ge 0$. Then the best (i.e., error bound minimizing) duals in this direction are defined from the problem:
$$\min_{\rho:\ u+\rho s\ge 0} \varepsilon^p(u + \rho s,\ v + \rho q,\ W).$$
The dual-searching approaches to tightening the a posteriori error bound are well known for linear programming aggregation (see Leisten (1997) and the references therein). Mendelssohn (1980) proposed to look for the new dual solution in the direction of the dual aggregated solution, i.e., in our case $s = u$, $q = v$. Shetty and Taylor (1987) used the right-hand side vector of the original problem as the search direction. In Litvinchev and Rangel (1999) one step of the projected gradient technique was used, similar to what was done in Section 2.2 for the a priori error bounds. In all cases the stepsize $\rho$ can be obtained either by a one-dimensional search or by analyzing the non-differentiability points of the expression for $\varepsilon^p(u + \rho s, v + \rho q, W)$ as a function of the stepsize. A detailed description and examples of the usage of the error bounds of Mendelssohn, Shetty and Taylor, and Litvinchev and Rangel can be found in Litvinchev and Tsurkov (2003). We will denote these error bounds by $\varepsilon^p_M(W)$, $\varepsilon^p_{ST}(W)$, and $\varepsilon^p_\nabla(W)$, respectively, while $\varepsilon^p(W)$ is used to denote the bound $\varepsilon^p(u, v, W)$.
Thus for each of the four localizations $W_d$, $W_s$, $W_{bs}$, and $W_{bd}$ we can calculate two a priori error bounds $\varepsilon^a(W)$, $\varepsilon^a_\nabla(W)$ and four a posteriori error bounds $\varepsilon^p(W)$, $\varepsilon^p_M(W)$, $\varepsilon^p_{ST}(W)$, and $\varepsilon^p_\nabla(W)$.
3. Numerical study
We tested the error bounds derived in this paper for the GTP with $d_{ij} = p_i t_j$. The objective of the study is to compare the performance of the proposed error bounds with respect to their difference from the actual error and their correlation with it. Our numerical study is based on the study design presented in Norman, Rogers, and Levy (1999). To analyze the results we used the Pearson correlation coefficient, descriptive statistics, and the hypothesis test for the mean of two independent samples (t-test) (Montgomery, 2001).
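The Pearson coefficient used throughout the tables can be computed as follows (a plain re-implementation of the standard formula, not the statistical package used in the study):

```python
# Pearson correlation coefficient between two samples, e.g. an error bound
# and the actual error z_bar - z* over a set of aggregated instances.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A bound that tracks the actual error perfectly up to an affine increasing transformation yields a coefficient of 1.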
The problem instances were generated using the open source code GNETGEN, which is a modification by Glover of the NETGEN generator (Klingman, Napier, and Stutz, 1974). Five problem sizes grouped in two sets were considered: SET1 of relatively small problems and SET2 of relatively large problems. The problems have the following dimensions: SET1 (30×50, 50×100, 100×150) and SET2 (50×1000, 50×2000), such that the number of destinations is greater (or significantly greater) than the number of sources. For each problem size ten instances were randomly generated using random number seeds. The available supplies $a_i$ were randomly generated with $\sum a_i = 10^4$ for all instances in SET1 and $\sum a_i = 1.8\cdot 10^6$ for all instances in SET2. The cost coefficients $c_{ij}$ and the arc multipliers $d_{ij}$ were randomly generated in the intervals [2.0, 20.0] and [0.5, 3.0], respectively.
The destinations are clustered at 5 levels: 80, 60, 40, 20, and 10% of their total number. This clustering is referred to as levels 1 to 5 in the tables below (the higher the level, the smaller the number of destinations). Each cluster has $\lfloor |T|/|T^A| \rfloor$ destinations sorted by their index, except the last cluster, which contains all the others ($\lfloor \alpha \rfloor$ denotes the integer part of $\alpha$). There are a total of (5 problem sizes) × (10 problem instances) × (5 cluster levels) = 250 aggregated instances (150 in SET1 and 100 in SET2). Further on we refer to the aggregated instances simply as instances.
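A simplified stand-in for such instances (GNETGEN itself is not reproduced here; this sketch only mimics the stated coefficient ranges and supply normalization) could look like:

```python
# Simplified random GTP instance generator (ours, not GNETGEN):
# costs in [2.0, 20.0], multipliers in [0.5, 3.0],
# supplies scaled to a prescribed total, as for the paper's SET1.
import random

def random_gtp(m, n, total_supply, seed):
    rng = random.Random(seed)
    c = [[rng.uniform(2.0, 20.0) for _ in range(n)] for _ in range(m)]
    d = [[rng.uniform(0.5, 3.0) for _ in range(n)] for _ in range(m)]
    w = [rng.random() for _ in range(m)]
    s = sum(w)
    a = [total_supply * wi / s for wi in w]   # supplies sum to total_supply
    return c, d, a
```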
Four localizations were used to calculate the aggregation error bounds: $W_d$, $W_s$, $W_{bs}$, and $W_{bd}$. For each localization we calculated two a priori error bounds $\varepsilon^a(W)$, $\varepsilon^a_\nabla(W)$ and four a posteriori error bounds $\varepsilon^p(W)$, $\varepsilon^p_\nabla(W)$, $\varepsilon^p_{ST}(W)$, and $\varepsilon^p_M(W)$. We will pay special attention to the localizations $W_{bd}$ and $W_d$. We conjecture that, since only demands were aggregated, the bounds calculated by allocating the reduced costs based on demands may be tighter and may have a higher correlation with the actual error.
The dimension of the original problem allows the exact solution of all problem sizes. Correlations between the error bounds and the actual error, $\bar z - z^*$, were calculated using the Pearson coefficient and are presented in Table 1 for all localizations. In all correlation tables to be presented the p-values are less than 0.0001 if not indicated otherwise. In Table 1 the exceptions are: in SET1, $\varepsilon^a(W_s)$ (p = 0.0003); and in SET2, $\varepsilon^a_\nabla(W_s)$ (p = 0.0005), $\varepsilon^a(W_{bs})$ (p = 0.1357), $\varepsilon^p(W_s)$ (p = 0.2876), and $\varepsilon^p_{ST}(W_s)$ (p = 0.3112).
Table 1  Correlation between the actual error and the error bounds

          SET1                                    SET2
          Wd      Ws       Wbd     Wbs            Wd      Ws       Wbd     Wbs
εa(W)     0.6609  0.2929   0.7196  0.4441         0.5615  −0.4840  0.5617  0.1502
εa∇(W)    0.7131  0.4043   0.7376  0.6471         0.5626  −0.3418  0.5624  0.5611
εp(W)     0.9826  0.5016   0.9864  0.5738         0.9985  −0.1074  0.9985  0.4239
εp∇(W)    0.9897  0.9897   0.9910  0.7288         0.9998  0.9998   0.9998  0.6958
εpST(W)   0.9827  0.4979   0.9864  0.5753         0.9986  −0.1023  0.9986  0.4243
εpM(W)    0.9989  0.9765   0.9997  0.9660         1.0000  0.9419   1.0000  0.9086
The a priori error bounds calculated for the localizations $W_d$ and $W_{bd}$ are correlated with the actual error at a statistically significant level (p < 0.0001) for both SET1 and SET2, and the highest correlation is achieved for the error bound $\varepsilon^a_\nabla$. The correlation coefficient is smaller for the instances in SET2 than for those in SET1.
The a posteriori error bounds for SET1 are strongly and statistically significantly correlated with the actual error for all localizations used. For the instances in SET2 the bounds $\varepsilon^p(W_s)$ and $\varepsilon^p_{ST}(W_s)$ are not correlated with the actual error. For both SET1 and SET2 the highest correlation is achieved for the bounds $\varepsilon^p_\nabla(W)$ and $\varepsilon^p_M(W)$ calculated for the localizations $W_d$ and $W_{bd}$. The error bounds calculated using the localizations $W_s$ and $W_{bs}$ have lower correlation values, and for some error bounds the correlation is not statistically significant.
The correlations between the a priori error bounds, $\varepsilon^a_\nabla$ and $\varepsilon^a$, and the a posteriori error bounds are presented in Table 2, with the values for $\varepsilon^a_\nabla$ in the first row of each pair. For the localizations $W_d$ and $W_{bd}$ both a priori error bounds are correlated with the a posteriori error bounds; the
Table 2  Correlation between the a priori error bounds, εa∇(W) (first row of each pair) and εa(W) (second row), and the a posteriori error bounds

          SET1                                    SET2
          Wd      Ws       Wbd     Wbs            Wd      Ws       Wbd     Wbs
εp(W)     0.7831  0.5937   0.7905  0.8207         0.5953  0.5289   0.5951  0.3776
          0.7259  0.6513   0.7677  0.7695         0.5944  0.4141   0.5945  0.6970
εp∇(W)    0.7382  0.5905   0.7582  0.8263         0.5743  −0.1365  0.5741  0.1677
          0.6788  0.6319   0.7364  0.5913         0.5733  −0.1815  0.5734  0.0748
εpST(W)   0.7830  0.6080   0.7900  0.8218         0.5933  0.5267   0.5930  0.3781
          0.7258  0.6383   0.7671  0.7680         0.5924  0.4071   0.5924  0.6972
εpM(W)    0.7176  0.4687   0.7402  0.7259         0.5617  −0.2326  0.5614  0.4849
          0.6693  0.3719   0.7230  0.5418         0.5607  −0.3261  0.5606  0.2847
correlations for the instances in SET2 are weaker than for the instances in SET1. For these localizations $\varepsilon^a_\nabla$ is more correlated with the a posteriori error bounds than $\varepsilon^a$. For the instances in SET2, the correlations with $\varepsilon^p_\nabla$ and $\varepsilon^p_M$ when using the localizations $W_s$ and $W_{bs}$ are very low and in some cases not significant (e.g., the p-values for the correlation between $\varepsilon^a_\nabla(W_s)$ and $\varepsilon^p_\nabla(W_s)$ and between $\varepsilon^a(W_{bs})$ and $\varepsilon^p_\nabla(W_{bs})$ are 0.1757 and 0.4596, respectively).
In Table 3 we present, sorted by aggregation level, the correlations between the actual error and the error bounds calculated for the localization $W_{bd}$, the tightest localization based on demands. The a priori error bounds are highly and statistically significantly correlated with the actual error for all levels of aggregation. The lowest correlation was obtained for level 5. The correlations for the a priori error bound $\varepsilon^a_\nabla$ are higher (except for level 4 in SET1) than for $\varepsilon^a$. Compared with the a priori error bounds, the a posteriori error bounds are more strongly correlated with the actual error for all levels of aggregation. The error bound $\varepsilon^p_M$ has the strongest correlation, which is almost insensitive to the aggregation level.
Table 3  Correlation between the actual error and the error bounds by level of aggregation (localization Wbd)

       SET1                                             SET2
       level 1  level 2  level 3  level 4  level 5      level 1  level 2  level 3  level 4  level 5
εa     0.6871   0.7660   0.6625   0.6297   0.5636       0.9849   0.9883   0.9080   0.9281   0.6920
εa∇    0.7100   0.9647   0.8000   0.6281   0.5803       0.9849   0.9887   0.9101   0.9288   0.7261
εp     0.9858   0.9954   0.9837   0.9912   0.9557       0.9984   0.9992   0.9956   0.9786   0.9710
εp∇    0.9863   0.9965   0.9867   0.9948   0.9663       0.9998   0.9997   0.9992   0.9974   0.9948
εpST   0.9858   0.9954   0.9835   0.9908   0.9556       0.9984   0.9992   0.9957   0.9783   0.9706
εpM    0.9997   0.9997   0.9995   0.9993   0.9992       0.9997   0.9995   0.9996   0.9993   0.9997
Similar to Leisten (1997) and Litvinchev and Rangel (1999), the numerical quality of the
error bounds was characterized by the upper and the lower bounds on z∗: z ≥ z∗ ≥ LBε,
where LBε = z − ε(W) and ε(W) is an error bound. We used the indicators (z − z∗)/z∗
and (z∗ − LBε)/z∗ to represent the proximity ratio. In particular, the smaller the proximity
ratio (z∗ − LBε)/z∗ is, the tighter is the error bound. The average values of these indicators,
together with standard deviations, are presented in Tables 4 and 7, sorted by localizations
and levels of clustering. In Table 5 we present the relative number of instances having W as
the best (bound minimizing) localization for a fixed type of error bound (in % of the total
number of instances). The localization Ws is not represented in Table 5 since it was never
the best. In Table 6 we give the relative number of instances having a particular combination
bound-localization as the best among all the error bounds calculated.
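The two proximity ratios can be made concrete with a small numeric sketch (the values of z, z∗ and ε(W) below are hypothetical, chosen only for illustration):

```python
# Hypothetical values illustrating the proximity ratios of Tables 4 and 7.
z_star = 100.0   # optimal value z* of the original (disaggregated) model
z = 101.6        # optimal value of the aggregated model, z >= z*
eps = 2.0        # an error bound eps(W), guaranteeing z - eps(W) <= z*

lb = z - eps     # LB_eps, a lower bound on z*

upper_ratio = (z - z_star) / z_star    # how far the aggregated value overshoots z*
lower_ratio = (z_star - lb) / z_star   # tightness of the error bound

print(f"(z - z*)/z*   = {upper_ratio:.4f}")  # 0.0160
print(f"(z* - LB)/z*  = {lower_ratio:.4f}")  # 0.0040: the smaller, the tighter
```

A bound with a smaller lower ratio pins z∗ into a narrower interval [LBε, z], which is exactly what "tighter" means in the tables.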
Note that by construction, for any fixed localization W we have εa(W) ≥ εa∇(W) and
εp(W) ≥ max{εpM(W), εpST(W), εp∇(W)}, since the improved error bounds εa∇(W), εpM(W),
εpST(W) and εp∇(W) are at least as good as the original error bounds εa(W) and εp(W), respec-
tively. To compare different localizations, observe that in Mendelssohn's (1980) approach
we look for the new duals in the direction of the dual aggregated solution, while the Shetty and
Taylor (1987) scheme uses the right-hand side vector of the original problem as the search
direction. In both cases the search direction is independent of the localization used. Hence,
εpM(Wbd) ≤ εpM(Wd) (and εpST(Wbd) ≤ εpST(Wd)), since Wbd ⊆ Wd and the objective functions used to
calculate εpM(Wbd) and εpM(Wd) are the same. This observation is also valid for the local-
izations Wbs and Ws. But this is not the case for the gradient-based improvement scheme
used to obtain the bound εp∇(W). Here the dual search direction is defined by the optimal
solution x(W) of the problem used to calculate the bound εp(W) and hence depends on the
localization used. Thus the objective functions used to calculate εp∇(Wbd) and εp∇(Wd) are not
Table 4 Proximity ratio (first row of each pair: average; second row: standard deviation)

                   SET1                                  SET2
(z − z∗)/z∗        1.6137                                1.0293
                   0.6463                                0.4953

(z∗ − LBε)/z∗      Wd      Ws       Wbd     Wbs          Wd      Ws       Wbd     Wbs
εa(W)              1.4870  7.1782   1.2440  5.4950       1.1410  10.9971  1.1377  3.4889
                   0.8922  5.9698   0.7905  4.5655       0.9503  13.1592  0.9481  1.7821
εa∇(W)             1.2335  4.4308   1.1452  2.9979       1.1226  4.5688   1.1219  2.3655
                   0.8105  3.4014   0.7532  1.5581       0.9316  1.7838   0.9318  0.6100
εp(W)              0.2707  9.2814   0.2076  5.3673       0.0495  6.0126   0.0493  1.9566
                   0.1610  7.7396   0.1349  4.4300       0.0352  6.8493   0.0351  1.8763
εp∇(W)             0.1684  3.5182   0.1228  2.1924       0.0140  1.1395   0.0138  0.9779
                   0.1095  4.4342   0.0991  1.6776       0.0115  2.7697   0.0114  0.7742
εpST(W)            0.2706  8.9535   0.2068  5.3447       0.0481  5.7335   0.0478  1.9549
                   0.1609  7.4195   0.1348  4.4138       0.0346  6.5401   0.0344  1.8762
εpM(W)             0.1043  0.6292   0.0595  0.5476       0.0125  0.6359   0.0109  0.4226
                   0.0323  0.1434   0.0169  0.1756       0.0047  0.1673   0.0039  0.2197
Table 5 Relative number of instances (in %) having W as the best localization

            SET1                       SET2
            Wbd     Wd      Wbs        Wbd     Wd      Wbs
εa(W)       98      0       2          100     0       0
εa∇(W)      89.33   5.33    5.33       77      23      0
εp(W)       100     0       0          100     0       0
εp∇(W)      98.67   1.33    0          94      6       0
εpST(W)     100     0       0          100     0       0
εpM(W)      100     0       0          100     0       0
the same, and we cannot guarantee that εp∇(Wbd) ≤ εp∇(Wd). The same holds for the localizations
Wbs and Ws, as well as for the a priori error bound εa∇(W).
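The monotonicity argument above rests on a simple fact: maximizing the same objective over a smaller set cannot give a larger value. A toy Python sketch, with a hypothetical objective and finite stand-ins for the localizations (neither is the paper's actual construction):

```python
def bound(objective, localization):
    """Error bound computed as the maximum of an objective over a localization."""
    return max(objective(u) for u in localization)

obj = lambda u: 3.0 * u - u * u     # one fixed objective for both localizations

W_d = [0.0, 0.5, 1.0, 1.5, 2.0]     # larger localization
W_bd = [0.0, 0.5, 1.0]              # W_bd is a subset of W_d (tighter localization)

# Same objective, nested sets: the tighter localization cannot give a weaker bound.
assert bound(obj, W_bd) <= bound(obj, W_d)

# For the gradient-based bound the objective itself depends on the localization,
# so no such ordering is guaranteed; with localization-dependent objectives the
# tighter localization may well yield the larger (weaker) value:
obj_d = lambda u: 1.0 * u           # hypothetical objective tied to W_d
obj_bd = lambda u: 3.0 * u          # hypothetical objective tied to W_bd
print(bound(obj_bd, W_bd), bound(obj_d, W_d))  # 3.0 2.0
```

This is why εpM(Wbd) ≤ εpM(Wd) and εpST(Wbd) ≤ εpST(Wd) hold by construction, while no analogous inequality is available for εp∇.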
It can be seen from Table 4 that the use of localizations Wd and Wbd gives the lowest
average proximity ratios for the a priori error bounds, and, according to the t-test run
to check the differences, there is no statistically significant difference between εa and εa∇
calculated using these localizations. However, for localizations Ws and Wbs, εa∇ gives a smaller
proximity ratio than εa. As expected, the a posteriori error bounds are much tighter than the
a priori error bounds, giving the best results for localizations Wd and Wbd. Comparing the
different types of error bounds, εpM and εp∇ are tighter than εp and εpST for any fixed
localization, with the best proximity ratio obtained for εpM(Wbd). The second best is εp∇(Wbd).
From Table 5 we can see that there are a few instances in SET1 for which the localization
Wbs results in tighter a priori error bounds than Wbd and Wd. For the a priori error bounds
calculated in SET2, and for all types of a posteriori error bounds, the tightest results were
obtained for localizations Wbd and Wd. For the a posteriori error bounds εp(W), εpST(W), and εpM(W)
the localization Wbd was the best for all instances. This is not the case for the gradient-based
error bounds εa∇(W) and εp∇(W), where for some instances the localization Wd provides better
results, especially in SET2. Comparing the best error bounds in Table 6, we see that for the
Table 6 Relative number of instances (in %) having a given combination (error bound, localization) as the best

             SET1     SET2
εa∇(Wbd)     89.33    77.00
εa∇(Wd)      5.33     23.00
εa∇(Wbs)     5.33     0
εpM(Wbd)     90.67    48.00
εp∇(Wbd)     9.33     49.00
εp∇(Wd)      0        3.00
Table 7 Proximity ratio by level of aggregation (localization Wbd); first row of each pair: average, second row: standard deviation

                 SET1                                              SET2
                 level 1  level 2  level 3  level 4  level 5       level 1  level 2  level 3  level 4  level 5
(z − z∗)/z∗      1.0332   1.7219   1.3672   1.6523   2.2939        0.7235   1.4655   0.3400   0.9615   1.6557
                 0.5283   0.5319   0.5210   0.4520   0.4527        0.1196   0.1189   0.0869   0.0966   0.1257
(z∗ − LBε)/z∗
εa               0.5469   0.4652   1.4389   1.9824   1.7865        0.0695   0.0588   1.2281   2.2021   2.1301
                 0.5043   0.4212   0.4980   0.4607   0.5297        0.0263   0.0207   0.0668   0.0667   0.0940
εa∇              0.5092   0.3213   1.2852   1.9223   1.6881        0.0695   0.0586   1.2237   2.1723   2.0854
                 0.4757   0.1522   0.3354   0.4587   0.5213        0.0263   0.0202   0.0666   0.0683   0.0895
εp               0.0975   0.1350   0.2040   0.2873   0.3141        0.0244   0.0339   0.0213   0.0669   0.1000
                 0.1061   0.0678   0.1067   0.0603   0.1621        0.0074   0.0053   0.0082   0.0228   0.0312
εp∇              0.0869   0.0927   0.1255   0.1468   0.1619        0.0061   0.0071   0.0074   0.0199   0.0286
                 0.1030   0.0579   0.0983   0.0464   0.1422        0.0026   0.0030   0.0035   0.0077   0.0130
εpST             0.0975   0.1348   0.2028   0.2851   0.3138        0.0244   0.0339   0.0199   0.0633   0.0977
                 0.1061   0.0679   0.1073   0.0614   0.1623        0.0074   0.0053   0.0082   0.0236   0.0314
εpM              0.0469   0.0583   0.0610   0.0653   0.0663        0.0110   0.0132   0.0072   0.0105   0.0127
                 0.0134   0.0123   0.0163   0.0170   0.0182        0.0029   0.0037   0.0025   0.0037   0.0040
relatively small problems in SET1 the error bound εpM(Wbd) outperforms the bound εp∇(Wbd)
for most, but not all, instances. In SET2, consisting of relatively large problems, the gradient-based
bound εp∇ provided the tightest results slightly more frequently than Mendelssohn's bound εpM.
The proximity ratios as a function of clustering are shown in Table 7 for localization Wbd.
Concerning the a priori error bounds, εa∇ is slightly tighter than εa. The error bounds do
not increase monotonically with the clustering level. The a priori error bounds for
levels 1 and 2 are significantly tighter than for the highly aggregated instances in levels 4 and
5. This holds for both SET1 and SET2. Comparing the a posteriori error bounds, εpM and εp∇
have the lowest proximity ratios, with εp∇ outperforming εpM only for levels 1 and 2 in SET2.
The bound εpM is very stable, while εp∇ increases with the clustering level.
To summarize:

– The a posteriori error bounds εpM and εp∇ are highly and statistically significantly correlated
  with the actual error and are the tightest error bounds for all instances and localizations.
– εpM is less sensitive than εp∇ to the level of clustering and problem dimension, and it is
  tighter than εp∇ for most instances (90%) in SET1, but for only 48% of the instances in the
  larger problems of SET2.
– Localization Wbd gives the tightest results for all but a few a posteriori error bounds. The
  exceptions occur for the error bound εp∇, where Wd performs better for 3% of the instances
  in SET2.
– For localizations Wbd and Wd the a priori error bounds εa∇ and εa are statistically signifi-
  cantly correlated with the actual error and with all the a posteriori error bounds.
– For all instances εa∇ provides the best results for localizations Wbd and Wd, except for
  5.33% of the instances in SET1, for which εa∇(Wbs) performs better.
4. Conclusions
In this paper we derived new a priori and a posteriori aggregation error bounds for the
GTP and studied the relationships between the error bounds and the actual error. Only
customers (destinations) were aggregated and the clustering level was the only element of
the aggregation strategy that was varied in the numerical study.
Our numerical study demonstrates that the localization Wbd almost always performs bet-
ter, resulting in higher correlations between the error bounds and the actual error, and giving
tighter error bounds. This was expected, since in our case the error is due to incomplete
(aggregated) information in the aggregated problem about particular demands. Thus, us-
ing all demand constraints explicitly, as in Wbd and Wd, may provide tighter estimates.
Moreover, Wbd ⊆ Wd, and strengthening the localization improves the bound. It can be seen
from the numerical results that strengthening the localization also increases the correlation
between the error bound and the actual error. The new a priori bound εa∇(Wbd) is signifi-
cantly correlated with the actual error (α = 0.05). That is, if the customers in the GTP are
aggregated using the strategy outlined in this study, and the prescribed localization is cho-
sen, the a priori error bound can be considered an appropriate guide for selecting
the clustering level. Constructing various localizations for structured problems, including
nonlinear (say, ellipsoidal) ones (Litvinchev and Rangel, 1999), is an interesting topic for
future research.
Comparing different techniques to calculate the a posteriori error bounds, surprisingly
good results were provided by Mendelssohn's (1980) approach. We would like to stress that
this is not the case for unstructured problems. In particular, for continuous knapsack prob-
lems the approach of Shetty and Taylor (1987) gives a tighter error bound than Mendelssohn's
(1980), as reported in Leisten (1997) and Litvinchev and Rangel (1999). The good
performance of Mendelssohn's (1980) bound for the GTP needs to be further investigated
and explained.
In summary, for our set of randomly generated problems, the aggregated model having
the tightest a priori error bound εa∇(Wbd) tends to give the best approximation of the optimal
objective for the non-aggregated model. After the aggregated model (the level of clustering)
was selected by the a priori error bound and solved, Mendelssohn's (1980) a posteriori error
bound εpM, as well as the gradient-based error bound εp∇, provides a good quantitative measure
of the actual error.
Acknowledgments The first author was partially supported by the Russian foundation RFBR (02-01-00638) and the Mexican foundation PAICyT (CA857-04), while the second author was partially funded by the Brazilian foundations FAPESP (2000/09971-9) and FUNDUNESP (00509/03-DFP). The authors thank the anonymous referees for a careful reading of the article and helpful suggestions. Thanks are also due to Adriana B. Santos for useful comments related to the Numerical Study section.
References
Ahuja, R.K., T.L. Magnanti, and J.B. Orlin. (1993). Network Flows: Theory, Algorithms and Applications. Prentice Hall, NY.
Evans, J.R. (1979). "Aggregation in the Generalized Transportation Problem." Computers & Operations Research, 6, 199–204.
Francis, R.L., T.J. Lowe, and A. Tamir. (2002). "Demand Point Aggregation for Location Models." In Drezner Z. and Hamacher H.W. (eds.), Facility Location: Applications and Theory, Springer, pp. 207–232.
Glover, F., D. Klingman, and N. Phillips. (1990). "Netform Modeling and Applications." Interfaces, 20(4), 7–27.
Hallefjord, A., K.O. Jornsten, and P. Varbrand. (1993). "Solving Large Scale Generalized Assignment Problems—An Aggregation/Disaggregation Approach." European Journal of Operational Research, 64, 103–114.
Klingman, D., A. Napier, and J. Stutz. (1974). "NETGEN: A Program for Generating Large Scale Capacitated Assignment, Transportation, and Minimum Cost Flow Problems." Management Science, 20, 814–821.
Leisten, R.A. (1997). "A Posteriori Error Bounds in Linear Programming Aggregation." Computers & Operations Research, 24, 1–16.
Litvinchev, I. and S. Rangel. (2002). "Comparing Aggregated Generalized Transportation Models Using Error Bounds." Technical Report TR2002-02-OC, UNESP, S.J. Rio Preto, Brazil. (Downloadable from http://www.dcce.ibilce.unesp.br/~socorro/aggregation/gtp 2002/).
Litvinchev, I. and S. Rangel. (1999). "Localization of the Optimal Solution and a Posteriori Bounds for Aggregation." Computers & Operations Research, 26, 967–988.
Litvinchev, I. and V. Tsurkov. (2003). Aggregation in Large Scale Optimization. Applied Optimization, 83, Kluwer, NY.
Mendelsohn, R. (1980). "Improved Bounds for Aggregated Linear Programs." Operations Research, 28, 1450–1453.
Montgomery, D.C. (2001). Design and Analysis of Experiments, 5th edition, John Wiley & Sons, Inc.
Norman, S.K., D.F. Rogers, and M.S. Levy. (1999). "Error Bound Comparisons for Aggregation/Disaggregation Techniques Applied to the Transportation Problem." Computers & Operations Research, 26, 1003–1014.
Rogers, D.F., R.D. Plante, R.T. Wong, and J.R. Evans. (1991). "Aggregation and Disaggregation Techniques and Methodology in Optimization." Operations Research, 39, 553–582.
Shetty, C.M. and R.W. Taylor. (1987). "Solving Large-Scale Linear Programs by Aggregation." Computers & Operations Research, 14, 385–393.
Zipkin, P.H. (1980a). "Bounds for Aggregating Nodes in Network Problems." Mathematical Programming, 19, 155–177.
Zipkin, P.H. (1980b). "Bounds on the Effect of Aggregating Variables in Linear Programs." Operations Research, 28, 403–418.