CP-based Local Branching

Zeynep Kiziltan1, Andrea Lodi2, Michela Milano2, and Fabio Parisini2

1 Department of Computer Science, University of Bologna, Italy. [email protected]

2 D.E.I.S., University of Bologna, Italy. {alodi,mmilano}@deis.unibo.it, [email protected]

Abstract. We propose the integration and extension of the local branching search strategy in Constraint Programming (CP). Local branching is a general purpose heuristic method which searches locally around the best known solution by employing tree search. It has been successfully used in MIP, where local branching constraints are used to model the neighborhood of an incumbent solution and improve the bound. The integration of local branching in CP is not simply a matter of implementation, but requires a number of significant extensions (concerning the computation of the bound, cost-based filtering of the branching constraints, diversification, variable neighbourhood width and search heuristics) and can greatly benefit from the CP environment. In this paper, we discuss how such extensions are possible and provide some experimental results to demonstrate the practical value of local branching in CP.

1 Introduction

Local branching is a successful search strategy in Mixed-Integer Programming (MIP) [7]. It is a general purpose heuristic method which searches locally around the best known solution by employing tree search. Local branching has similarities with limited discrepancy search [5] and local search. The neighbourhoods are obtained by linear inequalities in the MIP model, so that MIP searches for the optimal solution within a given Hamming distance of the incumbent solution. The linear constraints representing the neighbourhood of incumbent solutions are called local branching constraints and are involved in the computation of the problem bound. Local branching is a general framework to explore suitable solution subspaces effectively, making use of state-of-the-art MIP solvers. Even though the framework aims to improve the heuristic behaviour of MIP solvers, it offers a complete method which can be integrated in any tree-based search.

In this paper, we propose the integration and extension of the local branching framework in Constraint Programming (CP). The main motivation is to combine the power of constraint propagation and problem bounds with a local search method which is potentially complete and applicable to any optimisation problem. We expect the following advantages over a pure CP-based or local search for solving difficult optimisation problems: (1) to obtain better solutions within a time limit; (2) to improve solution quality quicker; (3) to prove optimality; (4) to prove optimality quicker.

Integrating local branching in CP is not simply a matter of implementation. It requires a number of substantial modifications to the original search strategy, as we discuss in the paper. First, using a linear programming solver for computing the bound of each neighborhood is not computationally affordable in CP. We have therefore studied a lighter way to compute the bound of the neighborhood which is efficient, effective and incremental, using the additive bounding technique. Second, we have developed a cost-based filtering algorithm for the local branching constraint by extracting valid reduced-costs out of the additive bounding procedure. Third, we have studied several techniques to diversify the search and analyzed different ways to enlarge the neighborhood when the neighborhood exploration is no longer effective. Finally, we have designed general search heuristics applicable in the context of local branching. This last point will not be discussed in detail in the paper for reasons of space. Our experimental results demonstrate the practical value of integrating local branching in CP.

2 Formal Background

A Constraint Satisfaction Problem (CSP) has a set of variables X = [X1, ..., Xn], each with a finite domain D(Xi) of values, and a set of constraints specifying allowed combinations of values for subsets of variables. A solution is an assignment of X satisfying the constraints. Constraint programming (CP) typically solves a CSP by exploring the space of partial assignments using a backtrack tree search, enforcing a local consistency property using either specialised or general purpose propagation algorithms. A CSP is a constraint optimisation problem when we are interested in the optimal solution. Such a problem has a cost variable C defining the cost of the solution. In the rest, we focus on minimisation problems with C = Σi∈{1,...,n} C(i, Xi), where C(i, v) is the cost of assigning v to Xi (cost-matrix).

In optimisation problems, even if a value in a domain can participate in a feasible solution, it can be pruned if it cannot be part of a better solution (cost-based filtering [9]). CP makes use of branch-and-bound search for finding the optimal solution, hence computing tight lower bounds for C is essential to speed up search. We can exploit the additive bounding procedure [8] for computing tight lower bounds. This technique works using reduced-costs, which are computed as a result of the solution of a linear program. A reduced-cost is associated with a variable-value pair and represents the additional cost to pay on top of the optimal solution value when that assignment becomes part of the solution.
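As a concrete illustration of cost-based filtering, the following sketch (ours, not the paper's propagator; the function name and data layout are hypothetical) prunes a value whenever the relaxation bound plus its reduced-cost cannot beat the incumbent:

```python
# Sketch of reduced-cost (cost-based) filtering. `lb` is the bound of the
# relaxation, `rc[i][j]` the reduced-cost of assigning value j to X_i,
# and `best_cost` the cost of the incumbent solution.
def cost_based_filter(domains, rc, lb, best_cost):
    """Keep value j in D(X_i) only if lb + rc[i][j] < best_cost."""
    return [{j for j in dom if lb + rc[i][j] < best_cost}
            for i, dom in enumerate(domains)]

# Toy instance: relaxation bound 10, incumbent of cost 13.
domains = [{0, 1, 2}, {0, 1, 2}]
rc = [[0, 2, 5],    # reduced-costs for X_0
      [4, 0, 1]]    # reduced-costs for X_1
print(cost_based_filter(domains, rc, lb=10, best_cost=13))
# value 2 is pruned from D(X_0) (10 + 5 >= 13) and value 0 from D(X_1)
```

The same test is what a reduced-cost propagator applies after every bound update, only incrementally rather than from scratch.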

3 Local Branching Search Strategy

3.1 Local Branching in MIP

Local branching is a successful search method proposed for MIP [7]. The idea is that, given a reference solution to an optimisation problem, its neighbourhood is searched with the hope of improving the solution quality, using tree search. After

[Figure 1 depicts a binary search tree with nodes numbered 1-7. At each branching node, the left branch posts a local branching constraint ∆(X, X̄i) ≤ k and the right branch posts its reversal ∆(X, X̄i) ≥ k + 1, for the successive reference solutions X̄1, X̄2, X̄3. The left subtrees (marked T) yield the improved solutions X̄2 and X̄3; the last left subtree contains no improved solution.]

Fig. 1. The basic local branching framework.

the optimal solution in the neighbourhood is found, the incumbent solution is updated to this new solution and the search continues from this better solution. The basic machinery of the framework is depicted in Figure 1.

Assume we have a tree search method for solving an optimisation problem P whose constrained variables are X. For a given positive integer parameter k, the k-OPT neighbourhood N(X̄, k) is the set of feasible solutions of P satisfying the additional local branching constraint ∆(X, X̄) ≤ k. This constraint defines the neighbourhood of a reference solution X̄ as the space of assignments which differ from X̄ in at most k values. The neighbourhood is thus measured by the Hamming distance w.r.t. X̄ and is used as a branching criterion within the solving method of P. Given X̄, the solution space associated with the current branching node is partitioned by means of the disjunction ∆(X, X̄) ≤ k ∨ ∆(X, X̄) ≥ k + 1. In this way, the whole search space is divided into two, and thus exploring each part exhaustively guarantees completeness. The neighbourhood structure is general-purpose in the sense that it is independent of the problem being solved.

In Figure 1, each node of the resulting search tree is labelled with a number. The triangles marked by the letter T correspond to the branching subtrees to be explored. Each subtree is generated using the conjunction of the branching constraints shown on the branches of the tree.

Assume we have a starting incumbent solution X̄1 at the root node 1. The left branch node 2 is created via the branching constraint ∆(X, X̄1) ≤ k. It corresponds to the optimization within the k-OPT neighbourhood N(X̄1, k), which yields the optimal solution X̄2. This solution now becomes the new incumbent solution. The scheme is then re-applied to the right branch node 3. Note that, since we completely explored N(X̄1, k), we create node 3 after reversing the initial branching constraint to ∆(X, X̄1) ≥ k + 1. Now we create the left branch node 4 via ∆(X, X̄2) ≤ k, which corresponds to the optimization within N(X̄2, k) − N(X̄1, k). The new optimization gives the new incumbent solution X̄3. Node 6 is then addressed, which corresponds to the initial problem P amended by the three constraints ∆(X, X̄1) ≥ k + 1, ∆(X, X̄2) ≥ k + 1 and ∆(X, X̄3) ≤ k. In the example, node 6 produces a subproblem which contains no improving solution. In this case, we can stop so as to behave in a hill-climbing fashion. Alternatively, we can add ∆(X, X̄3) ≥ k + 1 and generate the right branch node 7, which is then explored using the same search method as for the subtrees. This makes local branching a complete search strategy.
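The branching scheme above can be sketched in a few lines. This is an illustrative toy, not the paper's CP implementation: `solve_subtree` is a brute-force stand-in for a complete tree search over the neighbourhood, and all names are ours.

```python
from itertools import product

def delta(x, x_ref):
    """Hamming distance between two assignments."""
    return sum(a != b for a, b in zip(x, x_ref))

def solve_subtree(domains, cost, feasible, constraints):
    """Brute-force stand-in for a complete tree search: best feasible
    assignment satisfying all posted branching constraints."""
    best = None
    for x in product(*domains):
        if feasible(x) and all(c(x) for c in constraints):
            if best is None or cost(x) < cost(best):
                best = x
    return best

def local_branching(domains, cost, feasible, x1, k):
    """Search N(x_ref, k) for an improving solution, make it the new
    incumbent, reverse the constraint to delta >= k + 1, and repeat."""
    incumbent, posted = x1, []
    while True:
        x_ref = incumbent
        left = posted + [lambda x, r=x_ref: delta(x, r) <= k,
                         lambda x, c=cost(incumbent): cost(x) < c]
        improved = solve_subtree(domains, cost, feasible, left)
        if improved is None:
            return incumbent   # hill-climbing stop at the last left node
        incumbent = improved
        posted.append(lambda x, r=x_ref: delta(x, r) >= k + 1)

# Minimise the sum of three 0..2 variables, starting from (2, 2, 2).
print(local_branching([[0, 1, 2]] * 3, cost=sum,
                      feasible=lambda x: True, x1=(2, 2, 2), k=1))
# reaches (0, 0, 0) through a sequence of 1-OPT improvements
```

Note that this sketch stops in hill-climbing fashion; the complete strategy would additionally explore the final right branch.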

In [7], branch-and-bound MIP search is used as the tree search method, and thus to search the subtrees. The neighbourhoods are obtained through linear inequalities in the MIP model. The use of linear constraints in the MIP model clearly provides a tighter bound w.r.t. the one of the original problem.

3.2 CP-based Local Branching

The local branching framework is not specific to MIP. It can be integrated in any tree search. We therefore suggest the use of general-purpose neighbourhood structures within a tree search with constraint propagation.

Like in MIP, neighbourhoods can be constructed via additional constraints in the original model, and such constraints can be used both to tighten the problem bound and for filtering purposes. Since this is a new constraint problem, CP can directly be used to find the initial feasible solution and search the neighbourhood, exploiting the power of constraint propagation. Thus, complex constraints can be handled and infeasible solutions in the neighbourhoods can be detected effectively by constraint propagation, which are difficult tasks for a pure local search algorithm. Moreover, solution quality improves quickly by local search around the best found solution, which we miss in the classical CP framework. Furthermore, the use of tree search for performing local search guarantees completeness, unlike classical local search algorithms. Consequently, we expect the following advantages over a pure CP-based or local search: (1) to obtain better solutions within a time limit; (2) to improve solution quality quicker; (3) to prove optimality; (4) to prove optimality quicker. And finally, the technique is general and applicable to any optimisation problem modelled with constraints, in contrast to many local search techniques designed only for specific problems.

We stress that we do not propose a mere implementation of local branching in CP. We significantly change and extend the original framework so as to accommodate it in the CP machinery smoothly and successfully. The following summarises our contributions.

Bound of the neighborhood Using a linear programming solver for computing the bound of each neighborhood is not computationally affordable in CP. We have therefore studied a lighter way to compute the bound of the neighborhood which is efficient, effective and incremental.

Cost-based filtering Cost-based filtering can be applied when reduced-costs are available. Its use in CP is more aggressive than in MIP, where it is mainly applied during preprocessing (variable-fixing). We show how we can extract valid reduced-costs from the additive bounding procedure.

Restarting In [7], diversification techniques are applied when, for instance, the subtree rooted at the left node is proved to contain no improving solutions (node 6 in Figure 1), instead of simply reversing the last local branching constraint and exploring the rest of the tree (node 7) using the same search method as for the subtrees. We propose a new diversification scheme which finds a new feasible solution at node 7 by keeping all the reversed branching constraints, and restarts local branching from the new incumbent solution using the regular local branching framework.

Discrepancy The parameter k is essential to the success of the method as it defines the size of the neighborhood. We propose two uses of k: (1) the traditional one, with k being fixed for each neighborhood; (2) the variable one, with k ranging between k1 and k2. In this way, the neighborhood defined by k1 is searched first. While no improving solution is found, the neighborhood is enlarged up until the k2-discrepancy subtree is completely explored. In addition, the neighborhoods are visited in slices. Instead of posting the constraint ∆(X, X̄) ≤ k, we impose k constraints of the form ∆(X, X̄) = j with j = 1 . . . k. We show why this is important for the computation of the bound.

Heuristics As search proceeds in local branching, we gather a set of incumbent solutions X̄1, ..., X̄m which are feasible and which get closer to the optimal solution of the problem. We can exploit these incumbent solutions as a guide to search the subtrees. Being independent of the specific problem solved, such heuristics provide us with a general method suitable for this search strategy. We have developed a number of useful heuristics; however, we will not discuss them here for reasons of space.
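A small sanity check of the slicing idea from the Discrepancy item (our sketch, with hypothetical names): the slices ∆(X, X̄) = j for j = 1, ..., k are pairwise disjoint and their union is exactly the neighbourhood ∆(X, X̄) ≤ k minus the reference solution itself (∆ = 0), so visiting them in turn loses nothing.

```python
from itertools import product

def delta(x, x_ref):
    """Hamming distance between two assignments."""
    return sum(a != b for a, b in zip(x, x_ref))

domains, x_ref, k = [[0, 1]] * 4, (0, 0, 0, 0), 2

# The neighbourhood, excluding the reference solution itself (delta = 0) ...
whole = {x for x in product(*domains) if 1 <= delta(x, x_ref) <= k}
# ... and its slices delta = j for j = 1..k.
slices = [{x for x in product(*domains) if delta(x, x_ref) == j}
          for j in range(1, k + 1)]

assert whole == set().union(*slices)              # the union is the whole
assert sum(len(s) for s in slices) == len(whole)  # slices are disjoint
print(len(whole), [len(s) for s in slices])       # 10 [4, 6]
```

The benefit of the slicing for the bound computation is a separate point, discussed in Section 4.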

4 Additive Bounding for Local Branching

To integrate local branching in CP, we need a local branching constraint ∆(X, X̄) ≤ k, where X̄ is an assignment of X representing the current reference solution and k is an integer value. The constraint holds iff the sought assignment of X has at most k different values than X̄; that is, it enforces that the Hamming distance between X and X̄ is at most k. The CP model we consider is:

min Σi∈N CiXi (1)
subject to AnySide_cst(X) (2)
∆(X, X̄) ≤ k (3)
Xi ∈ D(Xi) ∀i ∈ N (4)

where CiXi is the cost of assigning variable Xi its value, and AnySide_cst(X) is any set of constraints on the domain variables X. The purpose of this section is to exploit the delta constraint (3), which implicitly represents the structure of the explored tree, to tighten the problem bound and to perform cost-based filtering [9]. To this purpose, we have developed a local branching constraint lb_cst(X, X̄, k, C) which combines the ∆ constraint and the cost function C = min Σi∈N CiXi. Using this constraint alone provides very poor problem bounds, and thus poor filtering. If instead we recognize in the problem a combinatorial relaxation Rel which can be solved in polynomial time and provides a bound LBRel and a set of reduced-costs c̄, we can feed the local branching constraint with c̄ and obtain an improved bound in an additive bounding fashion.

Additive bounding is an effective procedure for computing bounds for optimisation problems [8]. It consists in solving a sequence of relaxations of P, each producing an improved bound. Assume we have a set of bounding procedures B1, ..., Bm. We write Bi(c) for the i-th bounding procedure when applied to an instance of P with cost matrix c. Each Bi returns a lower bound LBi and a reduced-cost matrix c̄. This cost matrix is used by the next bounding procedure, Bi+1(c̄). The sum of the bounds Σi∈{1,...,m} LBi is a valid lower bound for P. An example of the relaxation Rel is the Assignment Problem, if the side constraints contain an alldifferent [9]. Another example is a Network Flow Problem, if the side constraints contain a global cardinality constraint. Clearly, a tighter bound can always be found by feeding an LP solver with the linear relaxation of the whole problem, including the local branching constraints, as done in [7]. We have experimentally noticed that this relaxation is too expensive in computational time to be used in a CP setting.
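To make the additive scheme concrete, here is a deliberately simple instance (ours, not the AP/LP relaxations used in the paper): for an assignment problem, B1 takes the sum of row minima and B2 the sum of column minima of the resulting reduced-cost matrix. Each procedure consumes the previous reduced-costs, and the bounds sum to a valid lower bound.

```python
def row_bound(c):
    """B1: the sum of row minima is a valid assignment-problem bound;
    subtracting each row's minimum leaves the reduced-cost matrix."""
    lb = sum(min(row) for row in c)
    rc = [[v - min(row) for v in row] for row in c]
    return lb, rc

def col_bound(c):
    """B2: the same idea on columns, applied to B1's reduced-costs."""
    n = len(c)
    mins = [min(c[i][j] for i in range(n)) for j in range(n)]
    rc = [[c[i][j] - mins[j] for j in range(n)] for i in range(n)]
    return sum(mins), rc

cost = [[4, 2, 8],
        [4, 3, 7],
        [3, 1, 6]]
lb1, rc1 = row_bound(cost)   # lb1 = 2 + 3 + 1 = 6
lb2, rc2 = col_bound(rc1)    # lb2 = 1 + 0 + 4 = 5
print(lb1 + lb2)             # 11, a valid lower bound (the optimum is 12)
```

Crucially, B2 is applied to B1's reduced-cost matrix, not to the original costs; that is what makes the sum of the two bounds valid.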

To explain how additive bounding can improve the bound obtained by the local branching constraints, we use the usual mapping between CP and 0-1 Integer Programming (IP) models. Note that this model is useful to understand the structure of the problems we are considering, but it is not solved by an IP solver. We devise a special purpose linear time algorithm, as we explain next. We define a directed graph G = (N, A) where arc (i, j) ∈ A iff j ∈ D(Xi), where D(Xi) ⊆ N and N = {1, . . . , n}. The corresponding IP model contains binary variables xij (the IP equivalent of the Xi variables), with x̄ the IP counterpart of the reference solution X̄. The additive bounding procedure uses two relaxations, corresponding to Rel and to Loc_Branch, which considers the lb_cst(X, X̄, k, C). The solution of Rel produces the optimal solution X∗, with value LBRel and a reduced-cost matrix c̄. We can feed the second relaxation Loc_Branch with the reduced-cost matrix c̄ from Rel and we obtain:

LB_Loc_Branch = min Σi∈N Σj∈N c̄ij xij (5)

s.t. Σ(i,j)∈S (1 − xij) ≤ k, S = {(i, j) | x̄ij = 1} (6)

xij ∈ {0, 1}, ∀(i, j) ∈ A (7)

To have a tighter bound, we divide this problem into k problems corresponding to each discrepancy from 1 to k, transforming constraint (6) into Σ(i,j)∈S (1 − xij) = d with d = 1, . . . , k. The optimal solution of each of these subproblems can be computed as follows. We start from X̄ = [X̄1, ..., X̄n] and then extract and sort in non-decreasing order the corresponding reduced-costs c̄sorted = sort(c̄1X̄1, ..., c̄nX̄n). The new bound LB_Loc_Branch is the sum of the first n − d smallest reduced-costs from c̄sorted. Overall, a valid bound for the problem is LB = LB_Loc_Branch + LBRel. The use of the additive bounding here is different from its use in [11], since we are starting from a reference solution X̄ which is not the optimal solution X∗ computed by the first relaxation.

As an example, suppose we have variables [X1, ..., X5], each with the domain {1, 2, 3, 4, 5}. Assume the AP solution is [1, 2, 3, 4, 5] with the optimal value LB1 and with the reduced-cost matrix c̄ in which c̄ii = 0 (corresponding to the optimal solution) and c̄ij = j for all j ≠ i, for all i ∈ {1, 2, 3, 4, 5}. Consider an initial assignment X̄ = [2, 5, 4, 1, 3] and d = 3, which means the new solution must have 3 different assignments with respect to X̄. Changing 3 variables means keeping 2 variable assignments the same as in X̄. We first extract the reduced-costs of the values in this assignment, which are [2, 5, 4, 1, 3]. These values tell us how much extra cost we would pay if we assigned Xi the value in X̄i. Then we sort this array: c̄sorted = [1, 2, 3, 4, 5]. Since we need to keep 2 variable assignments of X̄ the same, we choose those that give us the minimum increase in cost. So, we take the first 2 reduced-costs in c̄sorted and add their sum (3) to LB1 to obtain a valid lower bound. Note that the remaining 3 variables are assumed to take values of 0 reduced-cost; hence, the changes do not contribute to the computation of the bound.
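The slice bound therefore reduces to sorting the reduced-costs of the reference assignment and summing the n − d smallest; the sketch below (ours) reproduces the worked example.

```python
def loc_branch_bound(rc_ref, n, d):
    """Bound of the slice with exactly d changes: keep the n - d
    reference assignments with the smallest reduced-costs; the d changed
    variables are assumed to move to 0 reduced-cost values."""
    return sum(sorted(rc_ref)[: n - d])

# Worked example from the text: reduced-costs of the reference
# assignment are [2, 5, 4, 1, 3]; with d = 3 we keep the 2 cheapest,
# contributing 1 + 2 = 3 on top of LB1.
print(loc_branch_bound([2, 5, 4, 1, 3], n=5, d=3))   # 3
```

The whole computation is a sort plus a partial sum, which is what makes this bound so much cheaper than solving an LP per neighbourhood.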

4.1 Cost-based Filtering

After computing the lower bound LB in an additive way as LB = LB1 + LB2, we need an associated reduced-cost matrix so as to apply cost-based filtering [9]. To do this, we have to compute the reduced-cost matrix of the problem:

LB2 = min Σ(i,j)∈S c̄ij xij (8)

Σ(i,j)∈S xij = n − d (9)

xij ∈ [0, 1], ∀(i, j) ∈ S (10)

This problem is a simpler version of problem (5)-(7)³, obtained by removing the variables xij, ∀(i, j) ∉ S, and relaxing the integrality requirement on the remaining variables. It is easy to see that the two problems are indeed equivalent. First, the integrality conditions are redundant, as shown by the algorithm presented in Section 4 to solve the problem. Second, since the variables xij, (i, j) ∉ S do not contribute to satisfying constraint (9), they can be dropped to 0 because their reduced-costs are non-negative. The dual of such a linear program is as follows:

LB2 = max (n − d)π0 + Σ(i,j)∈S πij (11)

π0 + πij ≤ c̄ij, ∀(i, j) ∈ S (12)
πij ≤ 0, ∀(i, j) ∈ S (13)

³ More precisely, it is a simpler version of its d-th subproblem, in which constraint (6) is written in equality form.

Let us partition the set S into Smin, containing the n − d pairs (i, j) having the smallest reduced-costs, and Smax, containing the remaining d pairs. We also define c̄max = max(i,j)∈Smin c̄ij. Then, it is not difficult to prove the following result.

Theorem 1. The dual solution

– π0 = c̄max
– πij = 0, ∀(i, j) ∈ Smax
– πij = c̄ij − π0, ∀(i, j) ∈ Smin

is optimal for system (11)-(13).

Proof. The above solution is trivially feasible since, for (i, j) ∈ Smax, all reduced-costs are by definition greater than or equal to the largest one in Smin, and for (i, j) ∈ Smin we have an identity. Thus, both (12) and (13) are satisfied. Moreover, the solution is also optimal. Indeed, the cardinality of Smin is n − d and the only dual variables contributing to the objective function are the ones in Smin (plus of course π0); thus the first term cancels and the value becomes Σ(i,j)∈Smin c̄ij. But such a value is also the one of system (8)-(10), because the variables in Smin are the ones set to 1. □

We can now construct the reduced-cost matrix c̄′ associated with the optimal solution of system (8)-(10):

– c̄′ij = 0, ∀(i, j) ∈ Smin
– c̄′ij = c̄ij − c̄max, ∀(i, j) ∈ Smax

Finally, the reduced-costs of the variables xij, (i, j) ∉ S do not change, i.e., c̄′ij = c̄ij.
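The dual solution of Theorem 1 can be checked mechanically on the running example. The sketch below (ours; names hypothetical) builds Smin, Smax and π, verifies feasibility of (12)-(13), and confirms that the dual objective matches the primal slice bound.

```python
def dual_solution(rc_S, n, d):
    """Theorem 1's dual solution. rc_S maps each pair (i, j) in S to its
    reduced-cost c_ij."""
    pairs = sorted(rc_S, key=rc_S.get)        # by increasing reduced-cost
    s_min, s_max = pairs[: n - d], pairs[n - d:]
    pi0 = max(rc_S[p] for p in s_min)         # c_max
    pi = {p: (0 if p in s_max else rc_S[p] - pi0) for p in rc_S}
    return pi0, pi, s_min, s_max

# Running example: reference assignment [2, 5, 4, 1, 3] with
# reduced-costs [2, 5, 4, 1, 3], n = 5, d = 3.
rc_S = {(1, 2): 2, (2, 5): 5, (3, 4): 4, (4, 1): 1, (5, 3): 3}
n, d = 5, 3
pi0, pi, s_min, s_max = dual_solution(rc_S, n, d)

# Feasibility of (12)-(13) ...
assert all(pi0 + pi[p] <= rc_S[p] and pi[p] <= 0 for p in rc_S)
# ... and optimality: the dual objective equals the primal slice bound.
dual_obj = (n - d) * pi0 + sum(pi.values())
assert dual_obj == sum(rc_S[p] for p in s_min)
print(dual_obj)   # 3
```

From pi0 and pi one obtains exactly the new reduced-cost matrix described above: 0 on Smin and c̄ij − c̄max on Smax.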

4.2 Incremental Computation of the Bound

While exploring the subtrees, decisions are taken on variable instantiation; as a consequence, constraints are propagated and some domains are pruned. We can therefore update the problem lower bound by taking into account the most recent situation of the domains. The challenge is to do this in an incremental way, without incurring much extra overhead. The first bound LB1 can be updated in the traditional way: each time a value belonging to the optimal AP solution is removed, the AP is recomputed with an O(n²) algorithm (which consists in a single augmenting path) [9]. Using the new AP solution, we can update the second bound LB2 in a simple way by using some special data structures.

Consider the sets Smin and Smax defined previously. More precisely, we consider a list Smax initially containing the d ordered pairs (i, j) corresponding to the variable-value assignments with the d greatest reduced-costs from c̄sorted, and Smin containing the n − d ordered pairs (i, j) corresponding to the n − d smallest reduced-costs from c̄sorted. Whilst Smin contains the assignments (variable index - value) that should remain the same w.r.t. X̄ since they have the smallest reduced-costs, Smax contains the assignments that should change w.r.t. X̄ and that conceptually assume a value corresponding to a 0 reduced-cost. Note that initially there are n pairs, whose first index goes from 1 to n, in Smin ∪ Smax.

Assignment of j to Xi We distinguish four cases:

(i, j) ∈ Smax: A variable that was supposed to change w.r.t. X̄ is instead assigned the value in X̄. We should update Smin and Smax, maintaining them ordered, as well as LB2: 1) remove (i, j) from Smax; 2) remove (h, k) = max(m,n)∈Smin c̄mn from Smin and add it, ordered, to Smax; 3) LB2 = LB2 + c̄ij − c̄hk.

(i, k) ∈ Smax with k ≠ j: A variable that was supposed to change w.r.t. X̄ indeed changes. In the bound, this variable was assumed to take a value corresponding to a 0 reduced-cost, while now it takes the value j, whose reduced-cost c̄ij may or may not be 0. We should update LB2: LB2 = LB2 + c̄ij.

(i, j) ∈ Smin: No changes are necessary, because a variable that was supposed to remain the same w.r.t. X̄ remains the same.

(i, k) ∈ Smin with k ≠ j: A variable that was supposed to remain the same w.r.t. X̄ instead changes. We should update Smin and Smax as well as LB2: 1) remove (h, k) = min(m,n)∈Smax c̄mn from Smax and insert it in the last position of Smin; 2) remove (i, k) from Smin; 3) LB2 = LB2 − c̄ij + c̄hk.

Removal of j from D(Xi) The only important case is when (i, j) ∈ Smin, where a variable that was supposed to remain the same w.r.t. X̄, and that contributed to the computation of the bound, changes. We assume Xi changes to a value corresponding to the smallest reduced-cost (possibly 0) whose index is k, and then update Smin and Smax as well as LB2: 1) remove (h, k) = min(m,n)∈Smax c̄mn from Smax and insert it in the last position of Smin; 2) remove (i, j) from Smin; 3) LB2 = LB2 + c̄hk − c̄ij. That is, the number of variables which must change is decreased from d to d − 1.

5 Diversification and Intensification

The heuristic behaviour of local branching can be enhanced by exploiting the well known diversification techniques of metaheuristics. In [7], diversification is applied when, for instance, the current left node is proved to contain no improving solutions. This arises at node 6 in Figure 1. Instead of simply reversing the last local branching constraint and switching to the exploration of node 7 using CP, we propose to restart local branching as follows. At node 7, we keep all the branching constraints ∆(X, X̄1) ≥ k + 1, ∆(X, X̄2) ≥ k + 1 and ∆(X, X̄3) ≥ k + 1 that we would have in the regular framework. We now find a new feasible solution X̄′1, which is enforced to be better than X̄3. Then, we restart local branching from the new incumbent solution X̄′1 using the regular framework. As X̄′1 has a better cost than X̄1 and the search space is now reduced by the three reversed branching constraints, we expect local branching to be closer to finding the optimal solution. Restarting is a good idea if switching to CP at node 7 would not give us any benefit of local branching; that is, it might be too early to finish local branching if the optimal solution is still difficult to find. But the reverse could also happen: restarting could be a waste of time if CP is in a position to prove optimality quickly. We therefore propose conditional restarting. We could for instance impose a search limit of 1000 fails on node 7 as a measure of "hardness of problem solving". If within this limit optimality is not proven, then we switch to the restarting scheme explained above. This balances the benefits and the effort of applying local branching.

Alternatively, we could diversify the search in the spirit of variable neighbourhood search [14], by restarting local branching without any upper bound constraint. The new feasible solution X̄′1 could have a worse cost than the incumbent solution X̄3 (and even worse than X̄1). The expected gain is that X̄′1 could be closer to the optimal solution in terms of Hamming distance.

Another approach to enhance the heuristic behaviour is to intensify search by enlarging the neighbourhood. Again consider node 6 in Figure 1. After it is proved to contain no improving solutions, we could consider searching the neighbourhood N(X̄, k + 1) by replacing the branching constraint ∆(X, X̄3) ≤ k with ∆(X, X̄3) = k + 1. If an improving solution X̄4 is now found, we continue local branching as before, by reversing the current branching constraint to ∆(X, X̄3) ≥ k + 2. Otherwise, we can try enlarging the neighbourhood further until some N(X̄, k + p) is completely explored.

The diversification and intensification methods could be hybridised. For instance, we could interleave enlarging the neighbourhood with restarting at node 6 as follows. First, enlarge the neighbourhood by replacing ∆(X, X̄3) ≤ k with ∆(X, X̄3) = k + 1. If this new neighbourhood also contains no improving solutions, then reverse the branching constraint to ∆(X, X̄3) ≥ k + 2, find a new feasible solution X̄′1 (either with or without the upper bound constraint) and restart local branching from this new incumbent solution. Note that these diversification and intensification techniques are not considered in [7].

6 Experimental Results

In this section, we provide some experimental results to support the expected benefits of local branching in CP as stated in Section 3.2. Experiments are conducted using ILOG Solver 6.0 and ILOG Scheduler 6.0 on a 2 GHz Pentium IV processor running Windows XP with 512 MB RAM.

6.1 Problem Domain

Given a set of cities and a cost associated with travelling from one city to another, the TSP is to find a path such that each city is visited exactly once and the total cost of travelling is minimised. In the Asymmetric TSP (ATSP), the cost from a city i to another city j may not be the same as the cost from j to i. TSPTW is a time-constrained variant of the TSP in which each city has an associated visiting time and must be visited within a specific time window. TSPTW is a difficult problem which has applications in routing and scheduling. It has therefore been extensively studied in Operations Research and CP (see for instance [4, 10]). In the rest, we consider the asymmetric TSPTW (ATSPTW).

The difficulty of ATSPTW stems from its optimisation and scheduling parts. ATSP is equivalent to finding a minimum-cost Hamiltonian path in a graph, which is NP-hard. Moreover, scheduling problems with release and due dates are difficult satisfiability problems, since they involve complex constraints at the same time. Solving an ATSPTW thus requires strong optimisation and feasibility reasoning.

Scheduling problems are one of the most established application areas of CP (see for instance [1]). The use of local search techniques greatly enhances the ability of CP to tackle optimisation problems (see for instance [12] for a survey). Since the motivation of our work is to combine the power of constraint propagation for solving feasibility problems with a local search method successful in tackling optimisation problems, ATSPTW is a good choice of problem domain to test the benefits of local branching in CP. Note that we do not aim at competing with the advanced techniques developed specifically to solve this problem. We would like to show that local branching in CP can be good at solving problems with difficult feasibility and optimisation aspects at the same time.

6.2 Local Branching on ATSPTW

To solve ATSPTW using local branching in CP, we have adopted the model of [10] in which each Nexti variable gives the city to be visited after city i. Each city i is associated with an activity, with the variable Starti indicating the time at which the service of i begins. Apart from the other constraints, the model contains the alldifferent(X) constraint (posted on the Next variables) and the cost function C; therefore it can be propagated using the cost-based filtering algorithm described in [9]. Due to the existence of the local branching constraints (again posted on the Next variables) in addition to alldifferent(X) and C, we can use the Assignment Problem as Rel and apply the cost-based filtering described in Section 4.
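To illustrate the role of the Assignment Problem (AP) as the relaxation Rel: dropping the subtour and time-window constraints from ATSPTW leaves an AP, in which every city gets a distinct successor (the alldifferent on the Next variables) at minimum cost. The sketch below computes that bound by brute force over permutations, which is fine at toy size; real solvers use the Hungarian algorithm. The data is illustrative.

```python
# Sketch: the Assignment Problem relaxation used as Rel in the text.
# nxt[i] = successor of city i; distinct successors = alldifferent(Next).

from itertools import permutations

def ap_lower_bound(cost):
    """Min-cost successor assignment: a lower bound on the ATSP(TW) cost."""
    n = len(cost)
    best = float("inf")
    for nxt in permutations(range(n)):
        if all(nxt[i] != i for i in range(n)):   # a city cannot succeed itself
            best = min(best, sum(cost[i][nxt[i]] for i in range(n)))
    return best

cost = [[0, 3, 7],
        [5, 0, 2],
        [4, 9, 0]]
print(ap_lower_bound(cost))  # 9
```

The bound is valid because every Hamiltonian circuit is in particular a successor assignment, so the AP optimum can only be lower.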

We test different variations of local branching in CP, each tuned for our experiments as we will describe later. We compare the results with a pure CP-based search using the same model, except that additive bounding does not exist and thus alldifferent(X) is propagated only by the cost-based filtering algorithm of [9]. The CP model is solved with depth-first search enhanced with constraint propagation using the primitives available in Solver and Scheduler. For a fair comparison, we improve the search efficiency of CP by using a cost-based scheduling heuristic. At each node of the search tree, the activity associated to Nexti with the smallest reduced-cost value cij in the domain is chosen and assigned j. In case of ties, the activity associated to the earliest start-time and the latest end-time is considered first. In the rest of this section, we refer to this heuristic as H1, the methods using some form of local branching in CP as LB, and the pure CP-based search as CP.
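A minimal sketch of the variable/value selection of H1 follows. Reduced costs would in practice come from the AP relaxation; here they are illustrative, as are the domains and activity windows.

```python
# Sketch of heuristic H1: among the unassigned Next_i variables, pick the
# (i, j) pair with the smallest reduced cost, breaking ties by earliest
# start-time and then latest end-time of the associated activity.

def h1_choice(domains, reduced_cost, start, end):
    """Return the (variable, value) pair H1 would branch on first."""
    candidates = []
    for i, dom in domains.items():
        for j in dom:
            # sort key: reduced cost, then earliest start, then latest end
            candidates.append((reduced_cost[i][j], start[i], -end[i], i, j))
    _, _, _, i, j = min(candidates)
    return i, j

domains = {0: {1, 2}, 1: {2}}
reduced_cost = {0: {1: 0.0, 2: 1.5}, 1: {2: 0.0}}
start = {0: 3, 1: 5}   # earliest start-times of the associated activities
end = {0: 9, 1: 8}     # latest end-times
print(h1_choice(domains, reduced_cost, start, end))  # (0, 1)
```

Here variables 0 and 1 both offer a zero reduced cost, and the earlier start-time of activity 0 breaks the tie.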

The generic local branching framework is employed in the experiments as follows. The initial feasible solution is found using CP which exploits H1. Each subtree is searched in increasing value of k with a heuristic exploiting knowledge about the costs. While searching a subtree, we are essentially assigning some variables their values in the last reference solution Xm and some the values not in Xm. We refer to the search phase in which we change variables w.r.t. Xm as changing and the phase in which we assign the variables the corresponding values in Xm as fixing. We expect the changing phase to be more difficult than the fixing phase, so we try to fix the less promising variables in the easier phase. Given the reference solution Next, we calculate δi for each Nexti, which is the minimum increase in the cost function obtained by changing it from Nexti to a value in D(Nexti)\{Nexti}. We then choose the variables with the highest δ in the fixing phase and leave the others to the changing phase. Note that we have also developed general purpose heuristics which use only the gathered incumbent solutions, but we do not consider them in this paper for reasons of space.
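The δ scoring just described can be sketched as follows, with illustrative data; variables with the largest δ are kept for the fixing phase, since changing them looks least promising.

```python
# Sketch of the delta_i scoring used to split variables between the fixing
# and the changing phases: delta_i is the cheapest cost increase incurred by
# reassigning Next_i away from its value in the reference solution.

def deltas(reference, domains, cost):
    """delta[i] = min extra cost of moving Next_i off reference[i]."""
    delta = {}
    for i, ref_j in reference.items():
        alternatives = [cost[i][j] - cost[i][ref_j]
                        for j in domains[i] if j != ref_j]
        delta[i] = min(alternatives) if alternatives else float("inf")
    return delta

reference = {0: 1, 1: 2, 2: 0}                 # Next values in the incumbent
domains = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
cost = {0: {1: 3, 2: 7}, 1: {0: 5, 2: 2}, 2: {0: 4, 1: 9}}
d = deltas(reference, domains, cost)
# fix the variables with the highest delta first, change the rest
fixed_first = sorted(d, key=d.get, reverse=True)
print(d, fixed_first)  # {0: 4, 1: 3, 2: 5} [2, 0, 1]
```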

As we want to create subtrees big enough to improve solution quality but small enough to prove optimality fast, we set k to 3 or 5 (called LB3 and LB5, resp.). In the case of node 6, before switching to CP-based search in node 7, we apply conditional restarting in which a new feasible solution is found using H1. We also adopt a hybrid diversification method (called LBH). We search each left subtree with the local branching constraint ∆(X, X) ≤ 3. Before conditional restarting, we first enlarge the neighbourhood and search the additional subtree by replacing ∆(X, X3) ≤ 3 with ∆(X, X3) = 4. If an improving solution X4 is found, we continue local branching by reversing the last branching constraint to ∆(X, X3) ≥ 5. Otherwise, if no improved solution is found, we again enlarge and search the new neighbourhood by replacing ∆(X, X3) = 4 with ∆(X, X3) = 5. Now, if an improving solution X4 is found, we continue; otherwise we apply conditional restarting, with the reverse branching constraint ∆(X, X3) ≥ 6 in node 7.
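The LBH enlargement schedule can be sketched as a small loop over neighbourhood widths. The function `search` below is a hypothetical stand-in for exploring one neighbourhood (returning an improved solution or None); all names are illustrative, not part of the authors' implementation.

```python
# Sketch of the LBH schedule: after the width-3 neighbourhood of the
# reference solution is exhausted without improvement, try width 4, then
# width 5, and only then fall back to conditional restarting.

def lbh_step(search):
    """Explore widening neighbourhoods; report how the step ends."""
    for bounds in [(1, 3), (4, 4), (5, 5)]:     # Hamming-distance ranges
        improved = search(bounds)
        if improved is not None:
            # continue local branching from the improved reference, with the
            # reversed constraint distance >= bounds[1] + 1 on the right branch
            return ("continue", improved, bounds[1] + 1)
    return ("conditional_restart", None, 6)     # reverse constraint: dist >= 6

# e.g. a search that only finds an improvement in the width-4 neighbourhood:
outcome = lbh_step(lambda b: "X4" if b == (4, 4) else None)
print(outcome)  # ('continue', 'X4', 5)
```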

In all LB methods, we switch to CP-based search only if we again encounter a node like 6 after conditional restarting. In this case, we choose the next variable to assign based on the smallest-domain-first principle. The framework therefore specialises for ATSPTW only in the phase of finding an initial feasible solution.

6.3 Experiments

We have used a heterogeneous set of 242 instances⁴, with the number of nodes (i.e., cities) ranging from 17 to 233. Some instances differ only w.r.t. the strictness of the time windows. As a consequence, the experiments are conducted on a wide range of instance structures. The results appear in Tables 1 and 2, in which the number of participating instances is indicated in parentheses.

Tables 1(a) and 2(a) report results when we impose a 5 minute cutoff time on the overall run-time. Since there is no feasible solution in the neighbourhood ∆ ≤ 2, we reduce the size of the neighbourhood in the large subtrees of LB5. Hence, whilst the left branching constraint in LB3 is kept as 1 ≤ ∆(X, X) ≤ 3, it is 3 ≤ ∆(X, X) ≤ 5 in LB5. In addition, we exploit "edge finding" [2] in LB5 for a more effective exploration of the large neighbourhood.
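The branching constraint is just a bound on the Hamming distance between the current assignment and the reference solution, which a short sketch makes explicit (data illustrative):

```python
# Sketch: a local branching constraint bounds the Hamming distance between
# the current assignment X and the reference solution X_ref, i.e. the number
# of Next variables taking a different value. The left branch of LB3 keeps
# 1 <= distance <= 3; the large-subtree variant of LB5 uses 3 <= distance <= 5.

def hamming(x, x_ref):
    return sum(1 for i in x if x[i] != x_ref[i])

def in_neighbourhood(x, x_ref, lo, hi):
    return lo <= hamming(x, x_ref) <= hi

x_ref = {0: 1, 1: 2, 2: 3, 3: 0}
x     = {0: 1, 1: 3, 2: 2, 3: 0}        # differs in two positions
print(hamming(x, x_ref), in_neighbourhood(x, x_ref, 1, 3))  # 2 True
```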

In Table 1(a), we report statistics on the instances for which the methods provide solutions of different cost by the cutoff time. We show in the second

4 Available at http://elib.zib.de/pub/Packages/mp-testdata/tsp/atsptw/index.html.

(a) 5 minutes cutoff.

                        # times providing   Average % gap from the best solution
Method (86)             best solution          CP     LB3     LB5
Only one method best
  CP                         15                 0    5.51    7.24
  LB3                        13             12.77       0   22.99
  LB5                        28             25.65    4.72       0
Multiple methods best
  CP                          3                 0       0    0.73
  LB3                        30             15.97       0    0.07
  LB5                        27             17.75       0       0

(b) 30 minutes cutoff.

                        # times providing   Average % gap from the best solution
Method (54)             best solution          CP     LB3     LB5     LBH
Only one method best
  CP                          8                 0    5.42    6.07    5.46
  LB3                         2             11.67       0    0.52    0.52
  LB5                         8             20.69    3.82       0    2.70
  LBH                         7             26.28    3.77   30.01       0
Multiple methods best
  CP                          0               N/A     N/A     N/A     N/A
  LB3                        20              9.98       0    4.41       0
  LB5                        23             11.81    0.65       0       0
  LBH                        29             11.21    0.51    3.04       0

Table 1. Instances where the methods provide different solution cost.

column how many times a method provides the best solution, and in the third the average gap of the provided solution w.r.t. the best one obtained from the other methods. The first group of rows concerns the instances for which only one method gives the best solution, whilst in the second group multiple methods provide the best solution. Here we are not interested in any run-time: all methods are run until the cutoff time, and thus we are interested in the best solution possible. The aim is to support the claim that we can obtain better solutions within a time limit.

We observe that LB5 and LB3 are the winners in the two groups, respectively. Moreover, for the instances where CP is the preferred method (the CP rows in both groups), the average gaps of the solutions provided by LB3 and LB5 are much smaller than the gap of the CP solutions when LB3 or LB5 is preferred.

In Table 2(a), we report statistics on the instances for which the methods provide equally good solutions by the cutoff time. We give statistics by dividing the instances into three disjoint groups. In the first, none of the methods can prove optimality by the cutoff time, so we compare the average time to find the best solution. In the second, all methods prove optimality, so we check the average total run-time required to prove optimality. The third group contains the instances which are a mix w.r.t. completeness of the search. Since we cannot compare run-times in this case, we look at the number of times each method proves optimality. Such statistics help us support our theses that we can improve solution quality quicker and that we can prove optimality (quicker).

We observe that LB3 is the winner in all groups, with LB5 being preferable over CP in two. Local branching thus provides clear benefits over CP. The results in the tables support our theses. It is, however, difficult to decide between LB3 and LB5. In Tables 1(b) and 2(b), similar results are reported when we impose a 30 minute cutoff time instead. In addition, we also compare the hybrid method LBH with the previous methods. Moreover, the neighbourhood defined by ∆ ≤ 2 is skipped and edge finding propagation is incorporated in all methods. Even if more time is given to all methods to complete the search, the benefits of local branching, in particular of LBH, are apparent. The results indicate that LBH is the most robust method, combining the advantages of LB3 and LB5.

(a) 5 minutes cutoff.

Group (# instances)                    Average comparison             CP      LB3     LB5
All are cutoff (7)                     Best solution time         179.02    98.63  103.16
None is cutoff (132)                   Total run-time              38.04    35.43   60.50
Min. one & max. two are cutoff (13)    # times proving optimality      5       11       7

(b) 30 minutes cutoff.

Group (# instances)                    Average comparison             CP      LB3     LB5     LBH
All are cutoff (2)                     Best solution time        1184.25   611.68  490.02  450.88
None is cutoff (175)                   Total run-time             185.52   113.39  142.69  114.05
Min. one & max. three are cutoff (6)   # times proving optimality      1        5       5       4

Table 2. Instances where the methods provide the same solution cost.

7 Related Work and Conclusions

The local branching technique as introduced in [7] is a complete search method designed to provide solutions of better and better quality in the early stages of search by systematically defining and exploring large neighbourhoods. On the other hand, the idea has been used mainly in an incomplete manner since [7] (e.g., the current implementation in ILOG-Cplex "≥ 9.1"): linear constraints defining large neighbourhoods are iteratively added and the neighbourhoods are explored, sometimes in a non-exhaustive way. When this is done within a local search method, the overall algorithm follows the spirit of both large neighbourhood search [16] and variable neighbourhood search [14]. The main peculiarity of local branching is that the neighbourhoods and their exploration are general purpose. Related work includes the use of propagation within a large neighbourhood search algorithm [15], relaxations in propagation [18], local search to speed up complete search [17], and Hamming distance to find similar and diverse solutions in CP [6]. The local branching framework consists of so many basic components that the list of related work is by no means complete.

In summary, we have shown that CP-based local branching raises a variety of issues which do not concern only the implementation. The neighbourhoods are not defined by linear inequalities and not explored by MIP techniques, but surprisingly, both the definition and the exploration are as general as in the MIP context. One can also use a k-discrepancy constraint [11] to define the neighbourhood and a discrepancy-based technique (in the spirit of limited discrepancy search [5]) for its effective exploration. In this way, both the modelling and the search of neighbourhoods would benefit from the important features of CP, as done in the MIP counterpart. CP-based local branching, however, benefits from diverse areas: the power of propagation and the effective heuristics of CP, cost-based filtering and additive bounding borrowed from MIP (for proving optimality), and intensification and diversification borrowed from local search (for finding good solutions earlier). Our experiments show the benefits of local branching in CP.

Future research directions include the computation of a stronger and more sophisticated lower bound which takes into account more than one reference solution, and the analysis of problems involving different relaxations.

References

1. Baptiste, P., Le Pape, C., Nuijten, W.: Constraint-Based Scheduling. Kluwer Academic Publishers (2001)

2. Carlier, J., Pinson, E.: An algorithm for solving job shop scheduling. Management Science 35 (1995) 164–176

3. Caseau, Y., Laburthe, F.: Solving various weighted matching problems with constraints. In: Proc. of CP-97. Volume of LNCS., Springer (1997) 17–31

4. Desrosiers, J., Dumas, Y., Solomon, M.M., Soumis, F.: Time constrained routing and scheduling. Network Routing (1995) 35–139

5. Harvey, W., Ginsberg, M.L.: Limited discrepancy search. In: Proc. of IJCAI-95, Morgan Kaufmann (1995) 607–615

6. Hebrard, E., Hnich, B., O'Sullivan, B., Walsh, T.: Finding diverse and similar solutions in constraint programming. In: Proc. of AAAI-05, AAAI Press / The MIT Press (2005) 372–377

7. Fischetti, M., Lodi, A.: Local branching. Mathematical Programming 98 (2003) 23–47

8. Fischetti, M., Toth, P.: An additive bounding procedure for combinatorial optimization problems. Operations Research 37 (1989) 319–328

9. Focacci, F., Lodi, A., Milano, M.: Optimization-oriented global constraints. Constraints 7 (2002) 351–365

10. Focacci, F., Lodi, A., Milano, M.: A hybrid exact algorithm for the TSPTW. INFORMS Journal on Computing 14(4) (2002) 403–417

11. Lodi, A., Milano, M., Rousseau, L.M.: Discrepancy based additive bounding for the alldifferent constraint. In: Proc. of CP-03. Volume 2833 of LNCS., Springer (2003) 510–524

12. Milano, M.: Constraint and Integer Programming: Toward a Unified Methodology. Kluwer Academic Publishers (2003)

13. Milano, M., van Hoeve, W.: Reduced cost-based ranking for generating promising subproblems. In: Proc. of CP-02. Volume 2470 of LNCS., Springer (2002) 1–16

14. Mladenovic, N., Hansen, P.: Variable neighbourhood search. Computers and Operations Research 24 (1997) 1097–1100

15. Perron, L., Shaw, P., Furnon, V.: Propagation guided large neighbourhood search. In: Proc. of CP-04. Volume 3258 of LNCS., Springer (2004) 468–481

16. Shaw, P.: Using constraint programming and local search methods to solve vehicle routing problems. In: Proc. of CP-98. Volume 1520 of LNCS., Springer (1998) 417–431

17. Sellmann, M., Ansotegui, C.: Disco - Novo - GoGo: integrating local search and complete search with restarts. In: Proc. of AAAI-06, AAAI Press (2006) 1051–1056

18. Sellmann, M., Fahle, T.: Constraint programming based Lagrangian relaxation for the automatic recording problem. Annals of Operations Research 118 (2003) 17–33