Relaxations and Bounds: Applications to Knapsack Problems

M2 ORO: Advanced Integer Programming
Lecture 7

[email protected]

October 24, 2011

1 Introduction

1.1 Why relaxation?

Relaxation is a key component for solving MILP. In a branch-and-bound method, it reduces the size of the search tree by recognizing and pruning:

• infeasible nodes (if the relaxation is infeasible)

• solution nodes (if the solution of the relaxation is feasible for the problem)

• sub-optimal nodes (if the value of the relaxation is not better than the current incumbent)

A good relaxation:

• has a tight bound: the optimal value of the relaxation should be close to the optimal value of the problem in order to prove more sub-optimality and to prune more nodes

• is solved quickly: the relaxation must be solved many times, i.e. at each node of the branch-and-bound. The running time of a B&B method is the sum of the running times (for solving the relaxation) at each node.

These two criteria usually conflict: tighter relaxations are more constrained and, consequently, more difficult to solve. We have to strike the best compromise between the quality of the relaxation and its complexity.

1.2 Relaxation of combinatorial optimization problems

Definition 1. Let $(P): z = \max\{f(x) \mid x \in S\}$ be a combinatorial optimization problem. A combinatorial relaxation of $(P)$ is any combinatorial optimization problem $(\bar P): \bar z = \max\{g(y) \mid y \in \bar S\}$ such that:
\[ \forall x \in S,\ \exists y \in \bar S \ \text{such that}\ g(y) \geq f(x). \]
The optimum of the relaxation $(\bar P)$ is an upper bound on the optimum of $(P)$ since $z \leq \bar z$.

Example 1. The following cases define relaxations:

1. $S \subseteq \bar S$ and $f \equiv g$

2. $S = \{x \in \mathbb{Z}^n_+ \mid Ax \leq b,\ Ex \leq d\}$, $\bar S = \{x \in \mathbb{Z}^n_+ \mid Ax \leq b\}$ and $f \equiv g$: the constraints $Ex \leq d$ are said to be relaxed (i.e. removed)

3. $S = \{x \in \mathbb{Z}^n_+ \mid Ax \leq b,\ Ex \leq d\}$, $\bar S = \{x \in \mathbb{R}^n_+ \mid Ax \leq b,\ Ex \leq d\}$ and $f \equiv g$: the integrality constraints are relaxed; this is the LP relaxation or continuous relaxation $(\bar P)$ of the ILP $(P)$.

Definition 2. Let $(P): z = \max\{f(x) \mid x \in S\}$ be a combinatorial optimization problem. A dual relaxation of $(P)$ is any combinatorial optimization problem $(D): u = \min\{g(y) \mid y \in T\}$ such that:
\[ \forall x \in S,\ \forall y \in T,\ g(y) \geq f(x). \]
The optimum of the dual relaxation $(D)$ is an upper bound on the optimum of $(P)$ since $z \leq u$. $(P)$ and $(D)$ form a weak-dual pair. Furthermore, if $z = u$, then $(P)$ and $(D)$ form a strong-dual pair.

Example 2. The following cases define dual relaxations:

1. $(P): z = \max\{cx \mid Ax \leq b,\ x \in \mathbb{R}^n_+\}$ and $(D): u = \min\{by \mid yA \geq c,\ y \in \mathbb{R}^m_+\}$ form a strong-dual pair: $(D)$ is the LP dual of $(P)$

2. as a consequence, $(P): z = \max\{cx \mid Ax \leq b,\ x \in \mathbb{Z}^n_+\}$ and $(D): u = \min\{by \mid yA \geq c,\ y \in \mathbb{Z}^m_+\}$ form (at least) a weak-dual pair

1.3 Relaxation of integer linear programs

Any problem formulated as an ILP has a natural relaxation: the LP relaxation. Generally, such an LP can be solved efficiently by the simplex method. For some problems, specific algorithms exist which are even more efficient. As an example, we will study in Section 2 an algorithm solving the continuous 0-1 knapsack problem in linear time at each node of a search tree (after an $O(n \log n)$ sorting step at the root of the tree).

The quality of the bound obtained by any LP relaxation depends on the strength of the formulation. Strong formulations and, a fortiori, ideal formulations (i.e. formulations whose polyhedron has only integer extreme points) are usually hard to find: as hard as solving the ILP itself. Furthermore, they may be of exponential size. Hence, solving the LP relaxation can often be too time-consuming within a branch-and-bound.

For hard ILPs with weak formulations, relaxing complicating or coupling constraints other than the integrality constraints is often more efficient within a branch-and-bound: the relaxation may give better bounds or be easier to solve. Furthermore, we can easily build relaxations not just by removing constraints, but by penalizing their violation. These kinds of relaxations will be illustrated in Section 3 on the multi-knapsack problem.

2 LP Relaxation of the 0-1 Knapsack Problem

An instance of the 0-1 Knapsack Problem is defined by a container of capacity $c$ and a set $I$ of items, $|I| = n$, with size (or weight) $w_i$ and value $p_i$ for item $i \in I$. All values are non-negative integers. We want to select a subset of items to place into the container such that the sum of the sizes does not exceed the capacity of the container, and such that the sum of the values is maximized:
\[ (P): z = \max \sum_{i \in I} p_i x_i \quad \text{s.t.} \quad \sum_{i \in I} w_i x_i \leq c, \quad x_i \in \{0,1\}\ (\forall i \in I). \]
W.l.o.g., we assume that $w_i \leq c$ for all $i \in I$ (any item can be selected) and that $\sum_{i \in I} w_i > c$ (the items cannot all be placed).

2.1 The 0-1 knapsack problem with unit sizes ($w_i = 1$)

If all items have the same size $w_i = 1$, then any subset of items of cardinality no greater than $c$ leads to a feasible solution. We can easily compute an optimal solution of $(P)$ by reordering the items in decreasing order of value ($p_1 \geq p_2 \geq \cdots \geq p_n$), then selecting the first $c$ items to pack into the container:
\[ x_i = \begin{cases} 1 & \text{if } i = 1,\dots,c \\ 0 & \text{otherwise.} \end{cases} \]
In this algorithm, sorting can be executed once and for all in $O(n \log n)$ time, then the selection runs in $O(n)$. Note that this solution is optimal both for $(P)$ and for its LP relaxation $(\bar P)$, i.e. $z = \bar z$. Actually the formulation of $(P)$ is ideal, since the constraint matrix is totally unimodular and the right-hand side $c$ is integer.
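As a quick illustration, here is a minimal Python sketch of this greedy rule (the function name and interface are our own, not from the notes):

```python
def unit_knapsack(values: list[int], c: int) -> tuple[int, list[int]]:
    """Optimal value and 0/1 selection when all item sizes are 1:
    sort once in O(n log n), then keep the c most valuable items."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    x = [0] * len(values)
    for i in order[:c]:
        x[i] = 1
    return sum(values[i] for i in order[:c]), x
```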

2.2 The 0-1 knapsack problem with identical sizes ($w_i = w$)

Now the feasible solutions are given by the sets of items of cardinality no greater than $c/w$. Sorting the items as previously, then selecting the first $\lfloor c/w \rfloor$ items (where $\lfloor a \rfloor$ denotes the greatest integer not exceeding $a \in \mathbb{R}$) leads to a feasible solution of $(P)$:
\[ x_i = \begin{cases} 1 & \text{if } i = 1,\dots,s-1 \\ 0 & \text{otherwise,} \end{cases} \]
where the threshold value $s$ is defined by $s = \lfloor c/w \rfloor + 1$. As the right-hand side $c/w$ may not be integer, the total unimodularity property no longer applies. Actually, the following solution is optimal for the LP relaxation $(\bar P)$ (this will be proved in the general case by Theorem 2):
\[ x_i = \begin{cases} 1 & \text{if } i = 1,\dots,s-1 \\ \frac{c}{w} - \lfloor \frac{c}{w} \rfloor & \text{if } i = s \\ 0 & \text{otherwise.} \end{cases} \]
This solution gives an upper bound for $(P)$: $\bar z = \sum_{i=1}^{s-1} p_i + p_s\left(\frac{c}{w} - \lfloor \frac{c}{w} \rfloor\right)$. As the optimal value of $(P)$ is integer, a valid upper bound is:
\[ \lfloor \bar z \rfloor = \sum_{i=1}^{s-1} p_i + \left\lfloor \frac{c\, p_s}{w} \right\rfloor - \left\lfloor \frac{c}{w} \right\rfloor p_s. \]
If $c p_s / w$ is not integer, then the rounded bound strictly improves on the LP bound:
\[ z \leq \lfloor \bar z \rfloor < \bar z. \]

2.3 The 0-1 knapsack problem with arbitrary sizes

The principle for computing an optimal solution of the LP relaxation $(\bar P)$ is as before: sort the items in decreasing order of $p_i/w_i$, then select the first items consecutively until the first item that does not fit into the container, the critical item $s$, is found. The container is then filled up with a fraction of the critical item.

First, we assume that the ratios $p_i/w_i$ of the items are pairwise different; this is w.l.o.g. according to:

Lemma 1. Let $i$ and $j$ be two items such that $\frac{p_i}{w_i} = \frac{p_j}{w_j}$. Then the LP relaxation $(\bar P)$ is equivalent to the LP relaxation $(\bar P')$ of a second knapsack problem obtained by merging the two items into one new item $k$ of size $w_k = w_i + w_j$ and of value $p_k = p_i + p_j$.

Proof. Let $x$ be a solution of $(\bar P)$ and define
\[ x'_l = \begin{cases} x_l & \forall l \neq i, j, k \\ \frac{w_i x_i + w_j x_j}{w_i + w_j} & \text{if } l = k. \end{cases} \]
$x'$ is feasible for $(\bar P')$ since $x'_k = \frac{w_i x_i + w_j x_j}{w_i + w_j} \in [0,1]$ and $w_k x'_k = w_i x_i + w_j x_j$. Furthermore, its value is the same as that of $x$ since:
\[ p_k x'_k = \frac{(p_i + p_j)(w_i x_i + w_j x_j)}{w_i + w_j} = \frac{(p_i w_i + p_j w_i) x_i + (p_i w_j + p_j w_j) x_j}{w_i + w_j} = p_i x_i + p_j x_j, \]
where the last equality uses $p_j w_i = p_i w_j$. Conversely, to each solution $x'$ of $(\bar P')$ corresponds a feasible solution $x$ of $(\bar P)$ with the same cost, defined by:
\[ x_l = \begin{cases} x'_l & \forall l \neq i, j \\ \min\left(1,\ \frac{(w_i + w_j) x'_k}{w_j}\right) & \text{if } l = j \\ \max\left(0,\ \frac{(w_i + w_j) x'_k - w_j}{w_i}\right) & \text{if } l = i. \end{cases} \]
Hence the two linear programs are equivalent.

Under the assumption that the items are sorted such that $\frac{p_1}{w_1} > \frac{p_2}{w_2} > \cdots > \frac{p_n}{w_n}$, we can easily compute an optimal solution of $(\bar P)$:

Theorem 2 (Dantzig, 1957). Let $s = \min\{j \mid \sum_{i=1}^j w_i > c\}$ and $\bar c = c - \sum_{i=1}^{s-1} w_i$; then the following solution is optimal for $(\bar P)$:
\[ x_i = \begin{cases} 1 & \text{if } i = 1,\dots,s-1 \\ \bar c / w_s & \text{if } i = s \\ 0 & \text{otherwise.} \end{cases} \]

Proof. $x$ is clearly feasible for $(\bar P)$. Assume there exists an optimal solution $x^*$ such that $x^*_k < x_k = 1$ for some $k < s$; then there exists some $q \geq s$ such that $x^*_q > x_q$ (otherwise $\sum_{i=1}^n p_i x^*_i < \sum_{i=1}^n p_i x_i$). Let $\epsilon = \min\left(x_k - x^*_k,\ \frac{w_q}{w_k}(x^*_q - x_q)\right)$, and define
\[ \tilde x_i = \begin{cases} x^*_i + \epsilon & \text{if } i = k \\ x^*_i - \frac{w_k}{w_q}\,\epsilon & \text{if } i = q \\ x^*_i & \text{otherwise.} \end{cases} \]
$\tilde x$ is a feasible solution of $(\bar P)$ since $x^*_k \leq \tilde x_k \leq x_k$, $x_q \leq \tilde x_q \leq x^*_q$ and $\sum_{i=1}^n w_i \tilde x_i = \sum_{i=1}^n w_i x^*_i \leq c$. Furthermore, its cost is strictly greater than the cost of $x^*$: $\sum_{i=1}^n p_i(\tilde x_i - x^*_i) = \left(p_k - p_q \frac{w_k}{w_q}\right)\epsilon > 0$. This is absurd, so $x^*_k = 1$ for all $k < s$. We can prove in the same way that $x^*_k > 0$ is impossible for any optimal solution and for any $k > s$. Hence, by maximality of $x_s$, $x$ is the unique optimal solution of $(\bar P)$.

Again this solution gives an upper bound for $(P)$ which, after rounding, is equal to:
\[ \lfloor \bar z \rfloor = \sum_{i=1}^{s-1} p_i + \left\lfloor \frac{\bar c\, p_s}{w_s} \right\rfloor. \]
This LP bound is named the Dantzig bound for the 0-1 knapsack problem. If the items are already sorted as assumed, the computation of the Dantzig bound clearly requires $O(n)$ time. (If this is not the case, the computation can still be performed in $O(n)$ time by using a procedure of (Balas and Zemel, 1980) to determine the critical item.)
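The following Python sketch (our own illustration, with the hypothetical name `dantzig_bound`) computes this rounded bound in $O(n)$ for items already sorted by decreasing $p_i/w_i$:

```python
def dantzig_bound(p: list[int], w: list[int], c: int) -> int:
    """Rounded LP bound floor(z_bar) of the 0-1 knapsack (Theorem 2).
    Assumes the items are sorted by decreasing p[i]/w[i]."""
    total_w, total_p = 0, 0
    for pi, wi in zip(p, w):
        if total_w + wi > c:            # critical item found
            residual = c - total_w      # the residual capacity c_bar
            return total_p + (residual * pi) // wi
        total_w += wi
        total_p += pi
    return total_p                      # every item fits
```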

2.4 Notes on the 0-1 Knapsack Problem

The 0-1 Knapsack Problem is NP-complete, but not in the strong sense, since there exists a pseudo-polynomial time algorithm, based on dynamic programming, for solving this problem. The algorithm is based on the computation of the values $f_m(c') = \max\{\sum_{i=1}^m p_i x_i \mid \sum_{i=1}^m w_i x_i \leq c',\ x \in \{0,1\}^m\}$ at each stage $m$ increasing from 1 to $n$ and for each capacity $c'$ increasing from 0 to $c$, according to the following recursion (Bellman, 1954):
\[ f_1(c') = \begin{cases} 0 & \forall c' = 0,\dots,w_1 - 1 \\ p_1 & \forall c' = w_1,\dots,c \end{cases} \]
and
\[ f_m(c') = \begin{cases} f_{m-1}(c') & \forall m = 2,\dots,n,\ \forall c' = 0,\dots,w_m - 1 \\ \max\left(f_{m-1}(c'),\ f_{m-1}(c' - w_m) + p_m\right) & \forall m = 2,\dots,n,\ \forall c' = w_m,\dots,c. \end{cases} \]
A procedure was derived by (Toth, 1980) directly from this recursion, with a time and space complexity of $O(nc)$. (This complexity can be reduced by eliminating the dominated states, i.e. the states $(m', c')$ such that there exists a dominating state $(m'', c'')$ with $f_{m'-1}(c') \leq f_{m''-1}(c'')$ and $c'' < c'$.)
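A minimal Python sketch of Bellman's recursion (our own illustration), keeping a single row of the table $f$:

```python
def knapsack_dp(p: list[int], w: list[int], c: int) -> int:
    """Optimal value of the 0-1 knapsack in O(nc) time and O(c) space.
    f[cp] holds f_m(cp); capacities are swept downwards so that each
    item is counted at most once."""
    f = [0] * (c + 1)
    for pm, wm in zip(p, w):
        for cp in range(c, wm - 1, -1):
            f[cp] = max(f[cp], f[cp - wm] + pm)
    return f[c]
```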

3 Relaxations for the 0-1 Multi-Knapsack Problem

An instance of the 0-1 Multi-Knapsack Problem (MKP) is defined by a set $J$ of containers, $|J| = m$, with capacity $c_j$ for container $j \in J$, and a set $I$ of items, $|I| = n$, with size (or weight) $w_i$ and value $p_i$ for item $i \in I$. All values are non-negative integers. We want to place a subset of items into the containers without exceeding the capacities of the containers, and such that the sum of the values is maximized:
\[ (P): z = \max \sum_{i \in I}\sum_{j \in J} p_i x_{ij} \quad \text{s.t.} \quad \sum_{i \in I} w_i x_{ij} \leq c_j\ (\forall j \in J), \quad \sum_{j \in J} x_{ij} \leq 1\ (\forall i \in I), \quad x \in \{0,1\}^{I \times J}. \]
W.l.o.g., we assume that $w_i \leq c_j$ for all $i \in I$, $j \in J$, and that $\sum_{i \in I} w_i > c_j$ for each container $j \in J$.

3.1 LP Relaxation

The LP relaxation is defined by:
\[ (\bar P): \bar z = \max \sum_{i \in I}\sum_{j \in J} p_i x_{ij} \quad \text{s.t.} \quad \sum_{i \in I} w_i x_{ij} \leq c_j\ (\forall j \in J), \quad \sum_{j \in J} x_{ij} \leq 1\ (\forall i \in I), \quad x_{ij} \geq 0. \]
(Note that the upper bounds $x_{ij} \leq 1$ on the variables are redundant with the constraints $\sum_{j \in J} x_{ij} \leq 1$.) The dual $(\bar D)$ of $(\bar P)$ is:
\[ (\bar D): \bar u = \min \sum_{i \in I} \mu_i + \sum_{j \in J} c_j \pi_j \quad \text{s.t.} \quad w_i \pi_j + \mu_i \geq p_i\ (\forall i \in I, j \in J), \quad \pi_j \geq 0\ (\forall j \in J), \quad \mu_i \geq 0\ (\forall i \in I). \]
The LP bound $\bar z$ can be computed in linear time according to the following proposition:

Proposition 3. Assume that the items are sorted in decreasing order of the ratio $p_i/w_i$ and let $s$ be the critical item $s = \min\{k \in I \mid \sum_{i=1}^k w_i > \sum_{j \in J} c_j\}$; then:
\[ \bar z = \sum_{i=1}^{s-1} \left(p_i - \frac{w_i}{w_s} p_s\right) + \sum_{j \in J} \frac{p_s}{w_s} c_j. \]

Proof. Consider the items in their order and place them consecutively in the first container until a first item $s^1$ is found which does not fit: $s^1 = \min\{k \in I \mid \sum_{i=1}^k w_i > c_1\}$. Insert the maximum possible fraction $\left(c_1 - \sum_{i=1}^{s^1-1} w_i\right)/w_{s^1}$ of item $s^1$ into container 1. Continue by inserting the residual fraction $\left(\sum_{i=1}^{s^1} w_i - c_1\right)/w_{s^1}$ of $s^1$ into container 2, and so on until the last container $m$ is full. Let $s^m$ be the last item considered, i.e. the item placed partially into the last container $m$. By construction, we have $\sum_{i=1}^{s^m} w_i > \sum_{j \in J} c_j$ and $\sum_{i=1}^{s^m - 1} w_i \leq \sum_{j \in J} c_j$, so $s^m = s$. Hence the assignment $x \in [0,1]^{I \times J}$ associated with this solution is feasible for $(\bar P)$ and it satisfies:
\[ \sum_{j \in J} x_{ij} = \begin{cases} 1 & \forall i = 1,\dots,s-1 \\ \left(\sum_{j \in J} c_j - \sum_{i=1}^{s-1} w_i\right)/w_s & \text{if } i = s \\ 0 & \forall i = s+1,\dots,n. \end{cases} \]
Its cost is then equal to $\sum_{i \in I} p_i \sum_{j \in J} x_{ij} = \sum_{i=1}^{s-1} p_i + \frac{p_s}{w_s}\left(\sum_{j \in J} c_j - \sum_{i=1}^{s-1} w_i\right)$. Conversely, it is easy to see that the dual solution $(\pi, \mu)$ defined by:
\[ \pi_j = \frac{p_s}{w_s}\ \forall j \in J, \qquad \mu_i = p_i - \frac{w_i}{w_s} p_s\ \forall i = 1,\dots,s-1, \qquad \mu_i = 0\ \forall i = s,\dots,n \]
is feasible for $(\bar D)$ and has the same cost as $x$. Hence, by strong duality, the solutions $x$ and $(\pi, \mu)$ are optimal for $(\bar P)$ and $(\bar D)$ respectively.
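In code, this bound is simply the fractional Dantzig rule applied to the aggregated capacity; a minimal Python sketch (our own, assuming items sorted by decreasing $p_i/w_i$):

```python
def mkp_lp_bound(p, w, capacities):
    """LP bound z_bar of the 0-1 multi-knapsack (Proposition 3):
    fractional Dantzig solution over the total capacity sum_j c_j."""
    C = sum(capacities)
    total_w, bound = 0, 0.0
    for pi, wi in zip(p, w):
        if total_w + wi > C:            # critical item s
            return bound + pi * (C - total_w) / wi
        total_w += wi
        bound += pi
    return bound                        # no critical item: all items fit
```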

3.2 Simple Combinatorial Relaxation

By relaxing the knapsack constraints, we obtain a BIP that is easy to solve:
\[ (P_0): z_0 = \max \sum_{i \in I}\sum_{j \in J} p_i x_{ij} \quad \text{s.t.} \quad \sum_{j \in J} x_{ij} \leq 1\ (\forall i \in I), \quad x_{ij} \in \{0,1\}. \]
Any assignment of items to containers leads to a feasible solution of $(P_0)$, and if all the items are assigned then the solution is optimal. Hence, the optimum of $(P_0)$ gives a first simple upper bound for the MKP:
\[ z_0 = \sum_{i \in I} p_i. \]
It is easy to see that this bound is never better than the LP bound:
\[ z \leq \bar z \leq z_0. \]

The knapsack constraints are said to be complicating constraints: relaxing them leads to an easy problem. Unfortunately, the bound obtained by relaxing these complicating constraints is clearly far from the optimum in most cases. Instead of removing these constraints entirely, we will see two different ways to relax them by varying and controlling the degree of violation allowed.

3.3 Surrogate Relaxation

The principle of the surrogate relaxation is to replace a conjunction of constraints by one constraint obtained as a linear combination (weighted sum) of the constraints. In the context of the MKP, we can apply this principle to the conjunction of knapsack constraints. It means that we simply ensure that the global (weighted) sum of the sizes of the selected items does not exceed the (weighted) sum of the capacities of the containers. The weight (or multiplier) associated with a knapsack constraint, i.e. with a container, indicates some kind of degree of violation allowed for that constraint: if the weight is 0, then the constraint is not considered at all.

Any linear combination of constraints, i.e. any vector of non-negative multipliers, leads to a relaxation of the original problem. The surrogate relaxation is the best one (i.e. the one with the lowest optimum) among all these relaxations. We illustrate on the example of the MKP how to determine the best relaxation without evaluating all the possible relaxations. Furthermore, we show that this relaxation can be reformulated as a 0-1 knapsack problem, which can be approximated in linear time (via the Dantzig bound) or solved exactly by dynamic programming in reasonable time when the values of the capacities are not too high.

Let $\pi \in \mathbb{R}^J_+$ be a vector of multipliers; we define the following BIP:
\[ (P^\pi_1): z^\pi_1 = \max \sum_{i \in I}\sum_{j \in J} p_i x_{ij} \quad \text{s.t.} \quad \sum_{j \in J} \pi_j \sum_{i \in I} w_i x_{ij} \leq \sum_{j \in J} \pi_j c_j, \quad \sum_{j \in J} x_{ij} \leq 1\ (\forall i \in I), \quad x \in \{0,1\}^{I \times J}. \]
It is easy to see that any feasible solution of $(P)$ is a feasible solution of $(P^\pi_1)$, so $(P^\pi_1)$ is a relaxation of $(P)$, and we have:
\[ z \leq z^\pi_1 \leq z_0, \quad \forall \pi \in \mathbb{R}^J_+. \]

The surrogate relaxation is any relaxation $(P^\pi_1)$, $\pi \in \mathbb{R}^J_+$, that minimizes $z^\pi_1$. We denote by $z_1 = \min\{z^\pi_1 \mid \pi \in \mathbb{R}^J_+\}$ the corresponding upper bound. Our goal is to determine a vector $\pi \in \mathbb{R}^J_+$ that minimizes $z^\pi_1$ without evaluating this value (i.e. without solving $(P^\pi_1)$) for all possible vectors.

First, we can eliminate the vectors that contain a zero coefficient:

Proposition 4. $z_1 = \min\{z^\pi_1 \mid \pi \in \mathbb{R}^J_{*+}\}$, where $\mathbb{R}^J_{*+}$ denotes the set of strictly positive vectors.

Proof. We need to prove that $z^{\pi'}_1 \geq z^\pi_1$ for all vectors $\pi$ and $\pi'$ such that $\pi'$ contains a zero coefficient. Let $\pi' \in \mathbb{R}^J_+$ be such that there exists $j' \in J$ with $\pi'_{j'} = 0$. Assigning all items to container $j'$ leads to a feasible solution of $(P^{\pi'}_1)$:
\[ x_{ij} = \begin{cases} 1 & \forall i \in I,\ j = j' \\ 0 & \text{otherwise.} \end{cases} \]
The cost of this solution is $\sum_{i \in I} p_i = z_0$, so the solution is optimal and $z^{\pi'}_1$ is the trivial bound $z_0$.

From now on, consider only vectors $\pi$ whose minimum coefficient $\pi_{j'}$ is strictly positive. The following proposition shows that $(P^\pi_1)$ is then equivalent to a 0-1 knapsack problem where all the containers are aggregated into one container of capacity $c^\pi = \left\lfloor \sum_{j \in J} \frac{\pi_j c_j}{\pi_{j'}} \right\rfloor$.

Proposition 5. $(P^\pi_1)$ is equivalent to:
\[ (K^\pi): \max \sum_{i \in I} p_i y_i \quad \text{s.t.} \quad \sum_{i \in I} w_i y_i \leq c^\pi, \quad y_i \in \{0,1\}\ (\forall i \in I). \]

Proof. To each feasible solution $x$ of $(P^\pi_1)$, we can associate a feasible solution $y$ of $(K^\pi)$ selecting the same subset $I' \subseteq I$ of items ($y_i = \sum_{j \in J} x_{ij}$, i.e. $y_i = 1 \iff i \in I'$), hence having the same cost $\sum_{i \in I'} p_i$. Solution $y$ is feasible since $\sum_{i \in I} w_i y_i$ is integer and:
\[ \sum_{i \in I} w_i y_i = \sum_{i \in I} w_i \sum_{j \in J} x_{ij} \leq \sum_{i \in I} w_i \sum_{j \in J} \frac{\pi_j}{\pi_{j'}} x_{ij} \leq \sum_{j \in J} \frac{\pi_j}{\pi_{j'}} c_j. \]
Conversely, to each feasible solution $y$ of $(K^\pi)$, we associate a feasible solution $x$ of $(P^\pi_1)$ of the same cost, by selecting the same subset $I' \subseteq I$ of items and assigning them all to the single container $j'$:
\[ x_{ij} = \begin{cases} y_i & \forall i \in I,\ j = j' \\ 0 & \text{otherwise.} \end{cases} \]
Solution $x$ is feasible since
\[ \sum_{j \in J} \pi_j \sum_{i \in I} w_i x_{ij} = \pi_{j'} \sum_{i \in I} w_i y_i \leq \pi_{j'} \left\lfloor \frac{\sum_{j \in J} \pi_j c_j}{\pi_{j'}} \right\rfloor \leq \sum_{j \in J} \pi_j c_j. \]

Hence, the surrogate bound $z_1$ is the lowest optimal value of the knapsack problems $(K^\pi)$, with $\pi \in \mathbb{R}^J_{*+}$. Actually, all these knapsack problems are defined on the same set of items (sizes and values) but with different container capacities. The best bound is then obtained by considering the knapsack that is the most constrained (or the least relaxed), i.e. the knapsack with the lowest capacity $c^\pi$. The lowest capacity $\sum_{j \in J} c_j$ is achieved when $\pi_j = \pi_{j'}$ for all $j \in J$:

Proposition 6. $z_1$ is the optimal value of the following knapsack problem:
\[ (P_1): z_1 = \max\left\{\sum_{i \in I} p_i y_i \ \middle|\ \sum_{i \in I} w_i y_i \leq \sum_{j \in J} c_j,\ y \in \{0,1\}^I\right\}. \]

Proof. Consider the knapsack problem $(P_1) = (K^{\mathbf 1})$ corresponding to the vector $\mathbf 1 \equiv (1,\dots,1) \in \mathbb{R}^J_{*+}$; then $c^{\mathbf 1} = \sum_{j \in J} c_j \leq c^\pi$ for all $\pi \in \mathbb{R}^J_{*+}$. Consequently, every feasible solution of $(K^{\mathbf 1})$ is feasible in any $(K^\pi)$, and thus $z^{\mathbf 1}_1 \leq z^\pi_1$ for all $\pi \in \mathbb{R}^J_{*+}$.
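In code, the surrogate bound thus reduces to one call to the `knapsack_dp` sketch of Section 2.4 (both helper names are our own illustrations):

```python
def surrogate_bound(p, w, capacities):
    """Surrogate bound z1 (Proposition 6): a single 0-1 knapsack whose
    capacity aggregates all the containers."""
    return knapsack_dp(p, w, sum(capacities))
```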

3.4 LP Relaxation of the Surrogate Relaxation

Hence, the surrogate relaxation consists in aggregating all the containers into one, of capacity $\sum_{j \in J} c_j$, and selecting the items to pack into this big container. This 0-1 knapsack problem $(P_1)$ may be solved quite quickly using dynamic programming (or branch-and-bound). Its LP relaxation $(\bar P_1)$ also gives an upper bound $\bar z_1$ for the MKP. Interestingly, this bound is equal to $\bar z$.

Proposition 7. $\bar z_1 = \bar z$.

Proof. It is clear that, for any non-negative vector $\pi$, the LP relaxation $(\bar P^\pi_1)$ of the surrogate relaxation $(P^\pi_1)$ coincides with the surrogate relaxation of the LP relaxation $(\bar P)$, which is itself a relaxation of $(\bar P)$. Hence:
\[ \bar z_1 = \min_\pi \bar z^\pi_1 \geq \bar z. \]
Conversely, an optimal solution $y$ of $(\bar P_1)$ can easily be determined by Dantzig's theorem. Assume that the items are sorted in decreasing order of the ratio $p_i/w_i$ and let $s$ be the critical item $s = \min\{k \in I \mid \sum_{i=1}^k w_i > \sum_{j \in J} c_j\}$; then we define $y_i = 1$ for items $i < s$ and $y_s = \left(\sum_{j \in J} c_j - \sum_{i=1}^{s-1} w_i\right)/w_s$. The cost of this solution coincides with $\bar z$ according to Proposition 3.

3.5 Lagrangian Relaxation

The principle of Lagrangian relaxation also consists in merging a conjunction of constraints of a MILP into a linear expression. But contrary to the surrogate relaxation, the linear expression is not considered as a constraint within the problem but as a violation penalty cost in the objective function.

3.5.1 Relaxing knapsack constraints.

We first apply this principle to the MKP by relaxing the knapsack constraints. Let $\pi \in \mathbb{R}^J_+$ be a vector of non-negative multipliers; we define the following BIP:

\[ (P^\pi_2): z^\pi_2 = \max \sum_{i \in I}\sum_{j \in J} p_i x_{ij} - \sum_{j \in J} \pi_j\left(\sum_{i \in I} w_i x_{ij} - c_j\right) \quad \text{s.t.} \quad \sum_{j \in J} x_{ij} \leq 1\ (\forall i \in I), \quad x \in \{0,1\}^{I \times J}. \]

$(P^\pi_2)$ is a relaxation of $(P)$ since, for any feasible solution $x$ of $(P)$, $x$ is a feasible solution of $(P^\pi_2)$ and:
\[ \sum_{i \in I}\sum_{j \in J} p_i x_{ij} - \sum_{j \in J} \pi_j\left(\sum_{i \in I} w_i x_{ij} - c_j\right) \geq \sum_{i \in I}\sum_{j \in J} p_i x_{ij}. \]
$(P^\pi_2)$ is a Lagrangian subproblem of $(P)$ and $\pi$ is a price or Lagrangian multiplier.

Again, the goal of Lagrangian relaxation is to determine the best of these relaxations, i.e. to solve the following problem:
\[ (P_2): z_2 = \min\{z^\pi_2 \mid \pi \in \mathbb{R}^J_+\}. \]
$(P_2)$ and $(P)$ form a weak-dual pair since $z(x) \leq z^\pi_2$ for every feasible solution $x$ of $(P)$ and every feasible solution $\pi$ of $(P_2)$. Problem $(P_2)$ is actually called the Lagrangian dual problem of $(P)$. The difference $z_2 - z$ is called the duality gap. In some cases the duality gap is zero, and an optimal solution of $(P_2)$ may then lead to an optimal solution of $(P)$:

Proposition 8. Let $\pi \in \mathbb{R}^J_+$ and let $x^\pi$ be an optimal solution of $(P^\pi_2)$. If
\[ \sum_{i \in I} w_i x^\pi_{ij} \leq c_j\ \ \forall j \in J, \qquad \text{and} \qquad \sum_{i \in I} w_i x^\pi_{ij} = c_j\ \ \forall j \in J \text{ such that } \pi_j > 0, \]
then $x^\pi$ is optimal for $(P)$.

Proof. By assumption, $x^\pi$ is a feasible solution of $(P)$, and it is optimal since:
\[ z \leq z^\pi_2 = \sum_{i \in I}\sum_{j \in J} p_i x^\pi_{ij} - \sum_{j \in J} \pi_j\left(\sum_{i \in I} w_i x^\pi_{ij} - c_j\right) = \sum_{i \in I}\sum_{j \in J} p_i x^\pi_{ij} \leq z. \]

Note that the Lagrangian relaxation of a given set of constraints is always dominated by the surrogate relaxation of the same set of constraints. Indeed, each Lagrangian subproblem $(P^\pi_2)$ is clearly a relaxation of the surrogate subproblem $(P^\pi_1)$ associated with the same multiplier $\pi \in \mathbb{R}^J_+$: $z^\pi_1 \leq z^\pi_2$. Hence, we have $z_1 = \min_\pi z^\pi_1 \leq \min_\pi z^\pi_2 = z_2$.

In the context of the MKP, we see that this Lagrangian bound $z_2$ is not even better than the LP bound $\bar z$:

Proposition 9. $z_2 = \bar z$.

Proof. Observe that the formulation of $(P^\pi_2)$ is ideal (the matrix is totally unimodular and the right-hand side is integer), so $z^\pi_2 = \bar z^\pi_2 \geq \bar z$, and hence $z_2 \geq \bar z$. Conversely, it is clear that an optimal solution of $(P^\pi_2)$, with $\pi \in \mathbb{R}^J_{*+}$, can be obtained by determining a container $j'$ with the smallest $\pi_j > 0$, then by placing in this container alone all the items $i$ such that $p_i - \pi_{j'} w_i > 0$. Consider the subproblem with $\pi_j = p_s/w_s$ for all $j \in J$, where $s$ is the critical item of Proposition 3; then the optimum of $(P^\pi_2)$ is
\[ z^\pi_2 = \sum_{i \in I}\sum_{j \in J}\left(p_i - \frac{p_s}{w_s} w_i\right) x_{ij} + \sum_{j \in J} \frac{p_s}{w_s} c_j = \sum_{i=1}^{s-1}\left(p_i - \frac{p_s}{w_s} w_i\right) + \sum_{j \in J} \frac{p_s}{w_s} c_j. \]
Hence, $\bar z \leq z_2 \leq z^\pi_2 = \bar z$ according to Proposition 3.

Note that in this specific case we were able to determine analytically the optimal solution $\pi \equiv p_s/w_s$ of the Lagrangian dual problem. This is not always the case, as we will see in the next section. Incidentally, the vector $\pi \equiv p_s/w_s$ corresponds to the optimal dual variables associated with the knapsack constraints in the LP relaxation $(\bar P)$ (see Proposition 3).

3.5.2 Relaxing assignment constraints.

The principle of Lagrangian relaxation is useful for relaxing complicating or coupling constraints. A constraint is said to be coupling if the problem obtained by relaxing it naturally decomposes into a series of independent problems.

In the context of the MKP, the assignment constraints are coupling since, when they are relaxed, the problem decomposes into a series of $m$ independent 0-1 knapsack problems.

We want to solve the Lagrangian dual problem:
\[ (P_3): z_3 = \min\{z^\mu_3 \mid \mu \in \mathbb{R}^I_+\} \]
where the Lagrangian subproblem for a multiplier $\mu \in \mathbb{R}^I_+$ is defined by:
\[ (P^\mu_3): z^\mu_3 = \max \sum_{i \in I}\sum_{j \in J} p_i x_{ij} - \sum_{i \in I} \mu_i\left(\sum_{j \in J} x_{ij} - 1\right) \quad \text{s.t.} \quad \sum_{i \in I} w_i x_{ij} \leq c_j\ (\forall j \in J), \quad x \in \{0,1\}^{I \times J}. \]
Note that in this relaxation, one copy of each item is allowed to be placed in every container. Actually, $(P^\mu_3)$ can be decomposed into a series of $m$ independent 0-1 knapsack problems, one for each container $j \in J$:
\[ (K^\mu_j): k^\mu_j = \max \sum_{i \in I} (p_i - \mu_i) y_i \quad \text{s.t.} \quad \sum_{i \in I} w_i y_i \leq c_j, \quad y \in \{0,1\}^I. \]

Proposition 10.
\[ z^\mu_3 = \sum_{j \in J} k^\mu_j + \sum_{i \in I} \mu_i. \]

Proof. Let $y^j_i = x_{ij}$ for every item $i \in I$ and every container $j \in J$; then $x$ is a feasible solution of $(P^\mu_3)$ if and only if each $y^j$, $j \in J$, is a feasible solution of $(K^\mu_j)$. Furthermore, the cost of $x$ is the sum of the costs of the $y^j$ plus $\sum_{i \in I} \mu_i$:
\[ z^\mu_3 = \max\left(\sum_{i \in I}\sum_{j \in J} p_i x_{ij} - \sum_{i \in I} \mu_i\left(\sum_{j \in J} x_{ij} - 1\right)\right) = \max \sum_{i \in I}\sum_{j \in J} (p_i - \mu_i) x_{ij} + \sum_{i \in I} \mu_i = \sum_{j \in J} \max\left(\sum_{i \in I} (p_i - \mu_i) y^j_i\right) + \sum_{i \in I} \mu_i = \sum_{j \in J} k^\mu_j + \sum_{i \in I} \mu_i. \]

Note that all the knapsack problems $(K^\mu_j)$ are defined on the same set of items (with value $p_i - \mu_i$ and weight $w_i$) but on containers of different capacities. This can be exploited in order to solve the series of knapsacks efficiently for a given $\mu$. The bound $z^\mu_3$ can be computed in pseudo-polynomial time, or approximated in polynomial time by the sum of the Dantzig bounds of the LP relaxations $(\bar K^\mu_j)$.
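A minimal Python sketch of this decomposition (our own illustration, reusing the `knapsack_dp` helper of Section 2.4):

```python
def lagrangian_bound_assignment(p, w, capacities, mu):
    """z_3^mu (Proposition 10): relax the assignment constraints with
    multipliers mu, then solve one 0-1 knapsack per container on the
    reduced values p_i - mu_i."""
    reduced = [pi - mi for pi, mi in zip(p, mu)]
    # items with non-positive reduced value are never selected (y_i = 0)
    kept = [(rp, wi) for rp, wi in zip(reduced, w) if rp > 0]
    rp = [v for v, _ in kept]
    rw = [wi for _, wi in kept]
    return sum(knapsack_dp(rp, rw, cj) for cj in capacities) + sum(mu)
```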

Contrary to $(P_1)$ and $(P_2)$, it is not known how to determine analytically the optimal multiplier $\mu$ for $(P_3)$. An approximation of this optimum can be obtained through subgradient optimization or cutting-plane generation, which are, however, generally time-consuming. To save computational time, one may be interested in solving only one Lagrangian subproblem $(P^{\bar\mu}_3)$. The optimal dual variables $\bar\mu$ associated with the assignment constraints in $(\bar P)$ can be a good choice for the multiplier (see Proposition 3):
\[ \bar\mu_i = \begin{cases} p_i - \frac{w_i}{w_s} p_s & \forall i \in I,\ i < s \\ 0 & \forall i \in I,\ i \geq s. \end{cases} \]

Furthermore, considering this multiplier allows us to prove that the Lagrangian bound $z_3$ is always at least as good as the LP bound $\bar z$:

Proposition 11. $z_3 \leq z^{\bar\mu}_3 \leq \bar z^{\bar\mu}_3 = \bar z$.

Proof. Following the previous proposition, we have $\bar z^{\bar\mu}_3 = \sum_{j \in J} \bar k^{\bar\mu}_j + \sum_{i \in I} \bar\mu_i$, where $\bar k^{\bar\mu}_j$ is the optimum of $(\bar K^{\bar\mu}_j)$, the LP relaxation of $(K^{\bar\mu}_j)$. Furthermore, for each knapsack problem $(K^{\bar\mu}_j)$, we have $\frac{p_i - \bar\mu_i}{w_i} = \frac{p_s}{w_s}$ for all $i < s$ and $\frac{p_i - \bar\mu_i}{w_i} = \frac{p_i}{w_i} \leq \frac{p_s}{w_s}$ for all $i \geq s$. It means that the items are ordered for applying Dantzig's theorem. Furthermore, as $c_j \leq \sum_{j \in J} c_j$, only items $i \leq s$, i.e. items with the same maximal ratio $(p_i - \bar\mu_i)/w_i = p_s/w_s$, can be placed in container $j$. Let $s_j$ be the critical item for container $j$; then
\[ \bar k^{\bar\mu}_j = \sum_{i=1}^{s_j - 1} (p_i - \bar\mu_i) + (p_{s_j} - \bar\mu_{s_j}) \frac{c_j - \sum_{i=1}^{s_j - 1} w_i}{w_{s_j}} = \sum_{i=1}^{s_j - 1} w_i \frac{p_s}{w_s} + \frac{p_s}{w_s}\left(c_j - \sum_{i=1}^{s_j - 1} w_i\right) = \frac{p_s}{w_s} c_j. \]
Hence, $\bar z^{\bar\mu}_3 = \sum_{j \in J} \frac{p_s}{w_s} c_j + \sum_{i=1}^{s-1}\left(p_i - \frac{w_i}{w_s} p_s\right) = \bar z$ from Proposition 3.

3.6 Conclusion

The upper bounds that we have considered for the MKP can be ordered as follows:
\[ z \leq z_3 \leq z^{\bar\mu}_3 \leq z_2 = \bar z_1 = \bar z \leq z_0 \]
\[ z \leq z_1 \leq z_2 = \bar z_1 = \bar z \leq z_0 \]
No general dominance exists between the surrogate bound $z_1$ and the Lagrangian bound $z_3$. The surrogate bound $z_1$ can be computed in pseudo-polynomial time, by solving the 0-1 knapsack problem of Proposition 6, while the LP bound $\bar z$ can be computed in linear time, by solving the LP relaxation of this knapsack problem, i.e. by computing the formula of Proposition 3. The Lagrangian bound $z_3$ can be computed, or at least approximated, by iteratively solving several instances of $(P^\mu_3)$. Each instance can be solved in pseudo-polynomial time by solving a series of similar knapsack problems.

Note that it is easy to check whether a solution of a given instance $(P^\mu_3)$ is feasible for $(P)$: one has to ensure that each item appears in at most one sub-solution of the knapsack problems $(K^\mu_j)$, $j \in J$. Furthermore, if the solution is optimal for $(P^\mu_3)$ and if every item $i$ such that $\mu_i > 0$ is selected (in exactly one knapsack), then the solution is optimal for $(P)$. On the contrary, checking feasibility for the surrogate relaxation $(P_1)$ is NP-hard, since it consists in checking whether the subset of selected items (which is globally compatible with the set of containers) can be decomposed into $m$ subsets, each compatible with one of the containers. This problem is a generalization of the Bin-Packing Problem where the bins (containers) have different capacities.

3.7 Example

Consider the MKP with 6 items of values $p = (110, 150, 70, 80, 30, 5)$ and sizes $w = (40, 60, 30, 40, 20, 5)$, and 2 containers of capacities $c = (65, 85)$. First, the items are already sorted in decreasing order of the ratio $p_i/w_i$:
\[ \frac{11}{4} > \frac{5}{2} > \frac{7}{3} > 2 > \frac{3}{2} > 1. \]
The critical item is $s = 4$ since $40 + 60 + 30 + 40 = 170 > 150 = 65 + 85$, and the residual capacity is $\bar c = 150 - (40 + 60 + 30) = 20$.

The trivial bound is $z_0 = 110 + 150 + 70 + 80 + 30 + 5 = 445$.

For the LP bound, we compute the Dantzig bound of the 0-1 knapsack problem $(\bar P_1)$: $(1, 1, 1, \frac12, 0, 0)$ is optimal and its cost is $\bar z = 110 + 150 + 70 + \frac{20}{40}\, 80 = 370$.

For the surrogate bound, we solve to optimality the 0-1 knapsack problem $(P_1)$ with capacity 150: $(1, 1, 1, 0, 1, 0)$ is optimal and its cost is $z_1 = 110 + 150 + 70 + 30 = 360$. In this specific case, we have $z_1 < \bar z$.

For the Lagrangian bound approximation $z^{\bar\mu}_3$, we solve to optimality the 0-1 knapsack problems $(K^{\bar\mu}_1)$ and $(K^{\bar\mu}_2)$ with $\bar\mu = (110 - 40\cdot 2,\ 150 - 60\cdot 2,\ 70 - 30\cdot 2,\ 0,\ 0,\ 0) = (30, 30, 10, 0, 0, 0)$. $(0, 1, 0, 0, 0, 1)$ is optimal for $(K^{\bar\mu}_1)$ with cost $k^{\bar\mu}_1 = 125$, and $(1, 0, 0, 1, 0, 1)$ is optimal for $(K^{\bar\mu}_2)$ with cost $k^{\bar\mu}_2 = 165$; then $z^{\bar\mu}_3 = 125 + 165 + 70 = 360$. In this specific case, we have $z_1 = z^{\bar\mu}_3 < \bar z$. Note that the optimal solution of $(P^{\bar\mu}_3)$ is not feasible for $(P)$ since item 6 is selected in both knapsacks.
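Putting the earlier sketches together, these bounds can be reproduced as follows (all helper names are our own, defined in previous sections):

```python
p = [110, 150, 70, 80, 30, 5]
w = [40, 60, 30, 40, 20, 5]
c = [65, 85]
mu = [30, 30, 10, 0, 0, 0]

print(sum(p))                                    # z0 = 445
print(mkp_lp_bound(p, w, c))                     # z_bar = 370.0
print(surrogate_bound(p, w, c))                  # z1 = 360
print(lagrangian_bound_assignment(p, w, c, mu))  # z_3^mu = 360
```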

Relaxations and Bounds (II): Lagrangian Relaxation

M2 ORO: Advanced Integer Programming
Lecture 8

[email protected]

November 7, 2011

1 Lagrangian Relaxation

Consider the optimization problem $(P): z = \max\{cx \mid Ex \leq d,\ x \in X\}$, where $X$ is a set of vectors in $\mathbb{R}^n_+$, $c \in \mathbb{R}^n$, $E \in \mathbb{R}^{|J| \times n}$, $d \in \mathbb{R}^{|J|}$, and assume that the constraints $Ex \leq d$ are complicating, in the sense that problem $(P)$ without these constraints is easy to solve.

Definition 1. For any non-negative vector $\mu \in \mathbb{R}^J_+$, the Lagrangian subproblem associated with the multipliers $\mu$ by dualizing the constraints $Ex \leq d$ in $(P)$ is defined by:
\[ (P_\mu): z_\mu = \max\{cx - \mu(Ex - d) \mid x \in X\}. \]
The Lagrangian dual of $(P)$ relative to the constraints $Ex \leq d$ is defined by:
\[ (L): u = \min\{z_\mu \mid \mu \in \mathbb{R}^J_+\}. \]
$u$ is called the Lagrangian bound of $(P)$.

Clearly, any Lagrangian subproblem is a relaxation of $(P)$, and the Lagrangian dual problem is to determine the best such relaxation:

Proposition 1. $z \leq u \leq z_\mu$, $\forall \mu \in \mathbb{R}^J_+$.

Proof. Any feasible solution $x$ of $(P)$ is feasible for $(P_\mu)$ for any $\mu \in \mathbb{R}^J_+$, and $cx \leq cx - \mu(Ex - d)$.

Conversely, a solution $x_\mu$ of a Lagrangian subproblem $(P_\mu)$ is almost feasible for $(P)$: it satisfies all the constraints of $(P)$ except, possibly, some constraints in $Ex \leq d$. Furthermore, any such violated constraint $j$ is penalized by a negative cost $-\mu_j(E_j x_\mu - d_j)$ within the objective function of $(P_\mu)$. Note, however, that an optimal solution of a Lagrangian subproblem that is feasible for $(P)$ is not necessarily optimal for $(P)$. Indeed, according to the following proposition, $(\mu, x_\mu)$ must also satisfy the complementary slackness condition $\mu(Ex_\mu - d) = 0$:

Proposition 2. If $x_\mu$ is an optimal solution of $(P_\mu)$ such that:
\[ E_j x_\mu \leq d_j \quad \forall j \in J \qquad (1) \]
\[ E_j x_\mu = d_j \quad \forall j \in J \text{ such that } \mu_j > 0 \qquad (2) \]
then $x_\mu$ is optimal for $(P)$.

Proof. By assumption (1), $x_\mu$ is feasible for $(P)$ and, by assumption (2), it is optimal since $z \leq z_\mu = cx_\mu - \mu(Ex_\mu - d) = cx_\mu \leq z$.

Note that when the constraints to dualize are equality constraints $Ex = d$, the multipliers are no longer restricted to be non-negative: $(L): u = \min\{z_\mu \mid \mu \in \mathbb{R}^J\}$. Furthermore, any optimal solution of $(P_\mu)$ that is feasible for $(P)$ is then necessarily optimal for $(P)$, since $z \leq z_\mu = cx_\mu - \mu(Ex_\mu - d) = cx_\mu \leq z$.

1.1 Strength of the Lagrangian Bound

The following theorem interprets the dual relaxation $(L)$, defined on the dual space $\mu \in \mathbb{R}^J_+$, as a primal relaxation, defined on the primal space $x \in \mathbb{R}^n_+$.

Theorem 3.
\[ u = \max\{cx \mid Ex \leq d,\ x \in \mathrm{conv}(X)\}. \]

Proof. If $X$ is empty then all the problems $(P)$ and $(P_\mu)$ are infeasible, and $u = -\infty$. Otherwise, assume that $X$ contains a finite set of points $X = \{x^1, \dots, x^T\}$; then:
\[ u = \min\left\{\max_{t=1..T}\left(cx^t - \mu(Ex^t - d)\right) \,\middle|\, \mu \in \mathbb{R}^J_+\right\} = \min\left\{y \,\middle|\, y \geq cx^t - \mu(Ex^t - d)\ (\forall t = 1,\dots,T),\ \mu \in \mathbb{R}^J_+,\ y \in \mathbb{R}\right\}. \]
This is an LP, so by strong duality we have:
\[ u = \max\left\{\sum_{t=1}^T (cx^t)\lambda_t \,\middle|\, \sum_{t=1}^T \lambda_t = 1,\ \sum_{t=1}^T (E_j x^t - d_j)\lambda_t \leq 0\ (\forall j \in J),\ \lambda \in \mathbb{R}^T_+\right\}. \]
As $\mathrm{conv}(X) = \{\sum_{t=1}^T x^t \lambda_t \mid \sum_{t=1}^T \lambda_t = 1,\ \lambda \in \mathbb{R}^T_+\}$, the result follows. It also holds when $X$ is not a finite set.

Corollary 4. If $(P)$ is an LP then $(L)$ is equivalent to $(P)$. Furthermore, the optimal dual values $\mu^*$ associated with the constraints $Ex \leq d$ are the optimal multipliers of $(L)$.

Proof. $(L) \equiv (P)$ is a direct consequence of the theorem: when $X = \{x \in \mathbb{R}^n_+ \mid Ax \leq b\}$ then $\mathrm{conv}(X) = X$. It can also be proved by considering the LP duals $(D)$ of $(P)$ and $(D_\mu)$ of $(P_\mu)$. Indeed, if $(P)$ is an LP then any Lagrangian subproblem is an LP $(P_\mu): z_\mu = \max\{(c - \mu E)x + \mu d \mid Ax \leq b,\ x \in \mathbb{R}^n_+\}$, whose LP dual is defined by $(D_\mu): w_\mu = \min\{\lambda b + \mu d \mid \lambda A \geq c - \mu E,\ \lambda \geq 0\}$. As the LP dual of $(P)$ is defined by:
\[ (D): w = \min\{\lambda b + \mu d \mid \lambda A + \mu E \geq c,\ \lambda, \mu \geq 0\} = \min\{w_\mu \mid \mu \geq 0\}, \]
then, by strong duality, we have
\[ (L) \equiv \min_{\mu \geq 0}(P_\mu) \equiv \min_{\mu \geq 0}(D_\mu) \equiv (D) \equiv (P). \]
Furthermore, if $(\lambda^*, \mu^*)$ is optimal for $(D)$ then $\mu^*$ is optimal for $(L)$.

Corollary 5. If $(P)$ is an ILP then the Lagrangian relaxation $(L)$ is at least as good as the LP relaxation $(\bar P)$.

Proof. Let $X = \{x \in \mathbb{Z}^n_+ \mid Ax \leq b\}$; then
\[ \{x \in \mathbb{Z}^n_+ \mid Ax \leq b,\ Ex \leq d\} \subseteq \mathrm{conv}\{x \in \mathbb{Z}^n_+ \mid Ax \leq b,\ Ex \leq d\} \subseteq \mathrm{conv}(X) \cap \{x \in \mathbb{R}^n_+ \mid Ex \leq d\} \subseteq \{x \in \mathbb{R}^n_+ \mid Ax \leq b,\ Ex \leq d\}. \]
Hence, $(\bar P)$ is a relaxation of $(L)$, which is a relaxation of $(P)$: $z \leq u \leq \bar z$.

In some cases, when the formulation of $X$ is integral, the two relaxations are equivalent:

Corollary 6. If $(P)$ is an ILP with $X = \{x \in \mathbb{Z}^n_+ \mid Ax \leq b\}$ and $\mathrm{conv}(X) = \{x \in \mathbb{R}^n_+ \mid Ax \leq b\}$, then $(L) \equiv (\bar P)$.

Even in the case where $(L)$ is not better than $(\bar P)$, this corollary is of practical interest since it provides an alternative to the simplex method for solving, or at least approximating, $(\bar P)$. Consider for example the case where $\mathrm{conv}(X)$ is defined by an exponential number of constraints, so that the LP $\max\{cx \mid Ex \leq d,\ x \in \mathrm{conv}(X)\}$ cannot be solved directly, but where any Lagrangian subproblem $\max\{cx - \mu(Ex - d) \mid x \in X\}$ is a classical combinatorial problem solvable in polynomial time. Then the LP can be approximated by iteratively solving several Lagrangian subproblems.

2 Solving the Lagrangian Dual Problem

The dual problem $(L)$ is a min-max problem. The question is then: how to solve $(L)$ without solving all the Lagrangian subproblems $(P_\mu)$?

When $(P)$ is an integer linear program, Theorem 3 indicates two ways to solve the Lagrangian dual problem $(L)$: either by solving a linear program, or by minimizing the convex function $\mu \mapsto z_\mu$. In both cases, the algorithms consist in iteratively solving subproblems $(P_\mu)$ for different values of the multiplier $\mu$, then recording the best (i.e. the lowest) optimum $z_{\mu^*}$ found. They result only in an approximation (an upper bound) of the Lagrangian dual optimum $u$, except in the case where $(\bar P) \equiv (L)$, where we find an optimal solution of the subproblem $(P_{\mu^*})$ that is optimal for $(\bar P)$. The two algorithms differ in the way the vector $\mu$ is updated at each iteration.

2.1 Cutting-plane generation

Theorem 3 shows that $(L)$ can be formulated as an LP with a potentially exponential number of constraints:
\[ u = \min\{y \mid y \geq cx^t - \mu(Ex^t - d)\ (\forall t),\ \mu \in \mathbb{R}^J_+,\ y \in \mathbb{R}\}. \]
Solving this LP requires a cutting-plane generation algorithm: start with an arbitrary multiplier $\mu^0 \in \mathbb{R}^J_+$; then, at each iteration $t \geq 1$, a new constraint $y \geq cx^t - \mu(Ex^t - d)$ is added to the LP, where $x^t$ is an optimal solution of the Lagrangian subproblem $(P_{\mu^{t-1}})$ and $(\mu^{t-1}, y^{t-1}) \in \mathbb{R}^J_+ \times \mathbb{R}$ is the optimal solution of the LP computed at the previous iteration $t-1$. The algorithm stops when the constraint to add is already satisfied by the current LP solution:

Proposition 7. At iteration $t > 1$, if $y^{t-1} \geq cx^t - \mu^{t-1}(Ex^t - d)$ then $u = y^{t-1} = z_{\mu^{t-1}}$.

Proof. Let $(LP_{t-1})$ denote the LP at iteration $t-1$:
\[ (LP_{t-1}): \min\{y \mid y \geq cx^s - \mu(Ex^s - d)\ (\forall s = 1,\dots,t-1),\ (\mu, y) \in \mathbb{R}^J_+ \times \mathbb{R}\}. \]
On one hand, as $(\mu^{t-1}, y^{t-1})$ is an optimal solution of $(LP_{t-1})$, there exists $s \in \{1,\dots,t-1\}$ such that $y^{t-1} = cx^s - \mu^{t-1}(Ex^s - d)$. As $x^t$ is an optimal solution and $x^s$ a feasible solution of $(P_{\mu^{t-1}})$, we have:
\[ z_{\mu^{t-1}} = cx^t - \mu^{t-1}(Ex^t - d) \geq cx^s - \mu^{t-1}(Ex^s - d) = y^{t-1}. \]
By assumption, we conclude that $y^{t-1} = z_{\mu^{t-1}}$. On the other hand, by definition of $z_\mu$, it is easy to see that $(\mu, z_\mu)$ is a feasible solution of $(LP_{t-1})$ for all $\mu \in \mathbb{R}^J_+$. Since $(\mu^{t-1}, z_{\mu^{t-1}})$ is optimal (i.e. minimal) for $(LP_{t-1})$, this implies that $z_{\mu^{t-1}} \leq z_\mu$ for all $\mu \in \mathbb{R}^J_+$, and we conclude that $u = z_{\mu^{t-1}}$.

This method is often time-consuming because it may require an exponential number of iterations, and the LP becomes bigger and harder to solve at each iteration. Obviously, the algorithm can be stopped at any moment, after a given elapsed time or a given number of iterations. As the value of the LP decreases at each iteration, the last LP optimum computed gives an upper bound on $u$.

2.2 Subgradient algorithm

It is not hard to see that the function $\mu \mapsto z_\mu$ is piecewise linear and convex but non-differentiable: indeed, for a sufficiently small change of $\mu$, say $\mu + \epsilon$, the optimal solution of $(P_\mu)$ remains optimal for $(P_{\mu+\epsilon})$, and the difference of the optimal values $z_{\mu+\epsilon} - z_\mu$ is proportional to $\epsilon$.

The subgradient algorithm is a simple and easy-to-implement approach for minimizing such a function. The algorithm starts with an arbitrary multiplier $\mu^0 \in \mathbb{R}^J_+$; then, at each iteration $t \geq 1$, a new multiplier is computed as $\mu^t = \max(0,\ \mu^{t-1} + \lambda_t(Ex^t - d))$, where $x^t$ is an optimal solution of the Lagrangian subproblem $(P_{\mu^{t-1}})$ and where $(\lambda_t)_{t \geq 0}$ are non-negative values called the step lengths. The best bound $z_{\mu^t}$ computed so far is returned at the end, after a fixed number of iterations or when no improvement is achieved. The step lengths can be chosen according to one of the three rules below:

Proposition 8.
1. if $\lim_{t\to\infty} \lambda_t = 0$ and $\sum_{t\geq 0} \lambda_t \to \infty$, then $\lim_{t\to\infty} z_{\mu^t} = u$;
2. if $\lambda_t = \lambda_0 \rho^t$ for sufficiently large values of $\rho < 1$ and $\lambda_0$, then $\lim_{t\to\infty} z_{\mu^t} = u$;
3. if $\lambda_{t+1} = \epsilon_t (z_{\mu^t} - \bar u) / \|Ex^t - d\|^2$, where $\bar u$ is an upper bound of $u$ and $0 < \epsilon_t \leq 2$, then either $\lim_{t\to\infty} z_{\mu^t} = u$ or the algorithm finds some $\mu^t$ such that $u \leq z_{\mu^t} \leq \bar u$.

The first rule is not of real practical interest, as convergence is ensured but slow (because the series $\sum_{t\geq 0} \lambda_t$ must be divergent). The two other rules lead to much faster convergence but have disadvantages too. For the second rule, the geometric step lengths $\lambda_t$ may tend to zero too rapidly, and the sequence $(\mu^t)$ may then converge before reaching an optimal point. The difficulty when applying the last rule is the requirement to know the bound $\bar u$, but such a bound can be obtained by applying a heuristic to $(P)$. This rule is often applied in practice by setting $\epsilon_0 = 2$ and halving $\epsilon_t$ whenever $z_{\mu^t}$ has failed to decrease in some fixed number of iterations.

3 Embedding Lagrangian Relaxation within Branch-and-Bound

Solving the Lagrangian dual problem $(L)$ is generally not enough to solve the initial ILP $(P)$ since, in most cases, there exists a duality gap $u - z > 0$. Furthermore, the iterative algorithms above generally terminate before the optimum $u$ is attained. Hence, a branch-and-bound approach is required in order to solve $(P)$ to optimality. The Lagrangian relaxation can be used at each node of the search tree to provide an upper bound. Furthermore, it is also possible to derive from the Lagrangian relaxation both a heuristic feasible solution and fixed variables.

3.1 Lagrangian Heuristic

The best Lagrangian solution $x_{\mu^*}$ computed is close to being feasible for $(P)$. It is often easy to devise a heuristic that converts $x_{\mu^*}$ into a feasible solution without greatly decreasing its cost. Such a problem-specific method might be called a Lagrangian heuristic. A well-known example of Lagrangian heuristic applies to the Generalized Assignment Problem, where $m$ items must be packed into $n$ containers:
\[ (GAP): z = \max\left\{\sum_{i=1}^m \sum_{j=1}^n c_{ij} x_{ij} \,\middle|\, \sum_{j=1}^n x_{ij} \leq 1\ (\forall i = 1,\dots,m),\ \sum_{i=1}^m a_{ij} x_{ij} \leq b_j\ (\forall j = 1,\dots,n),\ x \in \{0,1\}^{m\times n}\right\}. \]
By dualizing the first set of constraints, requiring that each item be used at most once, we get:
\[ (GAP_\mu): z_\mu = \max\left\{\sum_{i=1}^m \mu_i + \sum_{i=1}^m \sum_{j=1}^n (c_{ij} - \mu_i) x_{ij} \,\middle|\, \sum_{i=1}^m a_{ij} x_{ij} \leq b_j\ (\forall j = 1,\dots,n),\ x \in \{0,1\}^{m\times n}\right\}. \]
An optimal solution $x_\mu$ of $(GAP_\mu)$ defines two types of items: the ones that are packed into at most one container ($\sum_{j=1}^n x_{ij} \leq 1$) and the other ones ($\sum_{j=1}^n x_{ij} > 1$). $x_\mu$ can easily be made feasible by removing each item of the second type from all but one container. The best repair consists in choosing, for each such item $i$, a container $j$ with $x_{ij} = 1$ that maximizes $c_{ij}$.

Consider the Set Covering Problem as a second example, where each item $j = 1,\dots,n$ must be covered by at least one selected set $i = 1,\dots,m$:
\[ (SCP): z = \min\left\{\sum_{i=1}^m c_i x_i \,\middle|\, \sum_{i=1}^m a_{ij} x_i \geq 1\ (\forall j = 1,\dots,n),\ x \in \{0,1\}^m\right\} \]
and let us dualize all the cover constraints. An optimal solution $x_\mu$ of the Lagrangian subproblem $(SCP_\mu)$ is easy to find: $x_{\mu,i} = 1$ if $c_i - \sum_{j=1}^n \mu_j a_{ij} \leq 0$ and $x_{\mu,i} = 0$ otherwise. Then a feasible solution of $(SCP)$ can be computed by solving (heuristically) a much smaller instance of the set covering problem, obtained by removing from $(SCP)$ all the sets $i$ selected in $x_\mu$ (i.e. $x_{\mu,i} = 1$) and all the items $j$ covered by $x_\mu$ (i.e. $\sum_{i=1}^m a_{ij} x_{\mu,i} \geq 1$). A feasible solution of $(SCP)$ is given by $x_\mu + x'$ where $x'$ is a feasible solution of the restricted set covering problem.

Note that if the Lagrangian heuristic runs fast, then it can be applied at each iteration of the Lagrangian dual solution algorithm (e.g. the subgradient method).

3.2 Variable fixing

Variable fixing is a very useful technique to accelerate a branch-and-bound by identifying the variables that can be set to their bounds. In this section, we consider that $(P)$ is a Binary Integer Program, i.e. $X \subseteq \{0,1\}^I$. The technique tries to find the variables which must necessarily be set to 0 or to 1 in order to improve the current incumbent.

Given an incumbent value, i.e. a lower bound $\underline z$ of $z$, any better feasible solution $x$ satisfies:
\[ \sum_{i \in I} (c_i - \mu E_i) x_i + \mu d = cx - \mu(Ex - d) \geq cx > \underline z. \]
Let us partition the variables as follows: $I_1 = \{i \in I \mid c_i - \mu E_i > 0\}$ and $I_0 = \{i \in I \mid c_i - \mu E_i < 0\}$. If we do not consider the feasibility constraints, the maximum value of $\sum_{i \in I} (c_i - \mu E_i) x_i + \mu d$ is attained by setting $x_i = 1$ if $i \in I_1$ and $x_i = 0$ if $i \in I_0$.

Proposition 9. Let $j \in I$ and let $x$ be a feasible solution of value $cx$ strictly greater than $\underline z$; then:
\[ x_j = \begin{cases} 0 & \text{if } j \in I_0 \text{ and } \sum_{i \in I_1 \cup \{j\}} (c_i - \mu E_i) + \mu d \leq \underline z \\ 1 & \text{if } j \in I_1 \text{ and } \sum_{i \in I_1 \setminus \{j\}} (c_i - \mu E_i) + \mu d \leq \underline z. \end{cases} \]

As with Lagrangian heuristics, variable fixing can be applied at each iteration of the subgradient algorithm. Actually, it has more chance to be active in the last iterations, when $z_\mu$ is close to $z$.
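A minimal Python sketch of this fixing test (our own illustration; `E` is the list of dualized constraint rows and `z_lb` the incumbent value $\underline z$):

```python
def fix_variables(c, E, d, mu, z_lb):
    """Lagrangian variable fixing (Proposition 9): return {i: value} for
    the variables provably fixed in any solution of value > z_lb."""
    n = len(c)
    # reduced costs c_i - mu.E_i (E_i = column i of E)
    rc = [c[i] - sum(mu_j * Ej[i] for mu_j, Ej in zip(mu, E))
          for i in range(n)]
    mu_d = sum(mu_j * dj for mu_j, dj in zip(mu, d))
    base = sum(r for r in rc if r > 0) + mu_d   # unconstrained maximum
    fixed = {}
    for i in range(n):
        if rc[i] < 0 and base + rc[i] <= z_lb:
            fixed[i] = 0    # forcing x_i = 1 cannot beat the incumbent
        elif rc[i] > 0 and base - rc[i] <= z_lb:
            fixed[i] = 1    # dropping x_i loses too much value
    return fixed
```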

3.3 Branching Strategy

Within a branch-and-bound, it can be helpful to exploit the structure of the complicating constraints in the branching strategy, in such a way that each branching decision "solves" a complicating constraint (the constraint is entailed in the subtree). Furthermore, one may design the branching strategy so that the Lagrangian subproblems at each node have the same structure as, and are not harder to solve than, the Lagrangian subproblems at the root node. Lastly, the last Lagrangian subproblem solved at the current node can be reused in the child nodes: its solution helps to determine the next variable to branch on, while the current multiplier can be set as the initial multiplier value for the computation of the Lagrangian bound in the child nodes.

For example, in the case of the (GAP), a natural branching strategy is to choose at each node an item $i$ and create $n+1$ branches: the first $n$ branches correspond to a decision $x_{ij} = 1$ for $j = 1,\dots,n$ (item $i$ is assigned to container $j$) and the last branch corresponds to the decision $\sum_{j=1}^n x_{ij} = 0$ (item $i$ is not selected). A second strategy (called GUB branching) consists in creating only two branches by selecting a subset of containers $J' \subsetneq J$ and posting $\sum_{j \in J'} x_{ij} = 0$ in the first branch and $\sum_{j \in J \setminus J'} x_{ij} = 0$ in the second branch. Clearly, these two branching strategies satisfy the desired properties. For the choice of the item $i$ to branch on, we might consider the Lagrangian subproblem solution $x_{\mu^*}$ at the current node in order to strengthen the bounds in the child nodes as much as possible, and choose the item $i$ that has the largest penalty cost $\mu^*_i\left(\sum_{j=1}^n x_{\mu^*,ij} - 1\right) > 0$: in any child node, as the constraint $\sum_{j=1}^n x_{ij} \leq 1$ is satisfied, this penalty cost is eliminated when solving the Lagrangian subproblem $(GAP_{\mu^*})$ at the first iteration of the subgradient algorithm.
