Optimization Models for Communication Networks
home.deib.polimi.it/capone/pioro/pioro2014.pdf
m L. Lasdon: Optimization Theory for Large Systems, Macmillan, 1972
m M. Minoux: Mathematical Programming: Theory and Algorithms, J. Wiley, 1986
m L.A. Wolsey: Integer Programming, J. Wiley, 1998
m M. Pióro and D. Medhi: Routing, Flow, and Capacity Design in Communication and Computer Networks, Morgan Kaufmann, 2004
m M. Pióro: Network Optimization Techniques, Chapter 18 in Mathematical Foundations for Signal Processing, Communications, and Networking, E. Serpedin, T. Chen, D. Rajan (eds.), CRC Press, 2012
CONTENTS I
1. Basics of optimization theory. Classification of optimization problems. Relaxation and duality. The role of convexity.
2. Multicommodity flow network (MCFN) problems – linear and mixed-integer problem formulations. Link-path vs. node-link formulations. Allocation vs. dimensioning problems. Various cases of routing, modular links.
3. Linear programming (LP). Basic notions and properties of LP problems. Simplex method – basic algorithm and its features.
4. Mixed-integer programming (MIP) and its relation to LP. Branch-and-bound (B&B) method and algorithms for problems involving binary variables. Extensions to the general MIP formulation.
5. Modeling non-linearities. Convex and concave objective functions and the crucial differences between the two. Step-wise link capacity/cost functions.
m set X ⊆ Rn is
n bounded: contained in a ball B(0,r) = { x ∈ Rn : ||x|| ≤ r }
n closed: for any sequence { xn ∈ X, n=1,2,… }, lim xn ∈ X (if the limit exists); X = closure(X)
n compact: bounded and closed (every sequence contains a convergent subsequence)
n open: x ∈ X ⇒ ∃ r > 0, B(x,r) ⊆ X (Rn \ X is closed)
m function f: X → R is continuous (X - closed) f(lim xn) = lim f(xn)
m extreme value (Weierstrass) theorem:
n assumptions: continuous function f: X → R, X compact
n f achieves its global maximum and global minimum on X.
m linear function: f(x) = a1x1 + a2x2 + … + anxn = ax
n (n-1)-dimensional hyperplane: ax = c
n half-space: ax ≤ c
m simplex in R3: S = { (x1,x2,x3): x1 + x2 + x3 ≤ 1, x1 ≥ 0, x2 ≥ 0, x3 ≥ 0 }
m polyhedron: { x ∈ Rn: Ax ≤ b } (m inequality constraints: A is an m × n matrix)
m functions: quadratic, square root, linear, …
m set X ⊆ Rn is convex iff
n for each pair of points x, y ∈ X, the segment [x,y] ⊆ X, i.e., { (1-α)x + αy : 0 ≤ α ≤ 1 } ⊆ X
n conv(X) – convex hull of a (non-convex) X: the smallest convex set including X
n conv(X) – the set of all convex combinations of the finite subsets of X
m function f: X → R is convex (for convex X) iff
n for each x, y ∈ X and for each scalar α (0 ≤ α ≤ 1): f((1-α)x + αy) ≤ (1-α)f(x) + αf(y)
m strictly convex: if < holds for 0 < α < 1 (and x ≠ y)
m examples: f(x) = x2 on R, f(x) = max{ akx + bk } on R
m convex ⇒ continuous (on the interior of X)
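The defining inequality is easy to spot-check numerically. A minimal sketch (the sample interval, trial count and tolerance are arbitrary choices, not from the lecture):

```python
import random

def is_convex_on_samples(f, lo, hi, trials=1000, tol=1e-9):
    """Spot-check f((1-a)x + a*y) <= (1-a)f(x) + a*f(y) at random points."""
    rng = random.Random(0)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        a = rng.uniform(0.0, 1.0)
        if f((1 - a) * x + a * y) > (1 - a) * f(x) + a * f(y) + tol:
            return False        # inequality violated: not convex
    return True

# the two example functions above pass; -x^2 (concave) fails the check
print(is_convex_on_samples(lambda x: x * x, -10, 10))                   # True
print(is_convex_on_samples(lambda x: max(3 * x - 1, -x + 2), -10, 10))  # True
print(is_convex_on_samples(lambda x: -x * x, -10, 10))                  # False
```

Such sampling can only refute convexity, never prove it; it is a sanity check, not a certificate.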
m function f: X → R is concave (for convex X) iff –f is convex
m optimization problem (OP):
n minimize F(x), F: X → R – objective function
n x ∈ X, X ⊆ Rn – optimization space, feasible set
n x = (x1,x2,…,xn) ∈ Rn – variables
m convex problem (CXP):
n X – convex set
n F – convex function
n effectively tractable
m linear programming (LP) problem:
n a very special convex problem (X – polyhedron, F – linear function)
n efficient methods (simplex method)
m non-convex problems:
n (mixed) integer programming problems (MIP) (LP with discrete variables)
n linear constraints and concave objective function (CVP)
m Lagrangean function (one vector of primal variables and two vectors of dual variables):
n L(x; π,λ) = F(x) + Σi λihi(x) + Σj πjgj(x)
n x ∈ X, λ – unconstrained in sign, π ≥ 0
m dual function:
n W(π,λ) = minx∈X L(x; π,λ)
n λ – unconstrained in sign, π ≥ 0
n Dom(W) = { (π,λ): λ – unconstrained in sign, π ≥ 0, minx∈X L(x; π,λ) > -∞ }
n note that when X is compact, minx∈X L(x; π,λ) > -∞
m dual problem (D): finding the best relaxation of (P)
n maximize W(π,λ)
n subject to (π,λ) ∈ Dom(W)
nodes (vertices): V = {v1,v2,v3,v4}
links (edges/arcs): E = {e1,e2,e3,e4,e5}, undirected/directed, capacity ce, cost ξe
demands: D = {d1,d2}, undirected/directed, end nodes, volume hd
undirected paths for d1 = {v1,v4}: P11 = {e1,e4}, P12 = {e2,e5}, P13 = {e1,e3,e5}, P14 = {e2,e3,e4}
directed paths for d1 = {v1,v4}: P11 = {e1,e5}, P12 = {e2,e6}, P13 = {e1,e3,e6}, P14 = {e2,e4,e5}
link-path incidence: δedp = 1 when link e belongs to path p of demand d
node-link incidence: ave = 1 when link e originates at node v
flow allocation problem (FAP) link-path formulation
m indices n d=1,2,…,D demands
n p=1,2,…,Pd paths for flows realizing demand d n e=1,2,…,E links
m constants n hd volume of demand d n ce capacity of link e n ξe unit flow cost on link e n δedp = 1 if e belongs to path p realizing demand d; 0, otherwise
the number of paths in a graph grows exponentially with its size, so we simply cannot put them all on the path lists!
5 × 5 Manhattan network: 840 shortest-hop paths between two opposite corners
[figure: example network with one demand of volume h = 1 + ε and four links of capacity c = 1 + ε]
all 10 demands but one have h = 1; all 10 links but four have capacity 1
how should we know that the thick path must be used to get the optimal solution?
The number of shortest paths from s to t (each shortest path has 2(n-1) links) is equal to the binomial coefficient C(2n-2, n-1). In the above example it is C(4,2), i.e., 6. In general, when we have n × m nodes (n in the horizontal direction and m in the vertical), the formula reads C(n+m-2, m-1), which is equal to C(m+n-2, n-1).
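The binomial-coefficient formula can be evaluated directly; a small sketch using Python's math.comb (the extra grid sizes are illustrative):

```python
from math import comb

def grid_shortest_paths(n, m):
    """Number of shortest paths across an n-by-m grid of nodes:
    C(n+m-2, m-1), which equals C(n+m-2, n-1) by symmetry."""
    return comb(n + m - 2, m - 1)

print(grid_shortest_paths(3, 3))  # 6, the C(4,2) example above
print(grid_shortest_paths(4, 7) == grid_shortest_paths(7, 4))  # True
```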
n v=1,2,... ,V nodes n e=1,2,...,E links (directed arcs)
m constants n hd volume of demand d n sd, td source, sink node of demand d n ave = 1 if arc e originates at node v; 0, otherwise n bve = 1 if arc e terminates in node v; 0, otherwise n ce capacity of arc e
n v=1,2,…,V nodes
n e=1,2,…,E links (undirected)
n a=1,2,…,A arcs (for bi-directed links)
n e′, e″ – the two oppositely directed arcs of link e
m constants
n hd volume of demand d
n sd, td source and sink node of demand d
n ava = 1 if arc a originates at node v; 0 otherwise
n bva = 1 if arc a terminates at node v; 0 otherwise
n ce capacity of link e
n v,t nodes n e arcs n v ≠ t demands (w.l.o.g. all demand pairs assumed)
m constants n hvt volume of demand from node v to node t n Ht = Σv ∈ V\{t} hvt total demand volume to node t n ave incidence coefficients for arcs originating at node v n bve incidence coefficients for arcs terminating at node v n ce capacity of arc e
#variables and #constraints:
L-P: P×V(V-1) = O(V2) variables, V(V-1) + (k×V)/2 = O(V2) constraints
N-L: (k×V × V(V-1))/2 = O(V3) variables, V×V(V-1) + (k×V)/2 = O(V3) constraints
A/N-L: (k×V×V)/2 = O(V2) variables, V×V + (k×V)/2 = O(V2) constraints
L-P advantages:
m more general than N-L (e.g., hop-limit, flow restoration)
m path-flows directly calculated
m more effective for known paths
N-L advantages:
m no need to bother about paths
m compact
L-P disadvantages:
m initial path sets
m need for path generation
m non-compact
N-L disadvantages:
m less general
m need for finding optimal path flows
m variables
n ued ∈ {0,1} binary variable associated with the flow of demand d on link e
m constraints
n Σe aev ued - Σe bev ued = 1 if v = sd; 0 if v ≠ sd,td; -1 if v = td (v=1,2,…,V, d=1,2,…,D)
n Σd hdued ≤ ce, e=1,2,…,E
now a hop limit can be introduced (how?)
easy in the L-P formulation, also for the splittable case
the aggregated formulation cannot be (easily) adapted to this case
m for a fixed d: minimize F = Σp (Σe ξeδedp)xdp = Σp κdpxdp (optimal value: αdhd)
n Σp xdp = hd, d=1,2,…,D
m solution: F = Σd αdhd, where
κdp – cost of path p of demand d
αd – cost of the cheapest (shortest with respect to ξe) path of demand d; let p(d) be such a path (Σe∈Pdp(d) ξe = αd)
m solution: put the whole demand volume on the shortest path:
n x*dp(d) := hd ; x*dp := 0 for p ≠ p(d)
n we may also split hd arbitrarily over the shortest paths for d
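The rule above (put the whole volume hd on the cheapest path p(d)) can be sketched as follows; the link costs and the path list are illustrative, not the lecture's example network:

```python
# unit flow costs xi_e (illustrative)
xi = {'e1': 1.0, 'e2': 2.0, 'e3': 0.5, 'e4': 3.0, 'e5': 1.0}

def cheapest_path(paths, xi):
    """Return (p(d), alpha_d): the path minimizing kappa_dp = sum of xi_e."""
    costs = {p: sum(xi[e] for e in links) for p, links in paths.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]

def allocate(paths, xi, h_d):
    """Put the whole demand volume h_d on the cheapest path, zero elsewhere."""
    p_d, alpha_d = cheapest_path(paths, xi)
    return {p: (h_d if p == p_d else 0.0) for p in paths}, alpha_d * h_d

paths_d1 = {'P11': ['e1', 'e4'], 'P12': ['e2', 'e5'], 'P13': ['e1', 'e3', 'e5']}
x, F = allocate(paths_d1, xi, h_d=5.0)
print(x, F)  # whole volume on P13 (path cost 2.5), objective F = 12.5
```

When several paths tie for the minimum, any split of hd among them is optimal, as the slide notes.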
m indices
n j=1,2,…,n variables
n i=1,2,…,m equality constraints
n k=1,2,…,p inequality constraints
m constants
n c = (c1,c2,…,cn) revenue (in minimization: cost) coefficients
n b = (b1,b2,…,bm) right-hand sides of the equality constraints
n A = (aij) m × n matrix of equality constraint coefficients
n e = (e1,e2,…,ep) right-hand sides of the inequality constraints
n D = (dij) p × n matrix of inequality constraint coefficients
m variables n x = (x1, x2,...,xn)
m objective n maximize cx (or minimize)
m constraints
n Ax = b, Dx ≤ e (either of the two forms alone suffices – why?)
m a (convex) set of the form X = {x ∈ Rn : Ax ≤ b} is called a (convex) polyhedron
m a bounded polyhedron is called a polytope
m a vertex (extreme point) x ∈ X: x cannot be expressed as a convex combination of any finite set of other points y ∈ X (y ≠ x): x ≠ λ1y1 + λ2y2 + ... + λkyk for any k, y1, y2, ... , yk ∈ X and λ1 + λ2 + ... + λk = 1, λ1, λ2, ... , λk ≥ 0.
m every polytope X is the convex hull of its vertices x1, x2, …, xm, i.e., X is the set of all convex combinations of the vertices: X = conv({x1,x2,…,xm}).
in general – also equalities
A set is convex, if it contains all convex combinations of its finite subsets.
P = conv( { y1,y2,…,yk } ) + cone( { z1,z2,…,zp } ) = { y+z: y ∈ conv(Y), z ∈ cone(Z) }, where Y = {y1,y2,…,yk} and Z = {z1,z2,…,zp}
n convex hull of finitely many points plus a cone of finitely many points
n convex hull of a finite set Y = set of all convex combinations of the elements of Y (convex combination: Σy∈Y αyy, where Σy∈Y αy = 1, all αy ≥ 0)
n cone of a finite set of points Z = set of all linear combinations of the elements of Z with non-negative coefficients (Σz∈Z λzz, all λz ≥ 0)
m polytope P = bounded polyhedron; polytope = conv(Y)
m vertices of polyhedron P: extreme points of conv(Y)
m (extreme) rays of polyhedron P: extreme rays of cone(Z)
m consider a polyhedron P and the problem max { cx : x ∈ P }. Then the problem is equivalent to:
n max cx subject to Ax ≤ b
n max cx subject to x = Σy∈Y αyy + Σz∈Z λzz, Σy∈Y αy = 1, all αy ≥ 0, all λz ≥ 0
m indices n j=1,2,...,n variables n i=1,2,...,m equality constraints
m constants n c = (c1,c2,...,cn) revenue (in minimization: cost) coefficients n b = (b1,b2,...,bm) right-hand-sides of the constraints n A = (aij) m × n matrix of constraint coefficients
m The maximum number of linearly independent rows of A (viewed as vectors ai ∈ Rn) equals the maximum number of linearly independent columns of A (viewed as vectors aj ∈ Rm).
m rank(A) = the maximum number of linearly independent rows of A = the maximum number of linearly independent columns of A.
m The following statements are equivalent: n { x ∈ Rn : Ax = b } ≠ ∅ n rank(A) = rank(A,b).
m A square n × n matrix A has rank(A) = n if and only if its rows (columns) are linearly independent.
m slack variables
n Σj=1,2,…,n aijxj ≤ bi becomes Σj=1,2,…,n aijxj + xn+i = bi, xn+i ≥ 0
n Σj=1,2,…,n aijxj ≥ bi becomes Σj=1,2,…,n aijxj - xn+i = bi, xn+i ≥ 0
n remark: in exercises we will use si instead of xn+i
m nonnegative variables
n xk with unconstrained sign: xk = xk′ - xk″, xk′ ≥ 0, xk″ ≥ 0
exercise: transform the following LP to the standard form
m maximize z = x1 + x2
m subject to 2x1 + 3x2 ≤ 6
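A sketch of the slack-variable step for '≤' rows, applied to the exercise's constraint (the sign-split of free variables would be handled analogously):

```python
def to_standard_form(A_le, b_le):
    """Append one nonnegative slack variable per '<=' row, turning
    sum_j a_ij*x_j <= b_i into sum_j a_ij*x_j + x_{n+i} = b_i."""
    m = len(A_le)
    A_eq = [list(row) + [1 if k == i else 0 for k in range(m)]
            for i, row in enumerate(A_le)]
    return A_eq, list(b_le)

# the exercise's constraint 2x1 + 3x2 <= 6 becomes 2x1 + 3x2 + s1 = 6, s1 >= 0
A_eq, b = to_standard_form([[2, 3]], [6])
print(A_eq, b)  # [[2, 3, 1]] [6]
```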
m feasible solution – a solution satisfying the constraints
m basis matrix – a non-singular m × m submatrix of A
m basic solution to an LP – the unique vector determined by a basis matrix: the n-m variables associated with columns of A not in the basis matrix are set to 0, and the remaining m variables result from the square system of equations
m basic feasible solution - basic solution with all variables nonnegative (at most m variables can be positive)
m Theorem 1
A vector x = (x1, x2,...,xn) is an extreme point of the constraint set if and only if x is a basic feasible solution.
m Theorem 2 The objective function, z, assumes its maximum at an extreme point of the constraint set.
standard form
To find the optimum, generate all basis matrices and find the best basic feasible solution (efficient?).
B = [a(j1),a(j2),…,a(jm)] – basis matrix (basis)
xB = (xj1,xj2,…,xjm) – basic variables; the rest (non-basic variables) are equal to 0 by definition
y = (y1,y2,…,ym): By = b ⇒ y = B-1b
xB = y (unique!)
x = (0,…,0,xj1,0,0,…,0,xj2,0,0,…,0,xjm,0,0,…,0) – basic solution
x – basic feasible solution when y ≥ 0
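Theorems 1 and 2 justify the brute force suggested above: enumerate all basis matrices, keep the basic feasible solutions, and pick the best. A sketch in exact rational arithmetic, on a small illustrative instance already in standard form:

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Gauss-Jordan elimination with exact fractions; None if M is singular."""
    n = len(M)
    T = [[Fraction(v) for v in row] + [Fraction(rhs[i])]
         for i, row in enumerate(M)]
    for col in range(n):
        piv = next((r for r in range(col, n) if T[r][col] != 0), None)
        if piv is None:
            return None                       # singular: not a basis matrix
        T[col], T[piv] = T[piv], T[col]
        T[col] = [v / T[col][col] for v in T[col]]
        for r in range(n):
            if r != col and T[r][col] != 0:
                T[r] = [a - T[r][col] * b for a, b in zip(T[r], T[col])]
    return [row[n] for row in T]

def best_basic_feasible(A, b, c):
    """Enumerate every m-column subset of A (candidate basis matrix),
    compute its basic solution, and return the feasible one maximizing cx."""
    m, n = len(A), len(A[0])
    best = None
    for cols in combinations(range(n), m):
        y = solve_square([[A[i][j] for j in cols] for i in range(m)], b)
        if y is None or any(v < 0 for v in y):
            continue                          # singular or infeasible
        x = [Fraction(0)] * n
        for j, v in zip(cols, y):
            x[j] = v
        z = sum(cj * xj for cj, xj in zip(c, x))
        if best is None or z > best[0]:
            best = (z, x)
    return best

# maximize x1 + x2 s.t. 2x1 + 3x2 + s1 = 6, x1 + s2 = 2 (standard form)
z, x = best_basic_feasible([[2, 3, 1, 0], [1, 0, 0, 1]], [6, 2], [1, 1, 0, 0])
print(z, x)  # 8/3 at x = (2, 2/3, 0, 0)
```

With C(n,m) candidate bases this is clearly not efficient; the simplex method visits only a sequence of improving adjacent bases instead.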
let rm+k = max { rm+j : j=1,2,…,n-m } > 0
let j be the index of the basic variable with minimum ej / djm+k (over djm+k > 0)
xm+k enters the basis and xj leaves the basis
we divide the j-th row by djm+k (to normalize the coefficient of xm+k) and use this row to eliminate xm+k from the remaining rows
Integer Program (IP) maximize z = cx subject to Ax ≤ b, x ≥ 0 (linear constraints)
x integer (integrality constraint)
XIP – set of all feasible solutions of IP
zIP – optimal objective of IP
PLP – polyhedron of the linear relaxation
zLP – optimal objective of LP
Fact 1: zIP ≤ zLP (zIP ≥ zLP for minimization)
PIP = conv(XIP) – convex hull of XIP (the smallest polyhedron containing XIP)
Fact 2: IP is equivalent to the linear program max { cx : x ∈ PIP }
XMIP – set of all feasible solutions of MIP
zMIP – optimal objective of MIP
PLP – polyhedron of the linear relaxation
zLP – optimal objective of LP
Fact 1: zMIP ≤ zLP (zMIP ≥ zLP for minimization)
PMIP = conv(XMIP) – convex hull of XMIP (the smallest polyhedron containing XMIP)
Fact 2: MIP is equivalent to the linear program max { cx + ey : (x,y) ∈ PMIP }
Mixed Integer Program (MIP)
maximize z = cx + ey subject to Ax + Dy ≤ b, x, y ≥ 0 (linear constraints)
m xi ∈ {0,1}, i=1,2,…,n
m NU, N0, N1 ⊆ {1,2,…,n} – a partition of N = {1,2,…,n}
B&B subproblem: m P(NU,N0,N1) – relaxed problem in continuous variables xi, i ∈ NU
n minimize z = cx n Ax ≤ b n 0 ≤ xi ≤ 1, i ∈ NU
n xi = 0, i ∈ N0
n xi = 1, i ∈ N1
m zbest = +∞ upper bound (or the best known feasible solution of problem P) m convention: if P(NU,N0,N1) infeasible then z = +∞ (z = -∞ for maximization)
important: the sub-problem is a relaxation, that is, z* ≤ cx for any x obtained by an arbitrary assignment of binary values to the variables in NU
procedure BBB(NU,N0,N1)
begin
  solution(NU,N0,N1,x,z); { solve P(NU,N0,N1) }
  if NU = ∅ or all xi, i ∈ NU, are binary then
    begin if z < zbest then begin zbest := z; xbest := x end end
  else if z ≥ zbest then return { bounding }
  else begin { branching }
    choose i ∈ NU such that xi is fractional;
    BBB(NU \ {i}, N0 ∪ {i}, N1);
    BBB(NU \ {i}, N0, N1 ∪ {i})
  end
end
procedure BBB(NU,N0,N1)
begin
  solution(NU,N0,N1,x,z); { solve P(NU,N0,N1) }
  if NU = ∅ or all xi, i ∈ NU, are binary then
    begin if z > zbest then begin zbest := z; xbest := x end end
  else if z ≤ zbest then return { bounding }
  else begin { branching }
    choose i ∈ NU such that xi is fractional;
    BBB(NU \ {i}, N0, N1 ∪ {i});
    BBB(NU \ {i}, N0 ∪ {i}, N1)
  end
end
procedure BBB
begin
  zbest := -∞;
  solution(N,∅,∅,x,z); put_list(N,∅,∅,x,z); { solve P(N,∅,∅) and put the active node on the list }
  while list not empty do
  begin
    take_list(NU,N0,N1,x,z); { take an active node from the list }
    if NU = ∅ or all xi, i ∈ NU, are binary then
      begin if z > zbest then begin zbest := z; xbest := x end end
    else if z > zbest then { bounding if z ≤ zbest }
    begin { branching }
      choose(i); { choose i ∈ NU such that xi is fractional }
      solution(NU \ {i},N0,N1 ∪ {i},x,z); put_list(NU \ {i},N0,N1 ∪ {i},x,z); { solve and put active node on the list }
      solution(NU \ {i},N0 ∪ {i},N1,x,z); put_list(NU \ {i},N0 ∪ {i},N1,x,z); { solve and put active node on the list }
    end
  end
end
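The BBB procedures above can be exercised on a problem whose relaxation P(NU,N0,N1) is easy to solve directly: a small binary knapsack, whose LP relaxation is the greedy fractional knapsack. A depth-first recursive sketch for maximization (data, branching order and the greedy relaxation are illustrative choices, not from the lecture):

```python
def lp_relax(values, weights, cap, N0, N1):
    """Greedy fractional-knapsack solution of the relaxation P(NU, N0, N1):
    variables in N1 fixed to 1, in N0 fixed to 0, the rest may be fractional.
    Assumes all weights are positive."""
    n = len(values)
    x = [0.0] * n
    cap_left = cap - sum(weights[i] for i in N1)
    for i in N1:
        x[i] = 1.0
    if cap_left < 0:
        return None, float('-inf')            # infeasible subproblem
    z = float(sum(values[i] for i in N1))
    free = sorted((i for i in range(n) if i not in N0 | N1),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        take = min(1.0, cap_left / weights[i])
        x[i] = take
        cap_left -= take * weights[i]
        z += take * values[i]
        if take < 1.0:
            break
    return x, z

def bbb(values, weights, cap, N0=frozenset(), N1=frozenset(),
        best=(float('-inf'), None)):
    """Depth-first B&B in the style of BBB above (maximization:
    bound whenever the relaxation value z <= zbest)."""
    x, z = lp_relax(values, weights, cap, N0, N1)
    if x is None or z <= best[0]:
        return best                           # infeasible or bounded
    frac = [i for i, v in enumerate(x) if 0.0 < v < 1.0]
    if not frac:
        return (z, x)                         # integral: new incumbent
    i = frac[0]                               # branch on a fractional variable
    best = bbb(values, weights, cap, N0, N1 | {i}, best)
    best = bbb(values, weights, cap, N0 | {i}, N1, best)
    return best

z, x = bbb([10, 6, 4], [5, 4, 3], 8)
print(z, x)  # 14.0 [1.0, 0.0, 1.0]
```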
• we know that the optimal integer solution is not greater than 21.85 (21 in fact) • we will take a subproblem and branch on one of its variables - we choose an active subproblem (here: not chosen before)
- we choose a subproblem with highest solution value
m Solve the linear relaxation of the problem. If the solution is integer, then we are done. Otherwise create two new subproblems by branching on a fractional variable.
m A subproblem is not active when any of the following occurs: n you have already used the subproblem to branch on n all variables in the solution are integer n the subproblem is infeasible n you can fathom the subproblem by a bounding argument.
m Choose an active subproblem and branch on a fractional variable. Repeat until there are no active subproblems.
m Remarks
n If x is restricted to integer (but not necessarily to 0 or 1), then if x = 4.27 you would branch with the constraints x ≤ 4 and x ≥ 5.
n If some variables are not restricted to integer you do not branch on them.
• the order of visiting the nodes of the B&B tree • procedure take: take the first element from the list of active nodes • procedure put: defines the order
• best first: sort by the optimal values of the LR subproblem z • depth-first: put on top of the list (list = stack)
• choose(i) • choose the first fractional variable • choose the one closest to ½ (in the binary case)
n ze ≥ ckye + bk e=1,2,…,E, k=1,2,…,K n Σp xdp = hd d=1,2,…,D n Σd Σp δedpxdp = ye e=1,2,…,E
m variables n xdp flow realizing demand d on path p n ye capacity of link e
m objective minimize Σe ξef(ye)
constraints n Σp xdp = hd d=1,2,…,D n Σd Σp δedpxdp = ye e=1,2,…,E n all variables are continuous and non-negative n f(y) = max{ cky + bk : k=1,2,...,K }
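For this convex case the constraints ze ≥ ckye + bk with ze minimized reproduce f(y) = max{ cky + bk } exactly, since at a fixed y the smallest feasible ze equals the largest right-hand side. A numeric spot-check on illustrative segments:

```python
# illustrative (c_k, b_k) pairs forming a convex piecewise-linear cost
segments = [(0.0, 0.0), (1.0, -1.0), (3.0, -7.0)]

def f(y):
    """f(y) = max_k { c_k*y + b_k }, the convex cost above."""
    return max(c * y + b for c, b in segments)

# the smallest z satisfying z >= c_k*y + b_k for every k is exactly f(y)
for y in (0.0, 1.5, 4.0):
    z_min = max(c * y + b for c, b in segments)
    assert z_min == f(y)
print([f(y) for y in (0.0, 1.5, 4.0)])  # [0.0, 0.5, 5.0]
```

This is why the convex case stays an LP, while the concave case (f = min of affine pieces) needs the binary variables of the MIP formulation.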
piece-wise approximation of a concave problem (MIP)
m minimize Σe ξe (∑k (ckyek + bkuek))
n ∑k yek = Ye e=1,2,…,E
n ∑k uek = 1 e=1,2,…,E
n 0 ≤ yek ≤ Δuek, uek ∈ {0,1} e=1,2,…,E, k=1,2,…,K n Σp xdp = hd d=1,2,…,D n Σd Σp δedpxdp = Ye e=1,2,…,E
m variables n xdp flow realizing demand d on path p n ye capacity of link e
m objective minimize Σe ξef(ye)
constraints n Σp xdp = hd d=1,2,…,D n Σd Σp δedpxdp = ye e=1,2,…,E n all variables are continuous and non-negative n f(y) = min{ cky + bk : k=1,2,...,K }
m In many problems, most of the potential variables are not used in the primal problem formulation.
m Dual constraints correspond to the primal variables that are used.
m It can happen that we are able to produce one (or more) new dual constraints (corresponding to primal variables not considered in the problem) violated by the current optimal dual solution u*.
m Then, by adding these new constraints, we are potentially able to decrease the optimal dual objective (since we are adding constraints to the dual problem).
m If we decrease the dual maximum then we decrease the primal minimum, because W* = F*.
m Moreover, if we are not able to eliminate the current u*, then the current primal solution is optimal in the general sense (i.e., for the problem with all potential primal variables included).
m variables
n xdp flow realizing demand d on path p
n z auxiliary variable
n recall: lists of admissible paths are given
m objective: minimize z
m constraints
n Σp xdp = hd, d=1,2,…,D (λd – unconstrained)
n Σd Σp δedpxdp ≤ ce + z, e=1,2,…,E (πe ≥ 0)
n flow variables are continuous and non-negative; z is continuous
m if we can find a path shorter than λd* then we will get a more constrained dual problem and hence have a chance to improve (decrease) the optimal dual objective n i.e., to decrease the optimal primal objective
m shortest path algorithm can be used for finding shortest paths with respect to π*
m We can start with only one single path on the list for each demand (Pd = 1 for all d).
m We solve the dual problem for the given path-lists. Then for each demand d we find a shortest path with respect to the weights π*, and if its length is shorter than λd* we add it to the current path-list of demand d.
m If no path is added then we stop. If added, we come back to the previous step.
m This process will typically (although not always) terminate after a reasonable number of steps.
m Cycling may occur, so it is better not to remove paths that are not used.
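The pricing step needs only a standard shortest-path routine run with the optimal duals π* as link weights; a path is added when its π*-length is below λd*. A self-contained Dijkstra sketch (the graph and the dual values are illustrative, and t is assumed reachable from s):

```python
import heapq

def dijkstra(adj, s, t):
    """Shortest s-t path; adj[v] = list of (neighbour, weight) pairs.
    Here the weights play the role of the optimal duals pi*_e."""
    dist, prev = {s: 0.0}, {}
    heap = [(0.0, s)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == t:
            break
        if d > dist.get(v, float('inf')):
            continue                      # stale heap entry
        for w, wt in adj[v]:
            nd = d + wt
            if nd < dist.get(w, float('inf')):
                dist[w], prev[w] = nd, v
                heapq.heappush(heap, (nd, w))
    path, v = [t], t
    while v != s:                         # walk predecessors back to s
        v = prev[v]
        path.append(v)
    return list(reversed(path)), dist[t]

# illustrative duals pi* on a small directed graph
adj = {'v1': [('v2', 1.0), ('v3', 4.0)],
       'v2': [('v4', 2.0), ('v3', 1.0)],
       'v3': [('v4', 1.0)],
       'v4': []}
path, length = dijkstra(adj, 'v1', 'v4')
print(path, length)  # ['v1', 'v2', 'v4'] 3.0
```

In the SPAR loop, this path would be appended to the path-list of demand d whenever length < λd*.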
m idea: more constraints but tighter m SPAR not applicable
m minimize F = Σe ξeye + Σe κeue
m subject to
n Σp xdp = 1 d=1,2,…,D n ye = Σd Σp δedpxdphd e=1,2,...,E n 0 ≤ ue ≤ 1 e=1,2,...,E n xdp ≤ ue d=1,2,...,D, p=1,2,...,Pd, e=1,2,…,E: δedp= 1 n x non-negative continuous, (ue binary/continuous)
n Note that XIP ⊆ P1 ⊂ P0, and we can iterate, generating a sequence of polyhedra P0 ⊃ P1 ⊃ … ⊃ Pk ⊃ Pk+1 ⊃ … ⊇ conv(XIP) ⊇ XIP such that zk+1 = cxk+1 ≤ zk = cxk, where zk = cxk = max { cx : x ∈ Pk }.
n We stop when xk ∈ Zn or zk = -∞ (i.e., when Pk = ∅).
maximize z = cx (zIP) subject to Ax = b, x ≥ 0 (linear constraints)
x integer (integrality constraint)
Assumption: A, b – integer
Idea:
n solve the associated LP relaxation
n find an optimal basis
n choose a basic variable that is not integer
n generate a Chvátal-Gomory inequality from the constraint associated with this basic variable to cut off the current relaxed solution
For the set X = { x ∈ Rn+ : Ax ≤ b } ∩ Zn ( A is m x n, aj is its j-th column)
and any u ∈ Rm+ the following are valid inequalities:
Σj=1,2,...,n uaj xj ≤ ub ( because u ≥ 0 )
Σj=1,2,...,n ⎣uaj⎦ xj ≤ ub ( because x ≥ 0 )
Σj=1,2,...,n ⎣uaj⎦ xj ≤ ⎣ub⎦ ( because the left-hand side is integer )
(the first inequality is obtained by multiplying the i-th row of Ax ≤ b by ui and summing up all the rows)
Theorem: Every valid inequality for X can be obtained by applying the above Chvátal-Gomory procedure a finite number of times.
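One round of the Chvátal-Gomory procedure can be scripted directly from the three inequalities above; the single row and the multiplier u = 1/2 below are illustrative:

```python
from math import floor
from fractions import Fraction

def chvatal_gomory_cut(A, b, u):
    """One Chvatal-Gomory round: combine the rows of Ax <= b with
    multipliers u >= 0, then round coefficients and right-hand side down."""
    n = len(A[0])
    coeffs = [sum(Fraction(ui) * row[j] for ui, row in zip(u, A))
              for j in range(n)]
    rhs = sum(Fraction(ui) * bi for ui, bi in zip(u, b))
    return [floor(v) for v in coeffs], floor(rhs)

# 2x1 + 2x2 <= 3 with u = (1/2,) gives x1 + x2 <= floor(3/2) = 1,
# a valid inequality that cuts off the fractional point (3/4, 3/4).
lhs, rhs = chvatal_gomory_cut([[2, 2]], [3], [Fraction(1, 2)])
print(lhs, rhs)  # [1, 1] 1
```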
m branch-and-bound (B&B)
n MIP can always be converted into binary MIP; transformation: xj = 2^0 uj0 + 2^1 uj1 + … + 2^q ujq (for xj ≤ 2^(q+1) - 1)
n Lagrangean relaxation can also be used for finding lower bounds (instead of the linear relaxation).
m branch-and-price (B&P) n solving LP subproblems at the B&B nodes by path generation
m branch-and-cut (B&C)
n combination of B&B with the cutting-plane method
n the most effective exact approach to NP-hard MIPs
n idea: add cuts (ideally, defining facets of the integer polyhedron)
n cut generation is problem dependent, and not based on general "formulas" such as Gomory fractional cuts
n cuts are generated at the B&B nodes
m V – set of nodes m δ(v) – set of links incident to node v ∈ V m δ+(v) – set of links outgoing from node v ∈ V m s(a), t(a) – source node and termination node of link a ∈ A m Pvw – power received by node w when v is transmitting m N – noise observed at a node m γ – signal to interference to noise ratio (SINR) threshold
m ∑a∈δ(v) Yat ≤ 1 v ∈ V, t ∈ T (single transceiver) m Xvt = ∑a∈δ+(v) Yat v ∈ V, t ∈ T (node active) m Ps(a)t(a)Yat ≥ γ (N + ∑v∈V\{s(a)} Pvt(a)Xvt) Yat a ∈ A, t ∈ T (SINR) m Ca = τB ∑t∈T Yat a ∈ A (link capacity)
m V – set of nodes, V = G ∪ R
n G – set of gateways
n R – set of mesh routers
m g(r) – the gateway selected for mesh router r ∈ R m R(r) – fixed route (path) from g(r) ∈ G down to r, for each r ∈ R m n(a) – number of routes using link a ∈ A m f – total downloaded data volume for each mesh router (elastic traffic)
m maximize f n n(a)f ≤ Ca a ∈ A n plus the scheduling constraints
m C(j), i ∈ J – given list of compatible sets m ujt = 1 if C(j) is used in t ∈ T (binary variables) m f – total downloaded data volume (elastic traffic)
m IP: maximize f n ∑j∈J ujt = 1 t ∈ T n Ca= τB ∑j∈Jcaj ∑t∈Tujt a ∈ A n n(a)f ≤ Ca a ∈ A
problem formulation based on compatible sets
m LR: maximize f n ∑j∈J xj = Tτ n Ca= B ∑j∈J cajxj a ∈ A n n(a)f ≤ Ca a ∈ A n xj ≥ 0 j ∈ J
xj = τ ∑t∈T ujt
the relaxation tends to be excellent when τ is small and T large
m undirected graph (V,E) corresponding to the bidirected graph (V,A)
m x = (xe, e ∈ E) – a given vector of binary edge variables
m matching: a subset of edges S(x) = { e ∈ E: xe = 1 } such that Σe∈δ(v) xe ≤ 1 for all v ∈ V
m matching polytope (vertices = matchings)
n x(δ(v)) ≤ 1, v ∈ V
n x(E[U]) ≤ (|U|-1)/2, U ∈ O(V)
n xe ≥ 0, e ∈ E
n where
E[U] – set of all edges with both ends in U
O(V) – family of all odd subsets of V
x(W) = Σe∈W xe for W ⊆ E
n Σe∈δ(v) xe ≤ 1, v ∈ V
n Σe∈E[U] xe ≤ (|U|-1)/2, U ∈ O(V)
n Ya′(e) + Ya″(e) ≤ xe, e ∈ E
n Xv = Σa∈δ+(v) Ya, v ∈ V
n Ps(a)t(a)Ya ≥ γNYa + γΣv∈V\{s(a)} Pvt(a)Zav, a ∈ A
n Zav ≥ Ya + Xv - 1, a ∈ A, v ∈ V
n Zav ≤ Ya, Zav ≤ Xv, a ∈ A, v ∈ V
n Zav ≥ 0
n Ya ∈ {0,1}, a ∈ A
n Xv ∈ {0,1}
n xe ≥ 0, e ∈ E
m apply B&C to the pricing problem using the red cuts m apply C&B using the red cuts
begin
  choose an initial solution x ∈ X;
  select an initial temperature T > 0;
  while stopping criterion not true
    count := 0;
    while count < L
      choose randomly a neighbour y ∈ N(x);
      ΔF := F(y) - F(x);
      if ΔF ≤ 0 then x := y
      else if random(0,1) < exp(-ΔF / T) then x := y;
      count := count + 1
    end while;
    reduce temperature (T := T×α)
  end while
end
Simulated Annealing applied to the Travelling Salesman Problem (TSP)
Combinatorial Optimisation Problem: given a finite set X (solution space) and an evaluation function F: X → R, find x ∈ X minimizing F(x)
N(x) ⊆ X – neighbourhood of a feasible point (configuration) x ∈ X
TSP X = { x : set of all Hamiltonian cycles in a fully connected graph } N(x) - neighbourhood of x (next slide) Find a shortest Hamiltonian cycle (road distance)
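The SA pseudocode above, instantiated for TSP with a segment-reversal (2-opt) neighbourhood N(x); the 4-city instance, the cooling schedule and the parameters T, α, L are illustrative choices, not from the lecture:

```python
import math
import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def simulated_annealing_tsp(dist, T=10.0, alpha=0.95, L=100,
                            T_min=1e-3, seed=0):
    """The SA scheme above: a neighbour y of x is the tour with a random
    contiguous segment reversed (a 2-opt move)."""
    rng = random.Random(seed)
    n = len(dist)
    x = list(range(n))
    rng.shuffle(x)
    fx = tour_length(x, dist)
    while T > T_min:                               # stopping criterion
        for _ in range(L):
            i, j = sorted(rng.sample(range(n), 2))
            y = x[:i] + x[i:j + 1][::-1] + x[j + 1:]   # reverse a segment
            dF = tour_length(y, dist) - fx
            if dF <= 0 or rng.random() < math.exp(-dF / T):
                x, fx = y, fx + dF                 # accept the neighbour
        T *= alpha                                 # reduce temperature
    return x, fx

# symmetric 4-city instance (illustrative road distances)
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
tour, length = simulated_annealing_tsp(dist)
print(length)  # 23, the optimal tour length for this instance
```

At high T almost any neighbour is accepted (diversification); as T shrinks, only improving moves survive (intensification), exactly as in the generic scheme.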
m population = a set of N chromosomes n generation = a consecutive population
m chromosome = a sequence of genes n individual, solution (point of the solution space) n genes represent internal structure of a solution n fitness function = cost function
m Greedy Randomized Adaptive Search Procedure
n start with some random solution
n apply a greedy algorithm (e.g., a descent/ascent local search)
n when a local maximum is reached, apply simulated
Decision problems vs. computational and optimization problems
• decision problem – YES or NO (is there a knapsack solution ≥ R?)
• computational problem – find the optimal value of the objective function F(x)
• optimization problem – find an optimal solution x* optimizing F(x) (i.e., find an optimal composition of a knapsack)
Polynomial problems (P) (easy/quick to solve)
Nondeterministic polynomial problems (NP) (quick to verify a solution)
NP-complete problems (NP-C) (difficult)
NP-hard problems (NP-H) (difficult)
Korte and Vygen: Combinatorial Optimization, Springer, 2012 (also: http://en.wikipedia.org/wiki/NP-complete)
How do we find optimal solutions using decision problems?
• Optimization problem TSP: Find a shortest Hamiltonian cycle in a graph. All weights are non-negative integers.
• Decision problem H(MIN,MAX): Does there exist a Hamiltonian cycle of length between MIN and MAX?
• Computational problem F(MIN,MAX): Compute the minimum length of a Hamiltonian cycle, assuming it is between MIN and MAX.
• Step 1 ( solve F(MIN,MAX) ): Binary search. Identify an initial interval [MIN,MAX] for the minimum cycle length (e.g., 0 and the sum of all weights) and then, by halving this interval, find the minimum length f* of a Hamiltonian cycle.
• Step 2 ( solve TSP ): Consider the edges one by one. Change the weight of the currently considered edge e to W = (the sum of all weights + 1). Solve F(MIN,MAX+W). If the solution = f*, then skip edge e by changing its weight to W permanently (we do not need it in the optimal solution). The final solution is composed of the edges with weights < W.
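Step 1 is plain binary search over the answer, with the decision problem as an oracle; a sketch in which a toy predicate stands in for H(MIN,MAX):

```python
def min_value_by_binary_search(is_leq, lo, hi):
    """Find the smallest integer a in [lo, hi] with is_leq(a) true (= f*),
    using O(log(hi - lo)) calls to the decision oracle."""
    while lo < hi:
        mid = (lo + hi) // 2
        if is_leq(mid):   # 'does a Hamiltonian cycle of length <= mid exist?'
            hi = mid
        else:
            lo = mid + 1
    return lo

# toy stand-in oracle: pretend the (hidden) minimum cycle length is 37
calls = []
def oracle(a):
    calls.append(a)
    return 37 <= a

fstar = min_value_by_binary_search(oracle, 0, 1000)
print(fstar, len(calls))  # 37, found with about log2(1000) ~ 10 oracle calls
```

This is why a polynomial algorithm for the decision problem yields a polynomial algorithm for the computational problem: the number of oracle calls is logarithmic in the interval width, hence polynomial in the instance size.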
Decision problems
• Examples of decision problems:
• is there a knapsack composition with revenue greater than or equal to R?
• does a graph contain a Hamiltonian cycle (HC)?
• instances? YES-instances?
• polynomial problem P: there exists an algorithm that answers the YES-NO question for each instance of P in polynomial time (with respect to the size of the problem instance)
• nondeterministic polynomial problem P: for each YES-instance of P there exists at least one solution s (in the intuitive sense) that can be used to verify, in polynomial time, that the considered instance is a YES-instance (e.g., a Hamiltonian cycle)
• NP-complete problem P: each NP-problem can be transformed to P in polynomial time
Optimization problems
• polynomial problem P: there exists an algorithm that finds a solution s for each instance of P in polynomial time
• NP-hard problem P: each problem in NP can be reduced to P in polynomial time
• A = {0,1,#} alphabet • An = {0,1,#}n the set of all strings over A of length n (A0 contains only empty string) • A* = ∪n An the set of all strings over A with a finite length • X ⊆ A* language (= problem = set of instances) • x∈X instance: a string representing binary numbers separated by # (e.g., 11#101#11#10#) • size(x) length of string x ∈X
• decision problem P: • P = (X,Y) where Y ⊆ X ⊆ A* (both X and Y are languages) • X is decidable (i.e., recognizable) in polynomial time
This means that there exists a polynomial p(n) = C·n^m of a (fixed) degree m (C > 0) such that for each x ∈ A* with size(x) = n we can decide whether or not x ∈ X by executing a number of elementary steps not greater than p(n), i.e., there is a polynomial algorithm for that.
• X set of instances of problem P (coded effectively!) • Y set of YES-instances of problem P • X \ Y set of NO-instances of problem P
• Examples of P:
• is there a knapsack composition with revenue greater than or equal to R?
• does a graph contain a Hamiltonian cycle (HC)?
• Instances? YES-instances? Is X decidable in polynomial time? And Y?
• A decision problem P = (X,Y) is polynomial if Y is decidable in polynomial time.
• By definition, X is polynomially decidable. Hence, for a polynomial P, both languages X and Y are decidable in polynomial time.
• Clearly, then X \ Y is also decidable in polynomial time.
• Simply put, there exists a polynomial algorithm for P.
• Polynomial transformation:
• P1=(X1,Y1) polynomially transforms to P2=(X2,Y2) if there is a function f: X1→X2, computable in polynomial time, such that f(x1)∈Y2 for all x1∈Y1, and f(x1)∈X2\Y2 for all x1∈X1\Y1.
• In other words: YES-instances are transformed to YES-instances and NO-instances are transformed to NO-instances.
• Thus, deciding if x1∈Y1 can be done by deciding whether f(x1)∈Y2.
• NP problems:
• a decision problem P = (X,Y) is nondeterministic polynomial if for each YES-instance x ∈ Y at least one of its solutions s (a YES-certificate; here solutions are understood intuitively) can be verified in polynomial time
• NP – because if we guessed (at random) a solution s of the instance x ∈ Y such that s shows that x is a YES-instance, then we would recognize x in polynomial time.
• examples
• knapsack – it is easy to check whether a given composition s is feasible and has revenue greater than or equal to R (here, all such solutions are YES-certificates – typical)
• the same with HCP
• P ⊆ NP because x ∈ Y is its own YES-certificate.
• in practice: for most problems in NP we can recognize every solution in polynomial time (as in HCP).
• remark: the definition of NP does not ask for a polynomial solution algorithm!
• Problem P is NP-complete (NP-C) if • P is in NP • every other problem in NP polynomially transforms to P.
• Every problem in NP can be transformed in polynomial time to the Satisfiability Problem (SAT)
• Is P equal to NP? • Equivalently, are P and NP-C disjoint?
[diagram: the classes P and NP-C within NP]
There are thousands of known problems to which SAT can be polynomially transformed. They are all NP-C, since polynomial transformation is transitive. Examples: decision versions of Travelling Salesman, Clique, Steiner Problem, Graph Colourability, Knapsack.
SAT
U = {u1,u2,…,un} – Boolean variables; t : U → {true,false} – truth assignment
a clause – e.g., {u1, ¬u2, u4} – represents the disjunction of its elements (u1 ∨ ¬u2 ∨ u4)
a clause is satisfied by a truth assignment t if and only if at least one of its elements is true under assignment t
C – finite collection of N clauses (Boolean formulas)
SAT: given a set U of variables and a collection C of clauses,
question: is there a truth assignment satisfying all clauses in C?
(SAT is in NP because a satisfying truth assignment serves as a certificate for any YES-instance.)
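The parenthetical remark (a satisfying truth assignment is a certificate checkable in polynomial time) can be sketched directly; literals are (variable, sign) pairs with sign False meaning negation, and the second clause is illustrative:

```python
def satisfies(clauses, t):
    """Check a truth assignment t against a clause collection C in time
    linear in the total number of literals (polynomial verification)."""
    return all(any(t[v] == sign for v, sign in clause) for clause in clauses)

# C = { {u1, not u2, u4}, {u2, u3} }
C = [[('u1', True), ('u2', False), ('u4', True)],
     [('u2', True), ('u3', True)]]

t_good = {'u1': False, 'u2': False, 'u3': True, 'u4': False}
t_bad = {'u1': False, 'u2': True, 'u3': False, 'u4': False}
print(satisfies(C, t_good), satisfies(C, t_bad))  # True False
```

Note the asymmetry that defines NP: checking a given t is trivial, while finding a satisfying t is the hard part.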
Cook’s theorem (1971): All problems in the class NP polynomially transform to SAT.
Computational problem P:
• A pair P = (X,f), where f : X → R, with X decidable in polynomial time. The problem consists in computing f(x) for each instance x ∈ X.
• e.g., f(x) = the optimal value of the objective function of problem instance x ∈ X
• example: given a weighted graph G, find the minimal length of a Hamiltonian cycle
• example: given a graph G, find the minimum number of colors for coloring graph G.
• note that when f: X → {0,1} then P = (X, Y), where Y = { x ∈ X: f(x) = 1 }, is a decision problem (and vice versa)
• f(x) can be found through solving a series of decision problems for X using binary search for a • for minimization: YES-instance: is f(x) ≤ a?
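The binary-search idea can be sketched as follows. The decision oracle and its hidden optimum 37 are illustrative assumptions standing in for any minimization problem's decision version.

```python
# A minimal sketch of recovering the optimal value f(x) of a minimization
# problem from its decision version "is f(x) <= a?", using binary search
# over integer thresholds.

def minimize_via_decision(is_yes, lo, hi):
    """Find the smallest a in [lo, hi] with is_yes(a) True, calling the
    decision oracle O(log(hi - lo)) times, i.e. polynomially many times
    in the encoding length of the bounds."""
    while lo < hi:
        mid = (lo + hi) // 2
        if is_yes(mid):
            hi = mid          # the optimum is <= mid
        else:
            lo = mid + 1      # the optimum is > mid
    return lo

OPT = 37                      # hidden optimal value (assumption for the demo)
oracle = lambda a: OPT <= a   # decision version: "is f(x) <= a?"
print(minimize_via_decision(oracle, 0, 1000))  # 37
```

This is why a polynomial algorithm for the decision version yields a polynomial algorithm for the computational version, provided the optimum is bounded by a number of polynomial encoding length.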
• P1 polynomially reduces to P2 (both computational problems) if there exists a polynomial-time oracle algorithm for P1 that uses a procedure for solving P2, where the time spent on solving P2 does not count.
• If P2 is polynomial, then so is P1.
• A computational problem P is NP-hard if all problems in NP polynomially reduce to P.
• remark: an NP-hard problem does not have to be an NP problem (it need not even be a decision problem)
• example: find the minimum number of colors for graph coloring.
m How to prove that P is NP-hard? By showing that a selected problem Q from the list of proven NP-hard problems polynomially reduces to P
n That is, if we could solve problem P efficiently, then we would also solve problem Q efficiently.
Example:
n P is a "min-min" problem of finding two disjoint paths between s and t such that the shorter of them is as short as possible (the length of the other does not matter).
n Q is the problem of finding two disjoint paths in a directed graph, one from s1 to t1 and one from s2 to t2 (all nodes s1, s2, t1, t2 distinct).
X – set of vectors x = (x1, x2, ..., xn); x ∈ X iff Ax ≤ b and x is integer
Decision problem:
Instance: given n, A, b, c, C.
Question: is there x ∈ X such that cx ≤ C?
The SAT problem is directly reducible to a binary IP problem:
m assign binary variables xi and x̄i to the literals ui and ¬ui
m write an inequality for each clause of the SAT instance (for the clause {u1, ¬u2, u4}: x1 + x̄2 + x4 ≥ 1)
m add the equalities xi + x̄i = 1, i = 1, 2, ..., n
m set C = 1 and cx = x1 (the objective is immaterial: the SAT instance is satisfiable iff the IP is feasible)
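The reduction above can be sketched in code. Brute-force enumeration over binary vectors stands in for an IP solver here, and the sample formulas are assumptions for the demo.

```python
# A minimal sketch of the SAT -> binary IP reduction. Literal i stands for
# u_i, -i for its negation; variable x_i gets a twin xbar_i = 1 - x_i
# (enforcing x_i + xbar_i = 1), and each clause becomes a covering
# inequality: the sum of its literals' variables must be >= 1.

from itertools import product

def sat_to_ip_feasible(clauses, n):
    """Return True iff the binary IP produced by the reduction is feasible,
    i.e. iff the SAT instance is satisfiable."""
    for x in product([0, 1], repeat=n):            # candidate x_i values
        def var(lit):                              # IP variable of a literal
            return x[lit - 1] if lit > 0 else 1 - x[-lit - 1]
        if all(sum(var(lit) for lit in clause) >= 1 for clause in clauses):
            return True                            # all clause cuts hold
    return False

# (u1 ∨ ¬u2 ∨ u4) ∧ (¬u1 ∨ u3): satisfiable
print(sat_to_ip_feasible([{1, -2, 4}, {-1, 3}], n=4))   # True
# (u1) ∧ (¬u1): unsatisfiable
print(sat_to_ip_feasible([{1}, {-1}], n=1))             # False
```

Since SAT is NP-complete and the transformation is polynomial, this shows that binary IP feasibility is NP-hard.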
• P = (X, Y) a decision problem; X must be decidable in polynomial time, Y ⊆ X.
• P is polynomial (P) if Y (the set of YES-instances) is decidable in polynomial time as well.
• We do not know of any polynomial algorithm for thousands of decision problems.
• P is nondeterministic polynomial (NP) if we can polynomially verify its YES-instances using their (intuitively understood) solutions.
• P is NP-complete if it is in NP and every other NP problem polynomially transforms to P (example: SAT).
• Is P equal to NP? An open problem, but the answer YES is most unlikely.
• Hence, NP-complete problems are regarded as difficult.
• Computational problem (compute f(x) for x ∈ X): can be solved by calling its decision version a polynomial number of times.
• Optimization problems: constructive problems. Sometimes solvable through their decision versions (e.g., TSP).
• P is NP-hard if every NP problem polynomially reduces to P.
• Having a problem P in hand and being unable to solve it effectively, try to prove its NP-hardness. If it is NP-hard, do not try to find a polynomial algorithm: you would, most likely, be wasting your time.
m The polyhedral separation problem for P: given an arbitrary x0 ∈ Rn, either
(i) conclude that x0 ∈ P, or
(ii) find a cut (f, f0) ∈ SP separating x0 and P, i.e., (f, f0) ∈ Rn+1 such that fx0 > f0 and fx ≤ f0 for all x ∈ P.
m Theorem (equivalence of optimization and separation):
LOP max { cx : x ∈ P } is solvable in polynomial time for every c ∈ Qn
if, and only if, the polyhedral separation problem for P is solvable in polynomial time.
LOP max { cx : x ∈ P } is NP-hard for at least one c ∈ Qn
if, and only if, the polyhedral separation problem for P is NP-hard.
Still, even though generating cuts is in general difficult for hard MIPs, there is a chance to effectively generate good cuts.
m let SP = set of all valid inequalities for P = { (f, f0) ∈ Rn+1 : P ⊆ { x ∈ Rn : fx ≤ f0 } }
m consider Pk (a polyhedron containing P) and a solution xk of LOP max { cx : x ∈ Pk } such that xk ∉ P
m how to find a cut separating P and xk?
m we try to find a most violated separator:
maximize fxk − f0 over all (f, f0) ∈ SP such that ‖f‖ = 1
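A separation oracle is easiest to see on a polytope where the valid inequalities are known explicitly. The unit box used below is an illustrative assumption, chosen so that a most violated cut can be found by simple inspection rather than by solving the separation LP.

```python
# A minimal sketch of a separation oracle for the unit box P = [0, 1]^n.
# Given x0, it either reports x0 in P or returns a violated valid
# inequality (f, f0): f·x0 > f0 while f·x <= f0 for all x in P.
# Among the 2n box inequalities it picks a most violated one, with ‖f‖ = 1.

def separate_unit_box(x0):
    """Return None if x0 lies in [0,1]^n, else a most violated cut (f, f0)."""
    n = len(x0)
    best = None                      # (violation, f, f0)
    for i, v in enumerate(x0):
        if v > 1:                    # inequality x_i <= 1 is violated
            f = [0.0] * n; f[i] = 1.0
            cand = (v - 1, f, 1.0)
        elif v < 0:                  # inequality -x_i <= 0 is violated
            f = [0.0] * n; f[i] = -1.0
            cand = (-v, f, 0.0)
        else:
            continue
        if best is None or cand[0] > best[0]:
            best = cand
    return None if best is None else (best[1], best[2])

print(separate_unit_box([0.5, 0.2]))    # None: the point is in P
print(separate_unit_box([1.7, -0.3]))   # ([1.0, 0.0], 1.0): cut x_1 <= 1
```

For hard MIPs the family SP is not available explicitly, which is exactly why separation there is as hard as optimization, per the theorem above.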
m in practice only LPs guarantee efficient solutions
n decomposition methods are available for LPs (e.g., path generation)
n convex problems can be approximated by LPs
m all real-life problems can be expressed as MIPs
n concave objective functions can be approximated by MIPs
m MIPs and IPs can be solved by general solvers using the branch-and-bound method (and its enhancements B&C, B&P, C&B, B&P&C), based on LP relaxations
n CPLEX, XPRESS, Gurobi
n sometimes efficiently
m otherwise, we have to resort to (frequently unreliable) stochastic meta-heuristics (sometimes specialized heuristics)
n a good heuristic solution (i.e., a suboptimal MIP solution with value F*: the initial zbest) improves branching: a B&B node can be discarded as soon as its bound satisfies F(x) ≥ F*
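The role of a heuristic incumbent in pruning can be sketched on a tiny 0-1 knapsack, a maximization MIP, so the pruning test is the mirror image of the slide's minimization test F(x) ≥ F*: a node is discarded when its LP-relaxation bound is ≤ zbest. The data are illustrative assumptions.

```python
# A minimal branch-and-bound sketch for 0-1 knapsack. A greedy heuristic
# seeds the incumbent zbest; nodes whose fractional-relaxation upper bound
# cannot beat zbest are pruned.

def knapsack_bb(weights, revenues, capacity):
    # sort items by revenue/weight ratio (needed for the fractional bound)
    items = sorted(zip(weights, revenues), key=lambda wr: wr[1] / wr[0],
                   reverse=True)

    def lp_bound(k, cap):
        """LP-relaxation bound: fill remaining capacity fractionally."""
        bound = 0.0
        for w, r in items[k:]:
            if w <= cap:
                cap -= w; bound += r
            else:
                bound += r * cap / w
                break
        return bound

    # greedy heuristic solution -> initial incumbent zbest
    zbest, cap = 0, capacity
    for w, r in items:
        if w <= cap:
            cap -= w; zbest += r

    def branch(k, cap, rev):
        nonlocal zbest
        if rev > zbest:
            zbest = rev                      # new incumbent found
        if k == len(items):
            return
        if rev + lp_bound(k, cap) <= zbest:  # prune: cannot beat zbest
            return
        w, r = items[k]
        if w <= cap:
            branch(k + 1, cap - w, rev + r)  # branch: take item k
        branch(k + 1, cap, rev)              # branch: skip item k

    branch(0, capacity, 0)
    return zbest

print(knapsack_bb([4, 3, 2, 5], [10, 7, 3, 12], capacity=10))  # 22
```

The better the initial zbest, the earlier the bound test fires, which is why solvers benefit from being handed a good heuristic solution up front.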