Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna , Italy 2010

Dec 20, 2015

Page 1:

Master Class on Experimental Study of Algorithms

Scientific Use of Experimentation

Carla P. Gomes
Cornell University

CPAIOR
Bologna, Italy 2010

Page 2:

Big Picture of Topics Covered in this talk

Part I
• Understanding computational complexity beyond worst-case complexity
– Benchmarks: The role of Random Distributions (Random SAT)
– Typical Case Analysis vs. Worst Case Complexity analysis – phase transition phenomena

Part II
• Understanding runtime distributions of complete search methods
– Heavy and Fat-Tailed Phenomena in combinatorial search and Restart strategies
• Understanding tractable sub-structure
– Backdoors and Tractable sub-structure
– Formal Models of Heavy-tails and Backdoors
– Performance of current state-of-the-art solvers on real-world structured problems exploiting backdoors

Page 3:

II - Understanding runtime distributions of complete search methods

Page 4:

Outline

• Complete randomized backtrack search methods

• Runtime distributions of complete randomized backtrack search methods

Page 5:

Complete Randomized Backtrack search methods

Page 6:

Exact / Complete Backtrack Methods

Main Underlying (Search) Mechanisms in:
Mathematical Programming (MP)
Constraint Programming (CP)
Satisfiability

Backtrack Search;

Branch & Bound; Branch & Cut; Branch & Price;
Davis-Putnam-Logemann-Loveland Procedure (DPLL)
…

Page 7:

[Figure: branch-and-bound search tree for the problem below, branching on x1, then x2, then x3 (nodes 1 to 19)]

maximize 16x1 + 22x2 + 12x3 + 8x4 + 11x5 + 19x6
subject to 5x1 + 7x2 + 4x3 + 3x4 + 4x5 + 6x6 ≤ 14

xj binary for j = 1 to 6
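
A minimal Python sketch of branch and bound on the knapsack instance above. It assumes depth-first branching on x1, x2, … in index order and an LP-relaxation (fractional knapsack) bound; the function names are illustrative, and the nodes visited will not necessarily match the tree on the slide.

```python
VALUES = [16, 22, 12, 8, 11, 19]   # objective coefficients of x1..x6
WEIGHTS = [5, 7, 4, 3, 4, 6]       # constraint coefficients of x1..x6
CAPACITY = 14


def lp_bound(k, value, room):
    """Upper bound: LP relaxation over the variables that are still free."""
    free = sorted(range(k, len(VALUES)),
                  key=lambda j: VALUES[j] / WEIGHTS[j], reverse=True)
    bound = value
    for j in free:
        if WEIGHTS[j] <= room:
            room -= WEIGHTS[j]
            bound += VALUES[j]
        else:
            bound += VALUES[j] * room / WEIGHTS[j]  # fractional item
            break
    return bound


def branch_and_bound():
    best = {"value": 0, "x": None, "nodes": 0}

    def visit(k, x, value, room):
        best["nodes"] += 1
        if lp_bound(k, value, room) <= best["value"]:
            return                      # prune: cannot beat the incumbent
        if k == len(VALUES):            # all variables fixed: new incumbent
            best["value"], best["x"] = value, x
            return
        for bit in (1, 0):              # try value 1 first, then 0
            if bit * WEIGHTS[k] <= room:
                visit(k + 1, x + [bit],
                      value + bit * VALUES[k], room - bit * WEIGHTS[k])

    visit(0, [], 0, CAPACITY)
    return best


result = branch_and_bound()
print(result)
```

The bound is what keeps the search complete yet short: a subtree is skipped only when even its relaxed optimum cannot improve on the best solution found so far.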

Page 8:

Backtrack Search - Satisfiability

State-of-the-art complete solvers are based on backtrack search procedures (typically with unit propagation, learning, randomization, restarts);

( a OR NOT b OR NOT c ) AND ( b OR NOT c) AND ( a OR c)

Page 9:

Randomization in Complete Backtrack Search Methods

Page 10:

Motivation: Randomization in Local Search

The use of randomization has been very successful in the area of local search or metaheuristics.

Simulated annealing
Genetic algorithms
Tabu Search
GSAT, WalkSAT and variants.

Limitation: inherent incomplete nature of local search methods – cannot prove optimality or inconsistency.

Page 11:

Randomized Backtrack Search

Goal: Explore the addition of a stochastic element into a systematic search procedure without losing completeness.

What if we introduce an element of randomness into a complete backtrack search method?

Page 12:

Randomized Backtrack Search

Several ways of introducing randomness into a backtrack search method:

– Simple way: randomly breaking ties in variable and/or value selection. Compare with standard lexicographic tie-breaking.

– General framework: imposing a probability distribution for variable/value selection or other search parameters.

Note: with simple book-keeping we can maintain the completeness of the backtrack search method.
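
A small Python sketch of the idea, assuming the three-clause formula from the satisfiability slide and purely random variable and value selection. The helper names are illustrative. Both values of every variable are still tried, so the search remains complete, and a saved seed reproduces a run exactly.

```python
import random

# (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c), as (variable, sign) literals
CLAUSES = [[("a", True), ("b", False), ("c", False)],
           [("b", True), ("c", False)],
           [("a", True), ("c", True)]]
VARIABLES = ["a", "b", "c"]


def falsified(clause, assignment):
    """True if every literal of the clause is assigned and false."""
    return all(v in assignment and assignment[v] != sign for v, sign in clause)


def backtrack(assignment, rng, stats):
    stats["nodes"] += 1
    if any(falsified(c, assignment) for c in CLAUSES):
        return None                          # dead end: backtrack
    free = [v for v in VARIABLES if v not in assignment]
    if not free:
        return dict(assignment)              # every clause is satisfied
    var = rng.choice(free)                   # random variable selection
    values = [True, False]
    rng.shuffle(values)                      # random value selection
    for value in values:                     # both values are tried if needed
        assignment[var] = value
        solution = backtrack(assignment, rng, stats)
        if solution is not None:
            return solution
        del assignment[var]
    return None


def run(seed):
    stats = {"nodes": 0}
    solution = backtrack({}, random.Random(seed), stats)
    return solution, stats["nodes"]


for seed in range(5):
    print(seed, run(seed))
```

Different seeds can visit different numbers of nodes on the same formula, which is the run-to-run variability studied in the rest of the talk.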

Page 13:

Notes on Randomizing Backtrack Search

Lots of opportunities to introduce randomization, basically at the different decision points of backtrack search:

– Variable/value selection

– Look-ahead / look-back procedures, e.g.:

• When and how to perform domain reduction/propagation

• What cuts to add

– Target backtrack points

– Restarts

Not necessarily tie-breaking only: more generally, we can define a probability distribution over the set of possible choices at a given decision point.

Page 14:

Notes on Randomizing Backtrack Search (cont.)

• Can we replay a “randomized” run? Yes: since we use pseudo-random numbers, if we save the “seed” we can repeat the run with the same seed.

• “Deterministic randomization” (Wolfram 2002) – the behavior of some very complex deterministic systems is so unpredictable that it actually appears to be random (e.g., adding nogoods or cutting constraints between restarts, as used in the satisfiability community).

• What if we cannot randomize the code? Randomize the input – randomly rename the variables (Motwani and Raghavan 95).

(Walsh (99) applied this technique to study the runtime distributions of graph-coloring using a deterministic algorithm based on DSATUR implemented by Trick.)

Page 15:

Runtime Distributions of Complete Randomized Backtrack search methods

Page 16:

Backtrack Search Two Different Executions

( a OR NOT b OR NOT c ) AND ( b OR NOT c) AND ( a OR c)

Page 17:

Size of Search Trees in Backtrack Search

• The size of the search tree varies dramatically, depending on the order in which we pick the variables to branch on.

• Important to choose good heuristics for variable/value selection.

Page 18:

Runtime distributions of Complete Randomized Backtrack search methods

When solving instances of a combinatorial problem such as the Satisfiability problem or an Integer Program using a complete randomized search method such as backtrack search or branch and bound, the run time of the randomized backtrack search method, running on single individual instances (i.e., several runs of the same complete randomized procedure on the same instance), exhibits very high variance.

Page 19:

Randomized Backtrack Search

Latin Square (Order 4)

Time: (*)3011 (*)

(*) no solution found - reached cutoff: 2000

Page 20:

Erratic Behavior of Sample Mean

[Figure: sample mean against number of runs (x-axis ticks at 500 and 2000); annotations: "Median = 1!" and sample mean "3500!"]
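
The same erratic behavior can be reproduced with synthetic data. The sketch below does not use the runtime data from the slide: it assumes Pareto samples with tail index 0.5 (infinite mean) as a stand-in, and tracks the running sample mean next to the median.

```python
import random
import statistics

rng = random.Random(0)
# Pareto samples with tail index alpha = 0.5, so Pr[X > x] = x**-0.5 for x >= 1
samples = [rng.paretovariate(0.5) for _ in range(2000)]

running_mean = []
total = 0.0
for n, x in enumerate(samples, start=1):
    total += x
    running_mean.append(total / n)   # sample mean after n runs

median = statistics.median(samples)
for n in (500, 1000, 1500, 2000):
    print(n, round(running_mean[n - 1], 1))
print("median:", round(median, 2))
```

The median settles quickly, while the running mean keeps jumping whenever a single very long run arrives, so it never stabilizes as the number of runs grows.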

Page 21:

Heavy-Tailed Distributions

… infinite variance … infinite mean

Introduced by Pareto in the 1920’s --- “probabilistic curiosity.”

Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.

Examples: stock-market, earthquakes, weather,...

Page 22:

The Pervasiveness of Heavy-Tailed Phenomena in Economics, Science, Engineering, and Computation

Tsunami 2004

Blackout of August 15th 2003

> 50 Million People Affected

Financial Markets with huge crashes

… there are a few billionaires

Backtrack search

Annual meeting (2005).b

Page 23:

Standard Distribution (finite mean & variance)

Power Law Decay

Exponential Decay

Page 24:

Decay of Heavy-tailed Distributions

Standard --- Exponential Decay

e.g. Normal:

$\Pr[X > x] \sim C e^{-x^2}$, for some $C > 0$, $x > 1$

Heavy-Tailed --- Power Law Decay

e.g. Pareto-Levy:

$\Pr[X > x] \sim C x^{-\alpha}$, $x > 0$

Page 25:

Normal, Cauchy, and Levy

Normal - Exponential Decay

Cauchy - Power law Decay

Levy - Power law Decay

Page 26:

Tail Probabilities (Standard Normal, Cauchy, Levy)

c    Normal        Cauchy    Levy
0    0.5           0.5       1
1    0.1587        0.25      0.6827
2    0.0228        0.1476    0.5205
3    0.001347      0.1024    0.4363
4    0.00003167    0.078     0.3829
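
The tail probabilities Pr[X > c] above can be recomputed from closed forms. This sketch assumes the standard versions of each distribution (Normal(0,1), Cauchy with location 0 and scale 1, Levy with location 0 and scale 1); the function names are illustrative.

```python
import math


def normal_tail(c):
    """Pr[X > c] for a standard normal X."""
    return 0.5 * math.erfc(c / math.sqrt(2))


def cauchy_tail(c):
    """Pr[X > c] for a standard Cauchy X."""
    return 0.5 - math.atan(c) / math.pi


def levy_tail(c):
    """Pr[X > c] for a standard Levy X, which is supported on x > 0."""
    return 1.0 if c <= 0 else math.erf(math.sqrt(1 / (2 * c)))


for c in range(5):
    print(c, f"{normal_tail(c):.4g}", f"{cauchy_tail(c):.4g}", f"{levy_tail(c):.4g}")
```

At c = 4 the normal tail is already about 3 in 100,000, while the Cauchy and Levy tails are still about 8% and 38%.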

Page 27:

Fat tailed distributions

Kurtosis = $\mu_4 / \mu_2^2$

$\mu_2$: second central moment (i.e., variance)

$\mu_4$: fourth central moment

Normal distribution kurtosis is 3

Fat tailed distribution when kurtosis > 3 (e.g., exponential, lognormal)
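
A short Python sketch of the kurtosis check, computing the ratio of the fourth central moment to the squared variance on samples. The sample sizes and seed are arbitrary choices for illustration.

```python
import random


def kurtosis(data):
    """Sample kurtosis: fourth central moment over squared variance (not 'excess')."""
    n = len(data)
    mean = sum(data) / n
    mu2 = sum((x - mean) ** 2 for x in data) / n
    mu4 = sum((x - mean) ** 4 for x in data) / n
    return mu4 / mu2 ** 2


rng = random.Random(1)
normal = [rng.gauss(0, 1) for _ in range(100_000)]
exponential = [rng.expovariate(1.0) for _ in range(100_000)]
print("normal:     ", round(kurtosis(normal), 2))
print("exponential:", round(kurtosis(exponential), 2))
```

The normal sample should come out close to 3 and the exponential sample well above it, which is what makes the exponential fat-tailed in this sense even though all its moments are finite.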

Page 28:

Fat and Heavy-tailed distributions

Exponential decay for standard distributions, e.g. Normal, Lognormal, exponential:

$\Pr[X > x] \sim C e^{-x^2}$, for some $C > 0$

Heavy-Tailed Power Law Decay, e.g. Pareto-Levy:

$\Pr[X > x] \sim C x^{-\alpha}$, $x > 0$

Page 29:

How to Visually Check for Heavy-Tailed Behavior

Log-log plot of tail of distribution exhibits linear behavior.

Page 30:

How to Check for “Heavy Tails”?

Log-Log plot of tail of distribution should be approximately linear.

Slope gives value of $\alpha$

$\alpha \le 1$: infinite mean and infinite variance

$1 < \alpha < 2$: infinite variance
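
The log-log check can be turned into a rough numerical estimate of the tail index α, the exponent in Pr[X > x] ~ C x^(-α). The sketch below fits a least-squares line to the empirical survival function on log-log axes. It uses synthetic Pareto samples with α = 0.5 as an assumption, and the function name is illustrative.

```python
import math
import random


def tail_index(samples):
    """Least-squares slope of log(1 - F(x)) against log(x); returns alpha = -slope."""
    xs = sorted(samples)                 # samples must be positive
    n = len(xs)
    log_x = [math.log(x) for x in xs]
    log_s = [math.log((n - i) / n) for i in range(n)]   # empirical Pr[X >= x_i]
    mean_x = sum(log_x) / n
    mean_s = sum(log_s) / n
    sxy = sum((a - mean_x) * (b - mean_s) for a, b in zip(log_x, log_s))
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    return -sxy / sxx


rng = random.Random(2)
pareto = [rng.paretovariate(0.5) for _ in range(5000)]
print(round(tail_index(pareto), 2))
```

On real runtime data only the tail part of the distribution is expected to be linear, so the fit would normally be restricted to the largest observations rather than the whole sample.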

Page 31:

[Figure: density f(x) against x for Pareto(α = 1) and Lognormal(1,1)]

Pareto(α = 1): infinite mean and infinite variance.

Page 32:

Survival Function: Pareto and Lognormal

Page 33:

Example of Heavy Tailed Model

Random Walk:

Start at position 0

Toss a fair coin:

with each head take a step up (+1)

with each tail take a step down (-1)

X --- number of steps the random walk takes to return to position 0.
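
The random walk is easy to simulate. In this sketch the cap `max_steps` is an added assumption so that every run terminates; runs that hit the cap are reported as censored.

```python
import random


def return_time(rng, max_steps=10_000):
    """Steps until a fair +1/-1 walk started at 0 first returns to 0 (None if censored)."""
    position = 0
    for step in range(1, max_steps + 1):
        position += 1 if rng.random() < 0.5 else -1
        if position == 0:
            return step
    return None   # no return within max_steps


rng = random.Random(3)
times = [return_time(rng) for _ in range(1000)]
finished = sorted(t for t in times if t is not None)
print("censored runs:", times.count(None))
print("shortest, longest finished:", finished[0], finished[-1])
```

Half of all walks return after just 2 steps, yet a few take thousands of steps or hit the cap, which is the signature of a heavy tail.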

Page 34:

The record of 10,000 tosses of an ideal coin (Feller)

[Figure annotations: zero crossing; long periods without zero crossing]

Page 35:

Heavy-tails vs. Non-Heavy-Tails

[Figure: unsolved fraction 1-F(x) against X, the number of steps the walk takes to return to zero (log scale); curves: Random Walk, Normal(2,1000000), Normal(2,1); annotations: Median = 2 (50%), 0.1% > 200000]

Page 36:

Heavy-Tailed Behavior in Quasigroup Completion Problem Domain

[Figure: unsolved fraction (1-F(x)) (log) against number of backtracks (log); curves with α = 0.466, α = 0.319 and α = 0.153; annotations: 18% unsolved, 0.002% unsolved]

α < 1 => Infinite mean

Page 37:

To Be or Not To Be

Heavy-Tailed

Gomes, Fernandez, Selman, Bessiere – CP 04

Page 38:

Heavy-Tailed Behavior in Quasigroup Completion Problem Domain

[Figure: unsolved fraction (1-F(x)) (log) against number of backtracks (log); curves with α = 0.466, α = 0.319 and α = 0.153; annotations: 18% unsolved, 0.002% unsolved]

α < 1 => Infinite mean

Page 39:

Research Questions:

1. Can we provide a characterization of heavy-tailed behavior: when it occurs and when it does not occur?

2. Can we identify different tail regimes across different constrainedness regions?

3. Can we get further insights into the tail regime by analyzing the concrete search trees produced by the backtrack search method?

Concrete CSP Models

Complete Randomized Backtrack Search

Page 40:

Scope of Study

• Random Binary CSP Models
• Encodings of CSP Models
• Randomized Backtrack Search Algorithms
• Search Trees
• Statistical Tail Regimes Across Constrainedness Regions
– Empirical Results
– Theoretical Model

Page 41:

Binary Constraint Networks

• A finite binary constraint network P = (X, D, C)

– A set of n variables X = {x1, x2, …, xn}

– For each variable, a finite domain: D = {D(x1), D(x2), …, D(xn)}

– A set C of binary constraints between pairs of variables; a constraint Cij on the ordered pair of variables (xi, xj) is a subset of the Cartesian product D(xi) x D(xj) that specifies the allowed combinations of values for the variables xi and xj.

– Solution to the constraint network: an instantiation of the variables such that all constraints are satisfied.

Page 42:

Random Binary CSP Models

Model B <N, D, c, t>

N – number of variables; D – size of the domains;
c – number of constrained pairs of variables; p1 – proportion of binary constraints included in network; c = p1 N(N-1)/2;
t – tightness of constraints; p2 – proportion of forbidden tuples; t = p2 D^2

Model E <N, D, p>

N – number of variables; D – size of the domains; p – proportion of forbidden pairs (out of D^2 N(N-1)/2)

(Achlioptas et al 2000)

(Gent et al 1996)

N – from 15 to 50; (Xu and Li 2000)
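
A hedged Python sketch of a Model B generator following the parameter definitions above. Rounding c and t to integers and storing each constraint as a set of forbidden value pairs are assumptions of this sketch, and the function name is illustrative.

```python
import itertools
import random


def model_b(n, d, p1, p2, seed=None):
    """Random binary CSP: c constrained pairs, t forbidden tuples in each constraint."""
    rng = random.Random(seed)
    c = round(p1 * n * (n - 1) / 2)      # number of constrained pairs of variables
    t = round(p2 * d * d)                # forbidden value pairs per constraint
    all_pairs = list(itertools.combinations(range(n), 2))
    all_tuples = list(itertools.product(range(d), repeat=2))
    network = {}
    for pair in rng.sample(all_pairs, c):
        network[pair] = set(rng.sample(all_tuples, t))   # forbidden (value_i, value_j)
    return network


csp = model_b(n=10, d=5, p1=0.4, p2=0.2, seed=0)
print(len(csp), "constraints,", len(next(iter(csp.values()))), "forbidden tuples each")
```

With n = 10, d = 5, p1 = 0.4 and p2 = 0.2 this gives c = 18 constraints with t = 5 forbidden tuples each; sweeping p2 while holding the rest fixed moves the instances across constrainedness regions.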

Page 43:

Typical Case Analysis: Beyond NP-Completeness

[Figure: computational cost (mean) and % of solvable instances against constrainedness]

Phase Transition Phenomenon: Discriminating “easy” vs. “hard” instances

Hogg et al 96

Page 44:

Encodings

• Direct CSP Binary Encoding
• Satisfiability Encoding (direct encoding)

Page 45:

Backtrack Search Algorithms

• Look-ahead performed:
– no look-ahead (simple backtracking, BT);
– removal of values directly inconsistent with the last instantiation performed (forward-checking, FC);
– arc consistency and propagation (maintaining arc consistency, MAC).

• Different heuristics for variable selection (the next variable to instantiate):
– random (random);
– variables pre-ordered by decreasing degree in the constraint graph (deg);
– smallest domain first, ties broken by decreasing degree (dom+deg).

• Different heuristics for value selection:
– random;
– lexicographic.

• For the SAT encodings we used the simplified Davis-Putnam-Logemann-Loveland procedure, with static and random variable/value selection.

Page 46:

Inconsistent Subtrees

Page 47:

Distributions

• Runtime distributions of the backtrack search algorithms;

• Distribution of the depth of the inconsistency trees found during the search;

All runs were performed without censorship.

Page 48:

Main Results

1 - Runtime distributions
2 - Inconsistent Sub-tree Depth Distributions

Dramatically different statistical regimes across the constrainedness regions of CSP models;

Page 49:

Runtime distributions

Page 50:

Distribution of Depth of Inconsistent Subtrees

Page 51:

Depth of Inconsistent Search Tree vs. Runtime Distributions

Page 52:

Other Models and More Sophisticated Consistency Techniques

[Figure panels: BT, MAC (Model B)]

Heavy-tailed and non-heavy-tailed regions. As the “sophistication” of the algorithm increases, the heavy-tailed region extends to the right, getting closer to the phase transition.

Page 53:

SAT encoding: DPLL

Page 54:

To Be or Not To Be Heavy-tailed: Summary of Results

For both models (B and E), CSP and SAT encodings, and the different backtrack search strategies:

1 As constrainedness increases, the runtime distribution changes from a heavy-tailed to a non-heavy-tailed regime.

Page 55:

To Be or Not To Be Heavy-tailed: Summary of Results

2 Threshold from the heavy-tailed to the non-heavy-tailed regime

– Dependent on the particular search procedure;

– As the efficiency of the search method increases, the extension of the heavy-tailed region increases: the heavy-tailed threshold gets closer to the phase transition.

Page 56:

To Be or Not To Be Heavy-tailed: Summary of Results

3 Distribution of the depth of inconsistent search sub-trees

Exponentially distributed inconsistent sub-tree depth (ISTD) combined with exponential growth of the search space as the tree depth increases implies heavy-tailed runtime distributions.

As the ISTD distributions move away from the exponential distribution, the runtime distributions become non-heavy-tailed.

Theoretical model fits data nicely!

Page 57:

Theoretical Model

Page 58:

Depth of Inconsistent Search Tree vs. Runtime Distributions

Theoretical Model

X – search cost (runtime);

ISTD – depth of an inconsistent sub-tree;

Pistd[ISTD = N] – probability of finding an inconsistent sub-tree of depth N during search;

P[X > x | ISTD = N] – probability of the search cost being larger than x, given an inconsistent tree of depth N.
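
As a worked illustration of how these two quantities combine, the law of total probability gives the first line below. The remaining lines are a simplified sketch under assumptions that are ours and not necessarily the exact model of the paper: the depth ISTD is geometrically (i.e., exponentially) distributed with parameter p, and refuting an inconsistent sub-tree of depth N costs exactly b^N for some branching factor b > 1.

```latex
\begin{align*}
P[X > x] &= \sum_{N \ge 0} P_{istd}[ISTD = N] \; P[X > x \mid ISTD = N] \\
\text{assume } P_{istd}[ISTD = N] &= (1 - p)\,p^{N}, \quad 0 < p < 1,
  \qquad X = b^{N} \text{ when } ISTD = N, \quad b > 1 \\
P[X > b^{n}] &= \sum_{N > n} (1 - p)\,p^{N} = p^{\,n+1} \\
P[X > x] &= p \cdot x^{-\alpha} \text{ for } x = b^{n},
  \qquad \alpha = \frac{\ln(1/p)}{\ln b}
\end{align*}
```

Under these assumptions the runtime tail decays as a power law, and α < 1 (infinite mean) exactly when p·b > 1, that is, when the search space grows faster with depth than the probability of deep inconsistent sub-trees shrinks.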

Page 59:

Depth of Inconsistent Search Tree vs. Runtime Distributions: Theoretical Model

See paper for proof details

Page 60:

Regressions for B1, B2, K

Regression for B1 and B2 Regression for k

Page 61:

Validation: Theoretical Model vs. Runtime Data

α = 0.26 using the model;
α = 0.27 using runtime data;

Page 62:

Exploiting Heavy-Tailed Behavior: Restarts

Page 63:

Fat and Heavy Tailed behavior has been observed in several domains:

Quasigroup Completion Problems;

Graph Coloring;

Planning;

Scheduling;

Circuit synthesis;

Decoding, etc.

Page 64:

How to avoid the long runs?

Use restarts or parallel / interleaved runs to exploit the extreme variance in performance.

Restarts provably eliminate heavy-tailed behavior.

(Gomes et al. 97,98,2000)

Page 65:

Restarts

[Figure: unsolved fraction 1-F(x) against number of backtracks (log); curves: no restarts, restart every 4 backtracks; annotations: 70% unsolved, 0.001% unsolved, 250 (62 restarts)]

Page 66:

Example of Rapid Restart Speedup (planning)

[Figure: number of backtracks (log) against cutoff (log); annotations: 20, 2000, ~100 restarts, ~10 restarts, 100000]

Page 67:

Sketch of proof of elimination of heavy tails

X = number of backtracks to solve the problem

Let’s truncate the search procedure after m backtracks.

Probability of solving problem with truncated version:

$p_m = \Pr[X \le m]$

Run the truncated procedure and restart it repeatedly.

Page 68:

Y = total number of backtracks with restarts

Number of Restarts $= \lceil Y/m \rceil \sim \mathrm{Geometric}(p_m)$

$\bar{F}(y) = \Pr[Y > y] \le (1 - p_m)^{\lfloor y/m \rfloor} \le c_1 e^{-c_2 y}$

Y - does not have Heavy Tails
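
A simulation sketch of this argument. The heavy-tailed cost model (a rounded-up Pareto variable with tail index 0.5) is a hypothetical stand-in for the number of backtracks of a real solver, and the cutoff m = 4 is an arbitrary choice.

```python
import math
import random


def cost(rng):
    """Hypothetical heavy-tailed backtrack count: Pr[X > k] = k**-0.5 for integers k >= 1."""
    return math.ceil(rng.paretovariate(0.5))


def cost_with_restarts(rng, m):
    """Total backtracks Y when each run is cut off after m backtracks and restarted."""
    total = 0
    while True:
        x = cost(rng)          # fresh, independent run
        if x <= m:
            return total + x   # this run succeeds within the cutoff
        total += m             # run abandoned after m backtracks


rng = random.Random(4)
runs = 10_000
plain = [cost(rng) for _ in range(runs)]
restarted = [cost_with_restarts(rng, m=4) for _ in range(runs)]
print("mean without restarts:", sum(plain) / runs)
print("mean with restarts:   ", sum(restarted) / runs, "max:", max(restarted))
```

In this model each truncated run succeeds with probability p_m = 0.5, so the number of restarts is geometric and the total cost has an exponentially decaying tail, while the cost without restarts has infinite mean.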

Page 69:

Restart Strategies

• Restart with increasing cutoff - e.g., used by the Satisfiability community; cutoff increases linearly.

• Randomized backtracking – (Lynce et al 2001) randomizes the target decision points when backtracking (several variants).

• Random jumping (Zhang 2002) – the solver randomly jumps to unexplored portions of the search space; jumping decisions are based on analyzing the ratio between the space searched vs. the remaining search space; solved several open problems in combinatorics.

• Geometric restarts – (Walsh 99) – cutoff is increased geometrically.

• Learning restart strategies – (Kautz et al 2001 and Ruan et al 2002) – results on optimal policies for restarts under particular scenarios. Huge area for further research.

• Universal restart strategies (Luby et al 93) – seminal paper on optimal restart strategies for Las Vegas algorithms (theoretical paper).

Current state-of-the-art SAT solvers use restarts!!!
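
Three of the cutoff schedules listed above can be written down in a few lines. The sketch below generates the universal sequence of Luby et al. together with geometric and linear cutoffs; the starting values, ratio and step are arbitrary illustrative choices, not values recommended by the cited papers.

```python
from itertools import islice


def luby(i):
    """i-th term (i >= 1) of the universal sequence 1, 1, 2, 1, 1, 2, 4, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)


def geometric_cutoffs(first, ratio):
    """Cutoffs first, first*ratio, first*ratio**2, ..."""
    cutoff = first
    while True:
        yield cutoff
        cutoff *= ratio


def linear_cutoffs(first, step):
    """Cutoffs first, first+step, first+2*step, ..."""
    cutoff = first
    while True:
        yield cutoff
        cutoff += step


print([luby(i) for i in range(1, 16)])
print(list(islice(geometric_cutoffs(10, 2), 5)))
print(list(islice(linear_cutoffs(10, 10), 5)))
```

In a solver, each term would typically be multiplied by a base number of backtracks (or conflicts) to give the cutoff of the corresponding run.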

Page 70:

III - Understanding Tractable Sub-Structure in Combinatorial Problems

Page 71:

Backdoors

Page 72:

Defying NP-Completeness

Current state-of-the-art complete or exact solvers can handle very large instances of hard combinatorial problems:

We are dealing with formidable search spaces of exponential size --- to prove optimality we have to implicitly search the entire search space;

the problems we are able to solve are much larger than one would predict, given that such problems are in general NP-complete or harder.

Example – a random unsat 3-SAT formula in the phase transition region with over 1000 variables cannot be solved, while real-world sat and unsat instances with over 100,000 variables are solved in a few minutes.

Page 73:

A “real world” example

Page 74:

Bounded Model Checking instance:

i.e. ((not x1) or x7) and ((not x1) or x6)

and … etc.

Page 75:

(x177 or x169 or x161 or x153 … or x17 or x9 or x1 or (not x185))

clauses / constraints are getting more interesting…

10 pages later:

Page 76:

4000 pages later:

!!!!!!a 59-cnf clause…

Page 77:

Finally, 15,000 pages later:

The Chaff SAT solver (Princeton) solves this instance in less than one minute.

Note that: … !!!

What makes this possible?

Page 78:

Inference and Search

• Inference at each node of search tree:

– MIP uses LP relaxations and cutting planes;
– CP and SAT use domain reduction, constraint propagation, and no-good learning.

• Search

Different search enhancements in terms of variable and value selection strategies, probing, randomization, etc., while guaranteeing the completeness of the search procedure.

Page 79:

Tractable Problem Sub-structure

Real World Problems are also characterized by

Hidden tractable substructure in real-world problems.

Can we make this more precise?

We consider particular structures we call backdoors.

Page 80:

Backdoors

Page 81:

Backdoors: intuitions

Real World Problems are characterized by Hidden Tractable Substructure

BACKDOORS: Subset of “critical” variables such that once assigned a value the instance simplifies to a tractable class.

Explain how a solver can get “lucky” and solve very large instances

Page 82:

Backdoors to tractability

Informally: A backdoor to a given problem is a subset of its variables such that, once assigned values, the remaining instance simplifies to a tractable class (not necessarily syntactically defined).

Formally:

We define the notion of a “sub-solver” (which handles the tractable substructure of a problem instance).

Backdoors and strong backdoors

Page 83: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Defining a sub-solver

Page 84: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Note on Definition of Sub-solver

• The definition is general enough to encompass any polynomial-time propagation method used by state-of-the-art solvers:

– Unit propagation

– Arc consistency

– ALLDIFF

– Linear programming

– …

– Any polynomial-time solver

• The definition is also general enough to include even polytime solvers for which there is no clean syntactic characterization of the tractable subclass.

• Applies to CSP, SAT, MIP, etc.
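As a rough illustration of what such a sub-solver can look like, here is a minimal sketch that assumes unit propagation on a CNF formula as the polytime method. The function name, the DIMACS-style clause encoding, and the three status strings are choices made for this example, not notation from the talk. The sub-solver either decides the instance ("SAT" or "UNSAT") or gives up ("REJECT"), always in polynomial time.

```python
from typing import Dict, List, Optional, Tuple

Clause = List[int]            # DIMACS-style literals: 3 means x3, -3 means NOT x3
Assignment = Dict[int, bool]  # variable index -> truth value


def unit_propagation_subsolver(
    clauses: List[Clause], assignment: Optional[Assignment] = None
) -> Tuple[str, Assignment]:
    """Hypothetical sub-solver: returns ("SAT" | "UNSAT" | "REJECT", assignment)."""
    assign: Assignment = dict(assignment or {})
    while True:
        unit_literal = None
        all_satisfied = True
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var in assign:
                    if assign[var] == (lit > 0):
                        satisfied = True
                        break
                else:
                    unassigned.append(lit)
            if satisfied:
                continue
            if not unassigned:
                # Every literal of this clause is false: contradiction found.
                return "UNSAT", assign
            all_satisfied = False
            if len(unassigned) == 1 and unit_literal is None:
                unit_literal = unassigned[0]
        if all_satisfied:
            # Variables left unassigned can take any value.
            return "SAT", assign
        if unit_literal is None:
            # No unit clause left: propagation alone cannot decide the instance.
            return "REJECT", assign
        assign[abs(unit_literal)] = unit_literal > 0


if __name__ == "__main__":
    print(unit_propagation_subsolver([[1], [-1, 2], [-2, 3]]))
    print(unit_propagation_subsolver([[1, 2], [-1, -2]]))
```

The "REJECT" outcome is what leaves room for backdoors: the sub-solver is only required to be fast, not complete, and search over a few well-chosen variables supplies the rest.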

Page 85: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Defining backdoors

Given a combinatorial problem C:

Backdoors (for satisfiable instances):

Strong backdoors (apply to satisfiable or inconsistent instances):
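The formal statements from this slide are not reproduced in the transcript. As a hedged sketch, following the usual reading of Williams, Gomes, and Selman (2003): a set of variables is a backdoor if some assignment to it lets the sub-solver return a solution, and a strong backdoor if every assignment to it lets the sub-solver either return a solution or conclude inconsistency. The code below checks both by brute force, assuming unit propagation as the sub-solver; all function names are illustrative.

```python
from itertools import product


def unit_propagate(clauses, assign):
    """Compact unit-propagation sub-solver: returns 'SAT', 'UNSAT' or 'REJECT'."""
    assign = dict(assign)
    while True:
        unit, done = None, True
        for clause in clauses:
            if any(assign.get(abs(lit)) == (lit > 0) for lit in clause):
                continue                      # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assign]
            if not free:
                return "UNSAT"                # every literal is false
            done = False
            if len(free) == 1 and unit is None:
                unit = free[0]
        if done:
            return "SAT"
        if unit is None:
            return "REJECT"                   # the sub-solver cannot decide
        assign[abs(unit)] = unit > 0


def is_backdoor(clauses, subset):
    """Some assignment to `subset` lets the sub-solver find a solution."""
    return any(
        unit_propagate(clauses, dict(zip(subset, values))) == "SAT"
        for values in product([False, True], repeat=len(subset))
    )


def is_strong_backdoor(clauses, subset):
    """Every assignment to `subset` is decided (SAT or UNSAT) by the sub-solver."""
    return all(
        unit_propagate(clauses, dict(zip(subset, values))) != "REJECT"
        for values in product([False, True], repeat=len(subset))
    )


if __name__ == "__main__":
    unsat = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
    print(is_strong_backdoor(unsat, [1]), is_backdoor(unsat, [1]))
```

The check itself is exponential in the size of the subset, which is harmless when the subset is small; that is exactly the regime the talk is interested in.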

Page 86: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Example: Cycle-cutset

• Given an undirected graph, a cycle cutset is a subset of nodes whose removal leaves a graph without cycles.

• Once the cycle-cutset variables are instantiated, the remaining problem is a tree, solvable in polynomial time using arc consistency.

• A constraint network whose constraint graph has a cycle cutset of size c can be solved in time $O((n-c)\,k^{c+2})$, where n is the number of variables and k the domain size.

• Important: verifying that a set of nodes is a cutset (or a b-cutset) can be done in polynomial time (in the number of nodes).

(Dechter 93)
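The polynomial-time verification step can be sketched as follows: remove the candidate nodes and test whether what remains is a forest, here with a union-find pass over the remaining edges. The adjacency-dictionary representation and the function name are assumptions of this example; the graph is assumed to be undirected, with every node present as a key.

```python
from typing import Dict, Iterable, Set


def is_cycle_cutset(adjacency: Dict[str, Set[str]], cutset: Iterable[str]) -> bool:
    """True if removing `cutset` from the undirected graph leaves no cycle."""
    removed = set(cutset)
    remaining = [v for v in adjacency if v not in removed]
    parent = {v: v for v in remaining}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    seen_edges = set()
    for u in remaining:
        for w in adjacency[u]:
            edge = frozenset((u, w))
            if w in removed or edge in seen_edges:
                continue
            seen_edges.add(edge)
            root_u, root_w = find(u), find(w)
            if root_u == root_w:
                return False          # this edge closes a cycle
            parent[root_u] = root_w
    return True


if __name__ == "__main__":
    # Clique on 4 nodes: removing 2 nodes leaves a single edge (a tree).
    k4 = {v: {w for w in "ABCD" if w != v} for v in "ABCD"}
    print(is_cycle_cutset(k4, {"A", "B"}), is_cycle_cutset(k4, {"A"}))
```

On a clique of size 4, two nodes suffice and one does not, which matches the "clique of size k, cutset of size k-2" observation.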

Page 87: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

[Figure: example constraint graph; labels: “B”, “Cutset variable”.]

Clique of size k: cutset of size k-2.

Page 88: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Backdoors

• Can be viewed as a generalization of cutsets;

• Backdoors use a general notion of tractability based on a polytime sub-solver: they do not require a syntactic characterization of tractability;

• Backdoors factor in the semantics of the constraints with respect to the sub-solver and the values of the variables;

• Backdoors apply to different representations, including different semantics for graphs (e.g., network flows): CSP, SAT, MIP, etc.

Note: cutsets and W-cutsets (Dechter 93) base tractability solely on the structure of the constraint graph, independently of the semantics of the constraints.

Page 89: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Backdoors --- “seeing is believing”

Logistics_b.cnf planning formula. 843 vars, 7,301 clauses, approx min backdoor 16

Page 90: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Logistics.b.cnf after setting 5 backdoor vars (result after propagation; large cutsets);

Page 91: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

After setting just 12 (out of 800+) backdoor vars – problem almost solved.

Page 92: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Inductive inference problem --- ii16a1.cnf. 1650 vars, 19,368 clauses. Backdoor size 40.

Page 93: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

After setting 6 backdoor vars.

Page 94: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

After setting 38 (out of 1600+) backdoor vars:

Some other intermediate stages:

So: real-world structure is hidden in the network. Related to small-world networks, etc.

Page 95: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Backdoors: How the concept came about

Backdoors: the notion came about from an abstract formal model built to explain the high variance in the performance of current state-of-the-art solvers, in particular heavy-tailed behavior, and from our quest to understand the behavior of real solvers (propagation mechanisms, i.e., “sub-solvers”, are key).

The emphasis is not so much on proving that a set of variables is a backdoor (or that it is easy to find), but rather on the fact that if a (small) set of variables is a backdoor set, then, once those variables are assigned values, the polytime sub-solver will solve the resulting formula in polytime.

Surprisingly, real solvers are very good at finding small backdoors!

Page 96: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Backdoors: Quick detection of inconsistencies

• Detecting inconsistencies quickly: in logical reasoning, the ability to detect global inconsistency based on local information is very important, in particular in backtrack search (global solution);

• Tractable substructure helps in quickly recognizing global inconsistency: backdoors exploit the existence of substructures that are sufficient to prove global inconsistency;

• How does this help in solving satisfiable instances? By combining it with backtrack search: as we start setting variables, the sub-solver quickly recognizes inconsistencies and the search backtracks.

Page 97: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Formal Models: On the connections between backdoors and heavy-tailedness

Page 98: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Fat and heavy-tailed distributions:

Explain very long runs of complete solvers;

But also imply the existence of a wide range of solution times, often from very short runs to very long runs.

How to explain short runs?

Backdoors

Page 99: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Formal Model Yielding Heavy-Tailed Behavior

(Gomes 01; Chen, Gomes, and Selman 01)

[Figure: search tree for b = 2 with 1 backdoor.]

T: the number of leaf nodes visited up to and including the successful node; b: branching factor; p: probability of not finding the backdoor.

$P[T = b^i] = (1-p)\,p^i, \quad i \ge 0$

Trade-off: exponential decay in making wrong branching decisions versus exponential growth in the cost of mistakes.

(Inspired by work in information theory, Berlekamp et al. 1972)

Page 100: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

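Expected run time: $p \ge 1/b \;\Rightarrow\; E[T] \to \infty$ (infinite expected time)

Variance: $p \ge 1/b^2 \;\Rightarrow\; V[T] \to \infty$ (infinite variance)

Tail: $p \ge 1/b^2 \;\Rightarrow\; P[T > L] \ge C\,L^{-2}$ (heavy-tailed)

p: probability of not finding the backdoor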
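These regimes can be checked numerically. The sketch below assumes the one-backdoor model in which T takes the value b^i with probability (1 - p) p^i for i ≥ 0; the function names are illustrative. It computes the exact tail P[T > L] and the partial sums of E[T], which converge to (1 - p) / (1 - p b) when p < 1/b and grow without bound when p ≥ 1/b.

```python
def tail_probability(L, p, b):
    """P[T > L] when T = b**i with probability (1 - p) * p**i, i >= 0."""
    i = 0
    while b ** i <= L:
        i += 1
    # b**i is the smallest attainable value above L, and P[T >= b**i] = p**i.
    return p ** i


def truncated_mean(p, b, terms):
    """Partial sum of E[T] = sum_i (1 - p) * p**i * b**i over the first `terms` terms."""
    return sum((1 - p) * p ** i * b ** i for i in range(terms))


if __name__ == "__main__":
    b = 2
    # p < 1/b: the partial sums settle at a finite mean.
    print(truncated_mean(0.4, b, 200))
    # p >= 1/b: the partial sums keep growing with the number of terms.
    print(truncated_mean(0.6, b, 25), truncated_mean(0.6, b, 50))
    # Tail decays polynomially in L (here like 1/L), not exponentially.
    print([tail_probability(L, 0.5, b) for L in (1, 8, 64)])
```

With p = 0.5 and b = 2 the tail halves each time L doubles, i.e. it decays like a power of L, which is the signature of heavy-tailed behavior.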

Page 101: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

More than 1 backdoor

(Williams, Gomes, Selman 03)

Page 102: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Backdoors provide detailed formal model for heavy-tailed search behavior.

Can formally relate the size of the backdoor and the strength of the heuristic (captured by its failure probability to identify backdoor variables) to the occurrence of heavy tails in backtrack search.

Page 103: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Backdoors in real-world problem instances

Page 104: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Backdoors can be surprisingly small:

Backdoors explain how a solver can get “lucky” on certain runs, when the backdoors are identified early on in the search.

(large cutsets)

Page 105: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Synthetic Planning Domains

Synthetic domains, carefully crafted families of formulas:

• as simple as possible, enabling a full rigorous analysis;

• rich enough to provide insights into real-world domains.

Research questions: the relationship between problem structure, semantics of backdoors, backdoor size, and problem hardness.

Hoffmann, Gomes, Selman 2005

Page 106: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Synthetic Planning Domains

Three Synthetic Domains:

Structured Pigeon Hole (SPHnk); backdoor set O(n)

Synthetic Logistics Map Domain (MAPnk); backdoor set O(log n)

Synthetic Blocks World (SBWnk); backdoor set O(log n)

Each family is characterized by size (n) and a structure parameter (k);

Focus

DPLL – unit propagation;

Strong backdoors (for proving unsatisfiability)

Hoffmann, Gomes, Selman 2005

Page 107: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

[Figure: MAP domain graphs over location nodes (one instance labeled MAP813), shown for two cases: lim AsymRatio = 0 and lim AsymRatio = 1.]

Labels in the figure: backdoor set O(log n); backdoor set O(n^2); cutset of order n^2 in both cases; number of variables O(n^2) in both cases.

Note: the topology of the constraint graphs is identical for both cases. The size of the cutset is of the same order for both cases.

Page 108: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Semantics of Backdoors

• Let G be the set of goals in the planning problem; define:

$\mathrm{AsymRatio} = \dfrac{\max_{g \in G} \mathrm{cost}(g)}{\mathrm{cost}(G)}$

$\mathrm{AsymRatio} \in (0, 1]$

Intuition: if there is a sub-goal that requires more resources than the other sub-goals, it is the main reason for unsatisfiability; the larger the ratio, the easier it is to detect inconsistency.

Hoffmann, Gomes, Selman 2005

Page 109: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Asym Ratio – “Rovers” Domain (Simplified version of a NASA space application)

As AsymRatio increases, the hardness decreases (conjecture: smaller backdoors).

Similar results for other domains: Depots, Driverlog, Freecell, Zenotravel.

Page 110: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

MAP-6-7.cnf infeasible planning instance. Strong backdoor of size 3. 392 vars, 2,578 clauses.

Page 111: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Map 5 Top: running without backdoor

Page 112: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Map 5 Top: running with “a” backdoor (size 9 – not minimum)

Page 113: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Map 5 Top: running with minimum backdoor (size 3)

[Figure panels: initial graph; graph after setting 1 backdoor variable; graph after setting 2 backdoor variables; graph after setting three backdoor variables.]

In this graph a single variable is enough to prove inconsistency (with unit propagation).

Page 114: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Map 5 Top: running with backdoor (minimum, size 3)

[Figure panels: initial graph; after setting one backdoor; after setting two backdoors; after setting three backdoors.]

Page 115: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Exploiting Backdoors

Williams, Gomes, Selman 03/04

Page 116: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Algorithms

We cover three kinds of strategies for dealing with backdoors:

• A complete deterministic algorithm

• A complete randomized algorithm

– Provably better performance than the deterministic one

• A heuristically guided complete randomized algorithm

– Assumes the existence of a good heuristic for choosing variables to branch on

– We believe this is close to what happens in practice

Page 117: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Deterministic Generalized Iterative Deepening

Page 118: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Generalized Iterative Deepening

[Figure: all possible trees of depth 1: branch on x1 (x1 = 0, x1 = 1), on x2 (x2 = 0, x2 = 1), (…), on xn (xn = 0, xn = 1).]

Page 119: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Generalized Iterative Deepening: Level 2

[Figure: all possible trees of depth 2, starting with x1 (x1 = 0, x1 = 1) followed by x2 (x2 = 0, x2 = 1 under each branch).]

Page 120: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Generalized Iterative Deepening: Level 2

[Figure: all possible trees of depth 2, ending with xn-1 (xn-1 = 0, xn-1 = 1) followed by xn (xn = 0, xn = 1 under each branch).]

Level 3, level 4, and so on …
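The level-by-level enumeration can be sketched in code. This is a minimal illustration rather than the algorithm as analyzed by the authors: it assumes a CNF instance, uses unit propagation as the sub-solver, and tries all variable subsets of size 0, 1, 2, … with every assignment to each subset. All names are hypothetical, and `variables` is assumed to list every variable that occurs in the clauses.

```python
from itertools import combinations, product


def unit_propagate(clauses, assign):
    """Unit-propagation sub-solver: returns (status, assignment)."""
    assign = dict(assign)
    while True:
        unit, done = None, True
        for clause in clauses:
            if any(assign.get(abs(lit)) == (lit > 0) for lit in clause):
                continue                      # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assign]
            if not free:
                return "UNSAT", assign        # every literal is false
            done = False
            if len(free) == 1 and unit is None:
                unit = free[0]
        if done:
            return "SAT", assign
        if unit is None:
            return "REJECT", assign           # sub-solver cannot decide
        assign[abs(unit)] = unit > 0


def generalized_iterative_deepening(clauses, variables):
    """Try all subsets of size 0, 1, 2, ... until the sub-solver decides the instance."""
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            all_refuted = True
            for values in product([False, True], repeat=size):
                status, model = unit_propagate(clauses, dict(zip(subset, values)))
                if status == "SAT":
                    return "SAT", subset, model
                if status == "REJECT":
                    all_refuted = False
            if all_refuted:
                # Every assignment to the subset was refuted: a strong backdoor.
                return "UNSAT", subset, None
    raise AssertionError("unreachable if `variables` covers every variable")


if __name__ == "__main__":
    print(generalized_iterative_deepening([[1, 2], [-1, 2], [1, -2], [-1, -2]], [1, 2]))
    print(generalized_iterative_deepening([[1, 2], [-1, 3], [-2, -3]], [1, 2, 3]))
```

The procedure is complete because, at the last level, all variables are assigned and the sub-solver always decides. A returned model may leave some variables unassigned; those can take any value.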

Page 121: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Randomized Generalized Iterative Deepening

Assumption: there exists a backdoor whose size is bounded by a function of n (call it B(n)).

Idea: repeatedly choose random subsets of variables that are slightly larger than B(n), searching these subsets for the backdoor.
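A hedged sketch of this idea for CNF instances follows, again assuming unit propagation as the sub-solver. The parameters `bound` (standing in for B(n)), `slack` (how much larger the random sample is), `tries`, and `seed` are assumptions of this example. Capping the number of tries is a simplification: the strategy described in the talk keeps sampling until it succeeds.

```python
import random
from itertools import combinations, product


def unit_propagate(clauses, assign):
    """Unit-propagation sub-solver: returns 'SAT', 'UNSAT' or 'REJECT'."""
    assign = dict(assign)
    while True:
        unit, done = None, True
        for clause in clauses:
            if any(assign.get(abs(lit)) == (lit > 0) for lit in clause):
                continue                      # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assign]
            if not free:
                return "UNSAT"                # every literal is false
            done = False
            if len(free) == 1 and unit is None:
                unit = free[0]
        if done:
            return "SAT"
        if unit is None:
            return "REJECT"                   # sub-solver cannot decide
        assign[abs(unit)] = unit > 0


def randomized_backdoor_search(clauses, variables, bound, slack=1, tries=100, seed=0):
    """Sample variable sets of size bound + slack and search each for a backdoor."""
    rng = random.Random(seed)
    sample_size = min(len(variables), bound + slack)
    for _ in range(tries):
        sample = rng.sample(list(variables), sample_size)
        for size in range(1, bound + 1):
            for subset in combinations(sample, size):
                statuses = [
                    unit_propagate(clauses, dict(zip(subset, values)))
                    for values in product([False, True], repeat=size)
                ]
                if "SAT" in statuses:
                    return "SAT", subset
                if "REJECT" not in statuses:
                    return "UNSAT", subset    # every assignment refuted
    return "UNKNOWN", None                    # try budget exhausted


if __name__ == "__main__":
    clauses = [[1, 2], [-1, 2], [1, -2], [-1, -2], [3, 4]]
    print(randomized_backdoor_search(clauses, [1, 2, 3, 4], bound=1, slack=2))
```

In the usage example every sample of three variables contains x1 or x2, each of which is a strong backdoor of size 1, so the search succeeds on the first sample.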

Page 122: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Deterministic Versus Randomized

Suppose variables have 2 possible values (e.g., SAT).

For B(n) = n/k, the algorithm runtime is $c^n$.

[Plot: c as a function of k, with one curve for the deterministic strategy and one for the randomized strategy.]

The deterministic algorithm outperforms brute-force search for k > 4.2.

Page 123: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Complete Randomized Depth First Search with Heuristic

Assume we have the following:

DFS, a generic randomized depth-first backtrack search solver, with:

• a (polytime) sub-solver A;

• a heuristic H that (randomly) chooses variables to branch on, in polynomial time;

• H has probability 1/h of choosing a backdoor variable (h is a fixed constant).

Call this ensemble (DFS, H, A).

Page 124: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Polytime Restart Strategy for (DFS, H, A)

Essentially:

If there is a small backdoor, then (DFS, H, A) has a restart strategy that runs in polytime.
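To see why restarts help, consider a small calculation. It does not model (DFS, H, A) itself. It assumes the simpler one-backdoor model in which a single run costs T = b^i with probability (1 - p) p^i, and a fixed-cutoff strategy that abandons a run after b^m leaf nodes and starts afresh. The expected total cost is then E[min(T, cutoff)] / P[T ≤ cutoff], which is finite for every cutoff even when E[T] itself is infinite. Function and parameter names are illustrative.

```python
def expected_cost_with_restarts(p, b, m):
    """Expected total cost of restarting with fixed cutoff b**m in the model
    where one run costs b**i with probability (1 - p) * p**i, i >= 0."""
    cutoff = b ** m
    success_prob = 1 - p ** (m + 1)                       # P[T <= cutoff]
    # Contribution of runs that finish within the cutoff: E[T; T <= cutoff].
    success_cost = sum((1 - p) * p ** i * b ** i for i in range(m + 1))
    # Runs that exceed the cutoff are charged the cutoff itself.
    expected_run_cost = success_cost + (1 - success_prob) * cutoff
    # Runs are independent, so the number of runs is geometric.
    return expected_run_cost / success_prob


if __name__ == "__main__":
    # p = 0.6 >= 1/b, so a single run has infinite expected cost,
    # yet every fixed cutoff gives a finite expected total cost.
    for m in range(4):
        print(m, expected_cost_with_restarts(0.6, 2, m))
```

With cutoff 1 (m = 0), each run costs one leaf and succeeds with probability 0.4, so the expected total is 2.5 leaves, whereas a single run without restarts has no finite mean.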

Page 125: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Runtime Table for Algorithms

[Table: runtimes of the algorithms, including (DFS, H, A).]

B(n) = upper bound on the size of a backdoor, given n variables.

When the backdoor is a constant fraction of n, there is an exponential improvement between the randomized and the deterministic algorithm.

Page 126: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Exploiting Structure using Randomization: Summary

Over the past few years, randomization has become a powerful tool to boost the performance of complete (exact) solvers.

A very exciting new research area with success stories:

E.g., state-of-the-art complete SAT solvers use randomization.

Very effective when combined with no-good learning.

Page 127: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Exploiting Randomization in Backtrack Search: Summary

• Stochastic search methods (complete and incomplete) have been shown to be very effective.

• Restart strategies and portfolio approaches can lead to substantial improvements in the expected runtime and variance, especially in the presence of fat and heavy-tailed phenomena: a way of taking advantage of backdoors and tractable sub-structure.

• Randomization is therefore a tool to improve algorithmic performance and robustness.

Page 128: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Summary

Research questions:

Should we consider dramatically different algorithm design strategies leading to highly asymmetric distributions, with a good chance of short runs (even if that means also a good chance of long runs), that can be effectively exploited with restarts?

Page 129: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Summary

Notion of a “backdoor” set of variables:

• Captures the combinatorics of a problem instance, as dealt with in practice.

• Provides insight into restart strategies.

• Backdoors can be surprisingly small in practice.

• Search heuristics + randomization can be used to find them, provably efficiently.

Research issues:

• Understanding the semantics of backdoors

Page 130: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

Scientific Use of Experimentation: Take Home Message

Talk: described how scientific experimentation applied to the study of constrained problems has led us to the discovery and understanding of interesting computational phenomena, which in turn allowed us to design better algorithms.

It is unlikely that we would have discovered such phenomena by pure mathematical thinking / modeling.

Take home message:

In order to understand real-world constrained problems and scale up solutions, principled experimentation plays a role as important as formal models: the empirical study of phenomena is a sine qua non for the advancement of the field.

Page 131: Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010.

The End!

www.cs.cornell.edu/gomes