Phase transition behaviour Toby Walsh Dept of CS University of York
Dec 11, 2015
Outline
What have phase transitions to do with computation?
How can you observe such behaviour in your favourite problem?
Is it confined to random and/or NP-complete problems?
Can we build better algorithms using knowledge about phase transition behaviour?
What open questions remain?
Health warning
To aid the clarity of my exposition, credit may not always be given where it is due
Many active researchers in this area: Achlioptas, Chayes, Dunne, Gent, Gomes, Hogg, Hoos, Kautz, Mitchell, Prosser, Selman, Smith, Stergiou, Stutzle, … Walsh
Where did this all start?
At least as far back as the 60s, with Erdos & Renyi's thresholds in random graphs
Late 80s: pioneering work by Karp, Purdom, Kirkpatrick, Huberman, Hogg, …
Flood gates burst with Cheeseman, Kanefsky & Taylor's IJCAI-91 paper
In 1991, I had just finished my PhD and was looking for some new research topics!
Phase transitions
Enough of the history, what has this got to do with computation?
Ice melts. Steam condenses. Now that’s a proper phase transition ...
An example phase transition
Propositional satisfiability (SAT): does a truth assignment exist that satisfies a propositional formula? NP-complete
3-SAT: formulae in clausal form with 3 literals per clause; remains NP-complete
(x1 v x2) & (-x2 v x3 v -x4)
x1/True, x2/False, ...
Random 3-SAT
Random 3-SAT: sample uniformly from the space of all possible 3-clauses; n variables, l clauses
Which are the hard instances? Around l/n = 4.3
What happens with larger problems?
Why are some dots red and others blue?
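This ensemble is easy to sample. A minimal sketch, assuming one standard version of the model (three distinct variables per clause, each negated with probability 1/2; the function name is mine):

```python
import random

def random_3sat(n, l, seed=None):
    """Sample a random 3-SAT formula: l clauses over n variables,
    each clause on 3 distinct variables, each literal negated
    with probability 1/2."""
    rng = random.Random(seed)
    formula = []
    for _ in range(l):
        chosen = rng.sample(range(1, n + 1), 3)
        formula.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return formula

# Hard instances cluster around the critical ratio l/n = 4.3
n, l = 50, 215  # l/n = 4.3
formula = random_3sat(n, l, seed=0)
```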
Random 3-SAT
Varying problem size, n
Complexity peak appears to be largely invariant of algorithm: backtracking algorithms like Davis-Putnam, local search procedures like GSAT
What’s so special about 4.3?
Random 3-SAT
Complexity peak coincides with solubility transition
l/n < 4.3 problems under-constrained and SAT
l/n > 4.3 problems over-constrained and UNSAT
l/n=4.3, problems on “knife-edge” between SAT and UNSAT
“But it doesn’t occur in X?”
X = some NP-complete problem
X = real problems
X = some other complexity class
Little evidence yet to support any of these claims!
“But it doesn’t occur in X?”
X = some NP-complete problem
Phase transition behaviour seen in:
TSP (decision not optimization)
Hamiltonian circuits (but NOT a complexity peak)
number partitioning
graph colouring
independent set
...
“But it doesn’t occur in X?”
X = real problems
No, you just need a suitable ensemble of problems to sample from?
Phase transition behaviour seen in:
job shop scheduling problems
TSP instances from TSPLib
exam timetables @ Edinburgh
Boolean circuit synthesis
Latin squares (alias sports scheduling)
...
“But it doesn’t occur in X?”
X = some other complexity class
Ignoring trivial cases (like O(1) algorithms)
Phase transition behaviour seen in:
polynomial problems like arc-consistency
PSPACE problems like QSAT and modal K
...
“But it doesn’t occur in X?”
X = theorem proving
Consider k-colouring planar graphs
k=3: simple counter-example
k=4: large proof
k=5: simple proof (in fact, a false proof of the k=4 case)
Locating phase transitions
How do you identify phase transition behaviour in your favourite problem?
What’s your favourite problem?
Choose a problem, e.g. number partitioning: dividing a bag of numbers into two so their sums are as balanced as possible
Construct an ensemble of problem instances: n numbers, each uniformly chosen from (0, l]
Other distributions work (Poisson, …)
Number partitioning
Identify a measure of constrainedness:
more numbers => less constrained
larger numbers => more constrained
Could try some measures out at random (l/n, log(l)/n, log(l)/sqrt(n), …)
Better still, use kappa! An (approximate) theory of constrainedness, based upon some simplifying assumptions, e.g. it ignores structural features that cluster solutions together
Theory of constrainedness
Consider the state space searched, e.g. the 10-d hypercube of 2^10 truth assignments for a 10 variable SAT problem
Compute the expected number of solutions, <Sol>, under independence assumptions (often useful and harmless!)
Theory of constrainedness
Constrainedness given by: kappa = 1 - log2(<Sol>)/n, where n is the dimension of the state space
kappa lies in the range [0, infty):
kappa=0: <Sol>=2^n, under-constrained
kappa=infty: <Sol>=0, over-constrained
kappa=1: <Sol>=1, critically constrained (phase boundary)
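As a worked example under the independence assumption: for random 3-SAT each 3-clause rules out 1/8 of the assignments, so <Sol> = 2^n (7/8)^l, and kappa reduces to (l/n) log2(8/7), i.e. roughly l/5.2n. A small sketch (function name mine):

```python
import math

def kappa_3sat(l, n):
    """kappa = 1 - log2(<Sol>)/n with <Sol> = 2^n * (7/8)^l:
    each 3-clause independently excludes 1/8 of the 2^n assignments."""
    log2_expected_sols = n + l * math.log2(7 / 8)
    return 1 - log2_expected_sols / n

# At the observed transition l/n = 4.3, kappa is just below 1
print(round(kappa_3sat(430, 100), 2))  # → 0.83
```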
Phase boundary
Markov inequality: prob(Sol > 0) <= <Sol>
Now, kappa > 1 implies <Sol> < 1
Hence, kappa > 1 implies prob(Sol > 0) < 1
Phase boundary typically at values of kappa slightly smaller than kappa=1:
skew in the distribution of solutions (e.g. 3-SAT)
non-independence
Examples of kappa
3-SAT: kappa = l/5.2n, phase boundary at kappa=0.82
3-COL: kappa = e/2.7n, phase boundary at kappa=0.84
number partitioning: kappa = log2(l)/n, phase boundary at kappa=0.96
Finite-size scaling
Simple “trick” from statistical physics: around the critical point, problems are indistinguishable except for a change of scale given by a simple power-law
Define the rescaled parameter: gamma = (kappa - kappac)/kappac . n^(1/v)
Estimate kappac and v empirically, e.g. for number partitioning, kappac=0.96, v=1
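The rescaling itself is one line. A sketch, with the empirical number-partitioning estimates as defaults (function name mine):

```python
def rescaled_parameter(kappa, n, kappa_c=0.96, v=1.0):
    """gamma = (kappa - kappa_c)/kappa_c * n^(1/v).
    Plotting cost or solubility against gamma (rather than kappa)
    should collapse the curves for different problem sizes n."""
    return (kappa - kappa_c) / kappa_c * n ** (1.0 / v)
```

At the critical point kappa = kappa_c, gamma is zero for every problem size, which is what pins the rescaled curves together.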
Easy-Hard-Easy?
Search cost only easy-hard here? Optimization, not decision, search cost!
Easy if there are a (large number of) perfect partitions; otherwise little pruning (search scales as 2^0.85n)
Phase transition behaviour less well understood for optimization than for decision
Sometimes optimization = sequence of decision problems (e.g. branch & bound), BUT lots of subtle issues lurking?
Algorithms at the phase boundary
What do we understand about problem hardness at the phase boundary? How can this help build better algorithms?
Looking inside search
Three key insights:
constrainedness “knife-edge”
backbone structure
2+p-SAT
Suggests branching heuristics; also insight into branching mistakes
Inside SAT phase transition
Random 3-SAT, l/n = 4.3
Davis-Putnam algorithm: tree search through the space of partial assignments, with unit propagation
Clause to variable ratio l/n drops as we search => problems become less constrained
Aside: can anyone explain the simple scaling of l/n against depth/n?
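The search procedure being instrumented here can be written in a few lines. A minimal DPLL-style sketch, just to fix ideas (not Davis-Putnam as originally formulated, and with a deliberately naive branching rule):

```python
def unit_propagate(clauses, assignment):
    """Simplify clauses under the current assignment, repeatedly
    asserting any unit clauses, until fixpoint or conflict."""
    while True:
        simplified, unit = [], None
        for clause in clauses:
            lits = [l for l in clause if -l not in assignment]
            if any(l in assignment for l in lits):
                continue  # clause already satisfied
            if not lits:
                return None  # empty clause: conflict
            if len(lits) == 1 and unit is None:
                unit = lits[0]
            simplified.append(lits)
        if unit is None:
            return simplified
        assignment.add(unit)
        clauses = simplified

def dpll(clauses, assignment=None):
    """Backtracking search over partial assignments with unit
    propagation; returns a satisfying set of literals, or None."""
    assignment = set(assignment or ())
    clauses = unit_propagate(clauses, assignment)
    if clauses is None:
        return None  # conflict: backtrack
    if not clauses:
        return assignment  # every clause satisfied
    var = abs(clauses[0][0])  # naive branching choice
    for lit in (var, -var):
        model = dpll(clauses, assignment | {lit})
        if model is not None:
            return model
    return None
```

Recording l/n (and average clause length) at each node of such a search is how the knife-edge plots are produced.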
Inside SAT phase transition
But (average) clause length k also drops => problems become more constrained
Which factor wins, l/n or k? Look at kappa, which includes both!
Aside: why is there again such simple scaling of clause length k against depth/n?
Constrainedness knife-edge
Seen in other problem domains: number partitioning, …
Seen on “real” problems: exam timetabling (alias graph colouring)
Suggests a branching heuristic: “get off the knife-edge as quickly as possible”, i.e. minimize- or maximize-kappa heuristics
Must take into account the branching rate; max-kappa often therefore not a good move!
Minimize constrainedness
Many existing heuristics minimize-kappa, or proxies for it
For instance: the Karmarkar-Karp heuristic for number partitioning, the Brelaz heuristic for graph colouring, the fail-first heuristic for constraint satisfaction, …
Can be used to design new heuristics, removing some of the “black art”
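As an illustration, the Karmarkar-Karp differencing heuristic for number partitioning: repeatedly commit the two largest numbers to opposite sides, replacing them by their difference. A sketch that returns the partition difference achieved (not the partition itself):

```python
import heapq

def karmarkar_karp(numbers):
    """Karmarkar-Karp differencing: repeatedly replace the two
    largest numbers by their difference, until one number remains.
    That number is the difference between the two partition sums."""
    heap = [-x for x in numbers]  # max-heap via negation
    heapq.heapify(heap)
    while len(heap) > 1:
        a, b = -heapq.heappop(heap), -heapq.heappop(heap)
        heapq.heappush(heap, -(a - b))
    return -heap[0] if heap else 0

print(karmarkar_karp([8, 7, 6, 5, 4]))  # → 2
```

On [8, 7, 6, 5, 4] it returns 2, while the optimal difference is 0 (8+7 vs 6+5+4): a greedy heuristic, not an exact method.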
Backbone
Variables which take fixed values in all solutions (alias unit prime implicates)
Let fk be the fraction of variables in the backbone
l/n < 4.3: fk vanishing (otherwise adding a clause could make the problem unsat)
l/n > 4.3: fk > 0
Discontinuity at the phase boundary!
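For small instances the backbone can be computed directly by enumerating all satisfying assignments. A brute-force sketch (clauses as lists of signed integers, an encoding I am assuming; only feasible for small n):

```python
from itertools import product

def backbone(clauses, n):
    """Return the set of backbone variables: those taking the same
    value in every satisfying assignment. None if UNSAT."""
    solutions = []
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            solutions.append(bits)
    if not solutions:
        return None  # UNSAT: no backbone defined
    return {v + 1 for v in range(n)
            if len({s[v] for s in solutions}) == 1}
```

For example, with clauses [[1], [-1, 2]] the only solution sets both variables true, so the backbone is {1, 2}; with the single clause [[1, 2]] neither variable is fixed, so the backbone is empty.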
Backbone
Search cost correlated with backbone size: if fk is non-zero, then we can easily assign a variable the “wrong” value; such mistakes are costly if at the top of the search tree
Backbones seen in other problems: graph colouring, TSP, …
Can we make algorithms that identify and exploit the backbone structure of a problem?
2+p-SAT
Morph between 2-SAT and 3-SAT: fraction p of 3-clauses, fraction (1-p) of 2-clauses
2-SAT is polynomial (linear): phase boundary at l/n = 1, but no backbone discontinuity there!
2+p-SAT maps from P to NP: for p > 0, 2+p-SAT is NP-complete
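An ensemble of such instances can be sampled much like random 3-SAT. In this sketch each clause is independently a 3-clause with probability p (one reading of “fraction p”; exact-fraction versions also make sense, and the function name is mine):

```python
import random

def random_2p_sat(n, l, p, seed=None):
    """2+p-SAT: each of the l clauses is a 3-clause with probability p,
    otherwise a 2-clause; variables and signs chosen uniformly."""
    rng = random.Random(seed)
    formula = []
    for _ in range(l):
        k = 3 if rng.random() < p else 2
        chosen = rng.sample(range(1, n + 1), k)
        formula.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return formula
```

With p=0 this is random 2-SAT, with p=1 random 3-SAT, so varying p morphs continuously between the two.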
2+p-SAT
fk only becomes discontinuous above p=0.4, but 2+p-SAT is NP-complete for all p > 0!
Search cost shifts from linear to exponential at p=0.4
Recent work on backbone fragility
(plot of search cost against n)
Structure
Can we model structural features not found in uniform random problems? How does such structure affect our algorithms and phase transition behaviour?
The real world isn’t random?
Very true! Can we identify structural features common in real world problems?
Consider graphs met in real world situations: social networks, electricity grids, neural networks, ...
Real versus Random
Real graphs tend to be sparse; dense random graphs contain lots of (rare?) structure
Real graphs tend to have short path lengths, as do random graphs
Real graphs tend to be clustered, unlike sparse random graphs
L: average path length
C: clustering coefficient (fraction of neighbours connected to each other, a cliqueness measure)
mu: proximity ratio, C/L normalized by that of a random graph of the same size and density
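Both statistics are straightforward to compute on an adjacency-set representation; the proximity ratio mu then just divides them by the corresponding values for a random graph of the same size and density. A sketch:

```python
from collections import deque

def clustering(adj):
    """C: average over nodes of the fraction of a node's neighbour
    pairs that are themselves connected (cliqueness measure)."""
    total, count = 0.0, 0
    for v, neighbours in adj.items():
        nbrs = list(neighbours)
        k = len(nbrs)
        if k < 2:
            continue  # clustering undefined for degree < 2
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        total += 2.0 * links / (k * (k - 1))
        count += 1
    return total / count if count else 0.0

def avg_path_length(adj):
    """L: mean shortest-path length over all connected pairs,
    via breadth-first search from every node."""
    total, pairs = 0, 0
    for src in adj:
        dist, queue = {src: 0}, deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(d for v, d in dist.items() if v != src)
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0
```

A triangle has C = 1 and L = 1; a three-node path has C = 0 and L = 4/3, already showing how the two measures pull apart.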
Small world graphs
Sparse, clustered, short path lengths
Six degrees of separation: Stanley Milgram’s famous 1967 postal experiment, recently revived by Watts & Strogatz
Shown to apply to: the actors database, the US electricity grid, the neural net of a worm, ...
An example
1994 exam timetable at Edinburgh University: 59 nodes, 594 edges, so relatively sparse, but contains a 10-clique
Less than 10^-10 chance in a random graph of the same size and density
The clique totally dominated the cost to solve the problem
Small world graphs
To construct an ensemble of small world graphs, morph between a regular graph (like a ring lattice) and a random graph: with probability p include an edge from the ring lattice, with probability 1-p one from the random graph
Real problems often contain similar structure and stochastic components?
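A sketch of that construction (names and details mine): with p=1 you get the pure ring lattice, with p=0 something close to a uniform random graph of the same density.

```python
import random

def small_world_morph(n, k, p, seed=None):
    """Morph between a ring lattice (each node joined to its k nearest
    neighbours on each side) and a random graph: keep each lattice edge
    with probability p, otherwise substitute a uniformly random edge.
    Returns a set of undirected edges (a, b) with a < b; the total
    edge count can vary slightly when random edges collide."""
    rng = random.Random(seed)
    edges = set()
    for v in range(n):
        for d in range(1, k + 1):
            if rng.random() < p:
                edges.add(tuple(sorted((v, (v + d) % n))))
            else:
                while True:
                    a, b = rng.sample(range(n), 2)
                    e = tuple(sorted((a, b)))
                    if e not in edges:
                        edges.add(e)
                        break
    return edges
```

Sweeping p and plotting C and L for each graph reproduces the small world regime: clustering stays high while path lengths collapse.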
Small world graphs
The ring lattice is clustered but has long paths; random edges provide shortcuts without destroying clustering
Small world graphs
Other bad news: disease spreads more rapidly in a small world
Good news: cooperation breaks out quicker in the iterated Prisoner’s dilemma
Other structural features
It’s not just small world graphs that have been studied
Large degree graphs: Barabasi et al’s power-law model
Ultrametric graphs: Hogg’s tree based model
Numbers following Benford’s Law: 1 is much more common than 9 as a leading digit!
prob(leading digit = i) = log10(1 + 1/i)
Such clustering makes number partitioning much easier
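The law itself is one line; a sketch:

```python
import math

def benford(i):
    """Benford's Law: prob(leading digit = i) = log10(1 + 1/i)."""
    return math.log10(1 + 1 / i)

# 1 is far more common than 9 as a leading digit
print(round(benford(1), 3), round(benford(9), 3))  # → 0.301 0.046
```

The nine probabilities telescope to log10(10) = 1, so they form a proper distribution over leading digits.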
Open questions
Prove the random 3-SAT transition occurs at l/n = 4.3
random 2-SAT proved to be at l/n = 1
random 3-SAT transition proved to be in the range 3.003 < l/n < 4.506
random 3-SAT phase transition proved to be “sharp”
2+p-SAT: a heuristic argument based on replica symmetry predicts the discontinuity at p=0.4; prove it exactly!
Open questions
Impact of structure on phase transition behaviour:
some initial work on quasigroups (alias Latin squares/sports tournaments)
morphing a useful tool (e.g. small worlds, 2-d to 3-d TSP, …)
Optimization v decision:
some initial work by Slaney & Thiebaux
problems in which the optimized quantity appears in the control parameter, and those in which it does not
Open questions
Does phase transition behaviour give insights to help answer P=NP? It certainly identifies hard problems! Problems like 2+p-SAT and ideas like the backbone also show promise
But problems away from the phase boundary can be hard to solve:
the over-constrained 3-SAT region has exponential resolution proofs
the under-constrained 3-SAT region can throw up occasional hard problems (early mistakes?)
Conclusions
Phase transition behaviour ubiquitous: decision/optimization/..., NP/PSpace/P/…, random/real
Phase transition behaviour gives insight into problem hardness: suggests new branching heuristics; ideas like the backbone help understand branching mistakes
Conclusions
AI becoming more of an experimental science? Theory and experiment complement each other well; increasing use of approximate/heuristic theories to keep theory in touch with rapid experimentation
Phase transition behaviour is FUN: lots of nice graphs as promised, and it is teaching us lots about complexity and algorithms!
Very partial bibliography
Cheeseman, Kanefsky & Taylor, Where the Really Hard Problems Are, Proc. of IJCAI-91
Gent et al, The Constrainedness of Search, Proc. of AAAI-96
Gent & Walsh, The TSP Phase Transition, Artificial Intelligence, 88:349-358, 1996
Gent & Walsh, Analysis of Heuristics for Number Partitioning, Computational Intelligence, 14 (3), 1998
Gent & Walsh, Beyond NP: The QSAT Phase Transition, Proc. of AAAI-99
Gent et al, Morphing: Combining Structure and Randomness, Proc. of AAAI-99
Hogg & Williams (eds), special issue of Artificial Intelligence, 88 (1-2), 1996
Mitchell, Selman & Levesque, Hard and Easy Distributions of SAT Problems, Proc. of AAAI-92
Monasson et al, Determining Computational Complexity from Characteristic ‘Phase Transitions’, Nature, 400, 1999
Walsh, Search in a Small World, Proc. of IJCAI-99
Watts & Strogatz, Collective Dynamics of ‘Small-World’ Networks, Nature, 393, 1998