A Scalable Algorithm for Minimal Unsatisfiable Core Extraction Nachum Dershowitz¹ Ziyad Hanna² Alexander Nadel¹, ² 1 Tel-Aviv University 2 Intel SAT’06.

A Scalable Algorithm for Minimal Unsatisfiable Core Extraction

Nachum Dershowitz¹, Ziyad Hanna², Alexander Nadel¹,²

¹ Tel-Aviv University, ² Intel

SAT’06 Conference, Seattle; 12.08.2006

Agenda

Introduction
Related Work
Complete Resolution Refutation (CRR) Algorithm
Resolution-Refutation-based Pruning (RRP)
Experimental Results

Introduction

What is unsatisfiable core extraction? Given an unsatisfiable CNF formula:

F = ( a + b ) ( ¬b + c ) ( ¬c ) ( ¬a + c ) ( b + c )

Each parenthesized disjunction is a clause; a is a positive literal and ¬b is a negative literal.


Introduction

An unsat. core is an unsatisfiable subset of its clauses:

F = ( a + b ) ( ¬b + c ) ( ¬c ) ( ¬a + c ) ( b + c )

U1 = ( a + b ) ( ¬b + c ) ( ¬c ) ( ¬a + c )

U2 = ( ¬b + c ) ( ¬c ) ( ¬a + c ) ( b + c )

U3 = ( ¬b + c ) ( ¬c ) ( b + c )

A core is minimal if the removal of any clause makes it satisfiable.

U1 and U3 are minimal. U2 is not minimal, since U3 ⊂ U2.
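The core and minimality claims for this three-variable example can be checked by brute force. The sketch below is only an illustration: the integer encoding (a, b, c as 1, 2, 3, negative numbers for negated literals) and the helper names are assumptions introduced here, not part of the talk.

```python
from itertools import product

# Variables a, b, c are encoded as 1, 2, 3; a negative integer is a negated literal.
A, B, C, D, E = (1, 2), (-2, 3), (-3,), (-1, 3), (2, 3)
F = [A, B, C, D, E]

def is_satisfiable(clauses, num_vars=3):
    """Try every assignment; only sensible for a tiny example like this one."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses):
            return True
    return False

def is_core(subset):
    """A core is an unsatisfiable subset of the clauses."""
    return not is_satisfiable(subset)

def is_minimal_core(subset):
    """Unsatisfiable, and dropping any single clause makes it satisfiable."""
    return is_core(subset) and all(
        is_satisfiable([c for c in subset if c != d]) for d in subset
    )

U1 = [A, B, C, D]
U2 = [B, C, D, E]
U3 = [B, C, E]
for name, core in [("U1", U1), ("U2", U2), ("U3", U3)]:
    print(name, "core:", is_core(core), "minimal:", is_minimal_core(core))
```

The loop reports all three sets as cores, and only U1 and U3 as minimal.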

Introduction

Our contribution: A Minimal Unsatisfiable Core (MUC) extraction algorithm

practical: handles Formal Verification benchmarks

faster than existing MUC algorithms

smaller cores than suboptimal methods


Related Work

Theoretical algorithms

Suboptimal algorithms
  Adaptive core search (Bruni et al., 2001)
  AMUSE (Oh et al., 2004)
  Empty-clause Cone (EC) (Zhang et al., 2003; Goldberg et al., 2003)

Algorithms guaranteeing minimality of the core
  MUP (Huang, 2005)
  Naïve

Related Work (Suboptimal)

Empty-clause Cone (EC) (Zhang et al., 2003; Goldberg et al., 2003)
  Modern SAT solvers produce a resolution refutation of a given unsatisfiable formula
  Each conflict clause is a resolvent of initial clauses or previously recorded conflict clauses
  The empty clause is the last conflict clause
  Initial clauses connected to the empty clause compose the unsatisfiable core

Related Work (Suboptimal)

Empty-clause Cone until Fixed Point (EC-fp) (Zhang et al., 2003)
  Invoke EC until a fixed point is reached

EC and EC-fp characteristics
  Fast and scalable
  The only algorithms scalable on large benchmarks
  The resulting cores can still be reduced

Related Work (Naïve-MUC)

Naïve MUC
  For every clause I in formula F
    Invoke a SAT solver on F \ I
    If F \ I is unsatisfiable
      I does not belong to the MUC
      Remove I from F
  F is a Minimal Unsatisfiable Core
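A minimal sketch of the Naïve loop, run on the introductory formula F. The brute-force `is_satisfiable` is a stand-in assumed here for the SAT solver call, and the integer clause encoding (a, b, c as 1, 2, 3) is likewise an assumption for illustration.

```python
from itertools import product

def is_satisfiable(clauses):
    """Brute-force stand-in for a SAT solver call (exponential; small inputs only)."""
    variables = sorted({abs(l) for cl in clauses for l in cl})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(l)] == (l > 0) for l in cl) for cl in clauses):
            return True
    return False

def naive_muc(formula):
    """Return a minimal unsatisfiable core of an unsatisfiable list of distinct clauses."""
    core = list(formula)
    for clause in list(formula):
        rest = [c for c in core if c != clause]
        if not is_satisfiable(rest):
            core = rest  # still unsatisfiable without this clause: drop it
        # otherwise the clause is needed and stays in the core
    return core

# F = (a + b)(¬b + c)(¬c)(¬a + c)(b + c) with a, b, c encoded as 1, 2, 3
F = [(1, 2), (-2, 3), (-3,), (-1, 3), (2, 3)]
print(naive_muc(F))  # [(-2, 3), (-3,), (2, 3)], i.e. the core U3
```

Each clause costs one solver call on the whole remaining formula, and nothing learned in one call is carried over to the next.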


CRR and Naïve

Naïve is the most efficient MUC algorithm on large FV benchmarks

CRR can be seen as a refinement of Naïve
  Always hold a resolution refutation of the current unsat. core
  Check if it is possible to exclude an initial clause I by invoking a SAT solver on both
    Remaining initial clauses, except I (like Naïve)
    Conflict clauses, s.t. I was not required to derive them
  If I can be excluded, a new resolution refutation, not containing I, is constructed

Complete Resolution Refutation (CRR) Algorithm: Resolution Refutation

A resolution refutation is a directed acyclic graph (dag) R: R(In ∪ Co, E)
  In: initial clauses, the sources of R
  Co: conflict clauses, including the empty clause, the only sink of R
  E: edges, the resolution relations between clauses

Complete Resolution Refutation (CRR) Algorithm: Definitions

Re(R, I) / ReE(R, I) / ReG(R, I): the vertices / edges / sub-graph reachable from I in R

UnRe(R, I): the vertices unreachable from I in R

A resolution refutation containing only clauses connected to the empty clause is non-redundant
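These definitions are plain graph reachability. The sketch below computes them on a small, made-up dag; the vertex names, the adjacency-dict representation and the function names are all hypothetical and chosen only for illustration.

```python
def reachable(edges, start):
    """Re(R, I): vertices reachable from `start` (itself included) along directed edges."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in edges.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def unreachable(edges, vertices, start):
    """UnRe(R, I): vertices of R that are not reachable from `start`."""
    return set(vertices) - reachable(edges, start)

def reachable_edges(edges, start):
    """ReE(R, I): edges whose source vertex is reachable from `start`."""
    return {(v, w) for v in reachable(edges, start) for w in edges.get(v, ())}

def non_redundant(edges, vertices, sink):
    """Vertices connected to the sink (the empty clause); the rest is redundant."""
    return {v for v in vertices if sink in reachable(edges, v)}

# Hypothetical refutation: I1..I4 are initial clauses, C1, C2 and "empty" are derived.
edges = {"I1": ["C1"], "I2": ["C1"], "I3": ["empty"], "C1": ["empty"], "I4": ["C2"]}
vertices = ["I1", "I2", "I3", "I4", "C1", "C2", "empty"]

print(sorted(reachable(edges, "I1")))                 # ['C1', 'I1', 'empty']
print(sorted(unreachable(edges, vertices, "I1")))     # ['C2', 'I2', 'I3', 'I4']
print(sorted(non_redundant(edges, vertices, "empty")))
```

In this toy graph I4 and C2 never lead to the empty clause, so reducing the refutation to be non-redundant drops them.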

CRR by Example

Initial clauses are on the right.

[Figure: the eight initial clauses I1 to I8 over the variables a, b, c and d.]

CRR by Example

Build a non-redundant resolution refutation. One initial clause is dropped.

[Figure: refutation over the initial clauses I2 to I8 with conflict clauses C2 to C6.]

CRR by Example

Consider clause I8 for removal.

[Figure: the refutation over I2 to I8 and C2 to C6.]

CRR by Example

Consider clause I8 for removal. Invoke the SAT solver on I’ = UnRe(I8).

[Figure: the refutation over I2 to I8 and C2 to C6, with the region UnRe(I8) marked.]

CRR by Example

Invoke the SAT solver on I’ = UnRe(I8). The solver does not know about the resolution relations between these clauses.

[Figure: the clauses of UnRe(I8), relabeled I’1 to I’8.]

CRR by Example

The instance is unsatisfiable.

[Figure: the solver's refutation of I’1 to I’8, with new conflict clauses C’1 to C’3.]

CRR by Example

A new refutation R’ is composed. ReG(I8) is dropped.

[Figure: R’ over the initial clauses I2 to I7 with conflict clauses C3, C5, C7, C8 and C9.]

CRR by Example

Make R’ non-redundant.

[Figure: R’ before and after the reduction: C9 is removed, leaving I2 to I7 with C3, C5, C7 and C8.]

CRR by Example

Consider I7 for removal.

[Figure: the refutation over I2 to I7 and C3, C5, C7, C8, with the region UnRe(I7) marked.]

CRR by Example

UnRe(I7) is satisfiable with a=b=c=d=0.

[Figure: the clauses of UnRe(I7), relabeled I’1 to I’7.]

CRR by Example

I7 is marked as belonging to a MUC. The refutation is not changed.

[Figure: the refutation over I2 to I7 and C3, C5, C7, C8, with I7 marked +.]

CRR by Example

Every other initial clause also belongs to the MUC.

[Figure: the same refutation, with every initial clause I2 to I7 marked +.]

Complete Resolution Refutation (CRR) Algorithm

1. Build a resolution refutation R(In ∪ Co, E) using a SAT solver
2. Reduce R(In ∪ Co, E) to be non-redundant
3. While an unmarked clause exists in In
   1. I ← PickUnmarkedClause(In)
   2. Invoke a SAT solver on UnRe(R, I)
   3. If UnRe(R, I) is satisfiable then
      1. Mark I as a MUC member
   4. else
      1. Let R’(In’ ∪ Co’, E’) be the resolution refutation built by the solver
      2. In ← In \ {I}; Co ← (Co ∪ Co’) \ Re(R, I); E ← (E ∪ E’) \ ReE(R, I)
      3. Reduce R(In ∪ Co, E) to be non-redundant
4. Return In
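The loop above can be sketched end to end on toy inputs. This is only an illustration under strong assumptions: the proof-producing SAT solver is replaced by naive resolution saturation (`refute`), which is exponential and usable only on tiny formulas; clauses are frozensets of signed integers; the refutation is stored as a map from each derived clause to its two parents. None of these names or representations come from the talk.

```python
from itertools import combinations

EMPTY = frozenset()

def resolvents(c1, c2):
    """All non-tautological resolvents of two clauses."""
    out = []
    for lit in c1:
        if -lit in c2:
            r = (c1 - {lit}) | (c2 - {-lit})
            if not any(-x in r for x in r):
                out.append(frozenset(r))
    return out

def refute(clauses):
    """Toy stand-in for a proof-logging SAT solver: saturate under resolution.
    Returns {derived clause: (parent, parent)} once the empty clause is derived,
    or None if the clause set is satisfiable."""
    known, parents, changed = set(clauses), {}, True
    while changed:
        changed = False
        for c1, c2 in combinations(list(known), 2):
            for r in resolvents(c1, c2):
                if r not in known:
                    known.add(r)
                    parents[r] = (c1, c2)
                    if r == EMPTY:
                        return parents
                    changed = True
    return None

def reduce_refutation(parents):
    """Keep only derivations backward-reachable from the empty clause."""
    keep, stack = {}, [EMPTY]
    while stack:
        c = stack.pop()
        if c in parents and c not in keep:
            keep[c] = parents[c]
            stack.extend(parents[c])
    return keep

def reachable(parents, start):
    """Re(R, I): clauses reachable from `start` along parent -> child edges."""
    reach, grew = {start}, True
    while grew:
        grew = False
        for child, (p1, p2) in parents.items():
            if child not in reach and (p1 in reach or p2 in reach):
                reach.add(child)
                grew = True
    return reach

def used_initial(parents):
    """Sources of the refutation: parents that are not themselves derived."""
    return {p for pair in parents.values() for p in pair if p not in parents}

def crr(formula):
    proof = refute(formula)
    if proof is None:
        raise ValueError("formula is satisfiable")
    parents = reduce_refutation(proof)
    initial, marked = used_initial(parents), set()
    while initial - marked:
        cand = next(iter(initial - marked))
        reach = reachable(parents, cand)
        unre = (initial | set(parents)) - reach          # UnRe(R, I)
        new = refute(unre)
        if new is None:
            marked.add(cand)                              # I is a MUC member
        else:
            merged = {c: p for c, p in parents.items() if c not in reach}
            merged.update(new)                            # add R', drop ReG(R, I)
            parents = reduce_refutation(merged)
            initial = used_initial(parents)
    return initial

A, B, C, D, E = (frozenset(c) for c in [(1, 2), (-2, 3), (-3,), (-1, 3), (2, 3)])
print(crr([A, B, C, D, E]))
```

On the introductory formula the result is one of the two minimal cores, U1 or U3; which one depends on the refutation the toy solver happens to find first.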

CRR vs. Naïve

CRR reuses all relevant conflict clauses
  No need to re-derive important lemmas

CRR may remove a number of initial clauses simultaneously
  While reducing the resolution refutation to be non-redundant (at each stage of the algorithm)

CRR: More Features

CRR can be stopped anytime after the first resolution refutation is constructed
  Accepts time thresholds

There is room for improvement
  Work on the heuristic for picking clauses
  Hold the resolution refutation in memory, rather than on disk
  Resolution-Refutation-based Pruning (next)


Resolution Refutation-based Pruning

For each I, speed up the check of whether I can be removed by using a certain property of ReG(I) to cut off the search space for the SAT solver invoked on UnRe(I)

RRP: Definitions

Definitions
  An assignment σ falsifies clause I if every literal of I is 0 under σ
    σ = {a=0; b=0; c=1} falsifies I = a + b + ¬c
  We define an i-path in a resolution refutation to be a directed path starting with an initial clause and ending with the empty clause
  An assignment falsifies an i-path if it falsifies every clause in the i-path
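The two falsification definitions can be checked mechanically. In the sketch below a literal is a (variable, polarity) pair and a clause is a list of literals; this representation, the function names and the sample i-path are assumptions made here for illustration.

```python
def falsifies(assignment, clause):
    """True if every literal of `clause` evaluates to 0 under `assignment`.
    A literal is (variable, polarity), with polarity 1 for x and 0 for ¬x.
    A literal over an unassigned variable does not count as falsified."""
    return all(
        var in assignment and assignment[var] != polarity
        for var, polarity in clause
    )

def falsifies_path(assignment, path):
    """True if the assignment falsifies every clause on the i-path."""
    return all(falsifies(assignment, clause) for clause in path)

sigma = {"a": 0, "b": 0, "c": 1}
I = [("a", 1), ("b", 1), ("c", 0)]        # a + b + ¬c
print(falsifies(sigma, I))                 # True

# Hypothetical i-path: I, then the resolvent a + b, then the empty clause.
path = [I, [("a", 1), ("b", 1)], []]
print(falsifies_path(sigma, path))         # True
```

The empty clause has no literals, so every assignment falsifies it; whether an i-path is falsified is therefore decided by its other clauses.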

RRP: Main Theorem

Theorem: Let R(In ∪ Co, E) be a resolution refutation, let I be an initial clause and let σ be an assignment. If σ satisfies UnRe(I), then there exists an i-path, starting with I, falsified by σ.

Note: ReG(I) contains every i-path starting with I

RRP: Main Theorem by Example

There is one i-path starting with I7: {I7, C7, C8}

Any assignment σ satisfying UnRe(I7) falsifies the clauses I7, C7 and C8
  σ must have {a=0; d=0; b=0}
  Otherwise, σ would satisfy a vertex cut in R
  The empty clause is derivable from any vertex cut in R. Contradiction.

[Figure: the refutation over I2 to I7 and C3, C5, C7, C8, with UnRe(I7) and the i-path marked.]

RRP: Theorem Application

The SAT solver should check whether UnRe(I) has a model

Every model of UnRe(I) must falsify some i-path in ReG(I)

Restrict the SAT solver to check only assignments that falsify some i-path in ReG(I)

RRP

The decision heuristic first invokes the RRPH function
  RRPH explores ReG(I) in a DFS manner
  It always tries to falsify a certain i-path
  If RRPH returns a literal, it is picked as the decision literal; otherwise, a normal decision heuristic is invoked

RRPB: a change in the backtracking engine

The currently visited clause D ∈ ReG(I), initialized to I, is maintained by RRPH and RRPB

RRPH: Decision Heuristic

[Figure: state diagram with the states Norm, Sat, False, EoT and EoP. Transition labels, written as condition / action:]
  D is neither satisfied nor falsified / Return the negation of an unassigned literal from D
  D is satisfied
  D is falsified
  D has a parent / D ← Par(D)
  D has no parent
  D has an unvisited child / D ← Child(D)
  All visited / D ← Par(D)
  D has no children
  True / Return ? (two transitions)

RRPB: Backtracking Engine

On conflict, the solver may need to backtrack in ReG(C) in addition to regular backtracking

Let the backtracking level (in the search space) be bl
Denote by mdl(D) the maximal decision level of D’s literals
If bl < mdl(D)
  Let B be the first predecessor of D such that bl ≥ mdl(B)
  D ← B
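The RRPB rule can be sketched as a walk back along the currently explored path. The code assumes the path from I to D is kept as a list and that mdl is available as a dictionary; both are hypothetical representations chosen here. Stopping at the first clause when no predecessor qualifies is also an assumption, since the slide does not cover that case.

```python
def rrpb_backtrack(path, mdl, bl):
    """Move the current clause D back after the solver backtracks to level `bl`.

    path: visited clauses in order, from I (first) to the current D (last).
    mdl:  maps each clause to the maximal decision level of its literals.
    Pops clauses while bl < mdl(D), so the returned clause B is the first
    predecessor with bl >= mdl(B). The first clause is never popped (assumption).
    """
    while len(path) > 1 and bl < mdl[path[-1]]:
        path.pop()
    return path[-1]

# Hypothetical path I -> C1 -> C2 with maximal decision levels 1, 3 and 5.
mdl = {"I": 1, "C1": 3, "C2": 5}
print(rrpb_backtrack(["I", "C1", "C2"], mdl, bl=3))   # C1
print(rrpb_backtrack(["I", "C1", "C2"], mdl, bl=6))   # C2 (no move needed)
```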


Experimental Results

We demonstrate that for benchmark Formal Verification families:
  Our algorithm runs faster than other algorithms for MUC extraction
  Our algorithm finds smaller cores compared to the suboptimal algorithms

Experimental Results

We implemented CRR and RRP in a simplified version of the industrial solver Eureka

We used 4 Formal Verification families: Barrel; Longmult; Fvp-unsat.2.0; Pipe_unsat_1.0

Relative resolution hardness of a resolution refutation R(In ∪ Co, E) is

( |In| + |Co| ) / |In|

Experimental Results: Instances

Inst        Var    Cls      EC R.R. Hrd.
4pipe       4237   80213    1.4
4p_1_o      4647   74554    1.7
4p_2_o      4941   82207    1.7
4p_3_o      5233   89473    1.6
4p_4_o      5525   96480    1.6
3p_k        2391   27405    1.5
4p_k        5095   79489    1.5
5p_k        5525   189109   1.4
barrel5     1407   5383     1.8
barrel6     2306   8931     1.8
barrel7     3523   13765    1.9
barrel8     5106   20083    1.8
longmult4   1966   6069     2.6
longmult5   2397   7431     3.6
longmult6   2848   8853     5.6
longmult7   3319   10335    14.2

Comparing MUC Algorithms: Comparison by Time

[Figure: running time in seconds (timeout is 86400) per instance for CRR-RRP, CRR-plain and EC-fp+Naïve. Instances on the x-axis: 4pipe, 4p_1_o, 4p_2_o, 4p_3_o, 4p_4_o, 3p_k, 4p_k, 5p_k, barrel5 to barrel8, longmult4 to longmult7.]

Experimental Results: MUC Algorithms CRR vs. Naïve

Plain CRR outperforms Naïve on every benchmark

CRR+RRP outperforms Naïve on 15/16 benchmarks

The speed-up is
  Usually between 4x and 10x
  Sometimes 34x (hardest barrel instance)
  Sometimes 2.5x (hardest longmult instance)

Experimental Results: MUC Algorithms RRP Impact

RRP improves the performance on most instances
  The greatest speed-up is ~2.5x
  RRP is unhelpful mainly on the longmult family

Experimental Results: MUC Algorithms longmult family case

Hard for CRR, even harder for RRP
  The reason is relative resolution hardness
    Reaches 14.2 for the hardest longmult instance
    Varies between 1.4 and 1.9 on every instance of the other families

Sizes of cores do not vary much between different MUC algorithms

Experimental Results: Suboptimal Algorithms

Next: Compare CRR and CRR+RRP with sub-optimal algorithms EC and EC-fp

Comparing UC Algorithms: Comparison by Core Size

[Figure: core size (axis from 0 to 50000) per instance for CRR-RRP, CRR-plain, EC and EC-fp. Instances on the x-axis: 4pipe, 4p_1_o, 4p_2_o, 4p_3_o, 4p_4_o, 3p_k, 4p_k, 5p_k, barrel5 to barrel8, longmult4 to longmult7.]

Comparing UC Algorithms: Comparison by Time

[Figure: running time in seconds (axis from 0 to 20000) per instance. Instances on the x-axis: 4pipe, 4p_1_o, 4p_2_o, 4p_3_o, 4p_4_o, 3p_k, 4p_k, 5p_k, barrel5 to barrel8, longmult4 to longmult7.]


Experimental Results: CRR vs. Suboptimal Algorithms

CRR+RRP vs. suboptimal algorithms
  Core sizes
    Average gain over EC is 30%
    Average gain over EC-fp is 11%
  Execution time
    Usually, EC and EC-fp are orders of magnitude faster, but CRR+RRP is faster than EC-fp on the two hardest barrel instances

Conclusions

We presented:
  The Complete Resolution Refutation (CRR) algorithm for Minimal Unsatisfiable Core extraction
  Resolution-Refutation-based Pruning (RRP), enhancing CRR

Our algorithm is:
  Faster than existing MUC algorithms by a factor of 6 (or more) on large problems with non-overly hard resolution proofs
  Able to find smaller cores than suboptimal algorithms by 11% on average

Thanks!
