A Scalable Algorithm for Minimal Unsatisfiable Core Extraction
Nachum Dershowitz¹, Ziyad Hanna², Alexander Nadel¹,²
¹Tel-Aviv University, ²Intel
SAT’06 Conference, Seattle; 12.08.2006
Dec 16, 2015
Agenda
Introduction
Related Work
Complete Resolution Refutation (CRR) Algorithm
Resolution-Refutation-based Pruning (RRP)
Experimental Results
Introduction
What is unsatisfiable core extraction?
Given an unsatisfiable CNF formula (each parenthesized disjunction is a clause; a is a positive literal, ¬a is a negative literal):
F = ( a + b ) ( ¬b + c ) ( ¬c ) ( ¬a + c ) ( b + c )
An unsatisfiable core is an unsatisfiable subset of its clauses.
Introduction
What is unsatisfiable core extraction?
An unsatisfiable core is an unsatisfiable subset of the clauses of
F = ( a + b ) ( ¬b + c ) ( ¬c ) ( ¬a + c ) ( b + c )
Three cores of F:
U1 = ( a + b ) ( ¬b + c ) ( ¬c ) ( ¬a + c )
U2 = ( ¬b + c ) ( ¬c ) ( ¬a + c ) ( b + c )
U3 = ( ¬b + c ) ( ¬c ) ( b + c )
A core is minimal if removing any clause makes it satisfiable.
U1 and U3 are minimal. U2 is not minimal, since U3 ⊂ U2.
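The minimality claims can be checked by brute force on this small example. The sketch below is only an illustration: the integer encoding of literals (a=1, b=2, c=3, negative numbers for negated literals) and the helper names `satisfiable` and `is_minimal_core` are assumptions, and the cores are taken to be F with the clause(s) highlighted on the slide removed.

```python
from itertools import product

def satisfiable(clauses):
    """Brute-force SAT check; only suitable for tiny formulas."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def is_minimal_core(core):
    """A core is minimal if it is unsatisfiable and dropping any clause makes it satisfiable."""
    if satisfiable(core):
        return False
    return all(satisfiable(core[:i] + core[i + 1:]) for i in range(len(core)))

A, B, C = 1, 2, 3  # hypothetical encoding of the variables a, b, c
F = [(A, B), (-B, C), (-C,), (-A, C), (B, C)]
U1 = [(A, B), (-B, C), (-C,), (-A, C)]
U2 = [(-B, C), (-C,), (-A, C), (B, C)]
U3 = [(-B, C), (-C,), (B, C)]

for name, core in (("U1", U1), ("U2", U2), ("U3", U3)):
    print(name, "unsat:", not satisfiable(core), "minimal:", is_minimal_core(core))
```

Exhaustive enumeration is exponential in the number of variables, so this only serves to confirm the slide's example.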
Introduction
Our contribution: a Minimal Unsatisfiable Core (MUC) extraction algorithm that is
  practical: handles Formal Verification benchmarks
  faster than existing MUC algorithms
  able to find smaller cores than suboptimal methods
Related Work
Theoretical algorithms
Suboptimal algorithms
  Adaptive core search (Bruni et al., 2001)
  AMUSE (Oh et al., 2004)
  Empty-clause Cone (EC) (Zhang et al., 2003; Goldberg et al., 2003)
Algorithms guaranteeing minimality of the core
  MUP (Huang, 2005)
  Naïve
Related Work (Suboptimal)
Empty-clause Cone (EC) (Zhang et al., 2003; Goldberg et al., 2003)
  Modern SAT solvers produce a resolution refutation of a given unsatisfiable formula
  Each conflict clause is a resolvent of initial clauses or previously recorded conflict clauses
  The empty clause is the last conflict clause
  Initial clauses connected to the empty clause compose the unsatisfiable core
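The cone computation amounts to a backward traversal from the empty clause. The following is a minimal sketch, assuming the refutation is available as a map from each conflict clause to the clauses it was resolved from; the clause names, the `antecedents` map and the function `empty_clause_cone` are hypothetical, not the interface of any particular solver.

```python
def empty_clause_cone(antecedents, empty="EMPTY"):
    """Collect the initial clauses backward-reachable from the empty clause.

    `antecedents` maps each conflict clause name to the names of the clauses
    it was resolved from; initial clauses have no entry in the map.
    """
    core, seen, stack = set(), set(), [empty]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in antecedents:
            stack.extend(antecedents[node])  # conflict clause: keep walking back
        else:
            core.add(node)                   # initial clause: part of the core
    return core

# Hypothetical refutation of F = (a+b)(¬b+c)(¬c)(¬a+c)(b+c), with I1..I5 in that order.
antecedents = {
    "C1": ["I2", "I3"],      # ¬b
    "C2": ["I4", "I3"],      # ¬a
    "C3": ["I1", "C1"],      # a
    "C5": ["I5", "I2"],      # c, recorded but not used to derive the empty clause
    "EMPTY": ["C3", "C2"],
}
print(sorted(empty_clause_cone(antecedents)))
```

Here I5 is left out of the core because the only conflict clause derived from it is not connected to the empty clause.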
Related Work (Suboptimal)
Empty-clause Cone until Fixed Point (EC-fp) (Zhang et al., 2003)
  Invoke EC repeatedly on the resulting core until a fixed point is reached
EC and EC-fp characteristics
  Fast and scalable
  The only algorithms scalable on large benchmarks
  The resulting cores can still be reduced
Related Work (Naïve-MUC)
Naïve MUC
  For every clause I in formula F
    Invoke a SAT solver on F \ I
    If F \ I is unsatisfiable
      I does not belong to the MUC
      Remove I from F
  F is a Minimal Unsatisfiable Core
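The Naïve loop can be sketched as follows. This is an illustration only: a brute-force check stands in for the SAT solver, literals are encoded as signed integers (an assumption), and the input is assumed to be unsatisfiable.

```python
from itertools import product

def satisfiable(clauses):
    """Brute-force stand-in for a SAT solver; tiny formulas only."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def naive_muc(formula):
    """Try to drop each clause in turn; keep it only if dropping it makes the rest satisfiable."""
    core = list(formula)
    i = 0
    while i < len(core):
        rest = core[:i] + core[i + 1:]
        if not satisfiable(rest):
            core = rest   # clause i is not needed: remove it
        else:
            i += 1        # clause i belongs to the MUC: keep it
    return core

A, B, C = 1, 2, 3  # hypothetical encoding of a, b, c
F = [(A, B), (-B, C), (-C,), (-A, C), (B, C)]
print(naive_muc(F))
```

On F, processing the clauses left to right removes ( a + b ) and ( ¬a + c ) and leaves the core ( ¬b + c ) ( ¬c ) ( b + c ). Which minimal core is found depends on the order in which clauses are tried.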
CRR and Naïve
Naïve is the most efficient MUC algorithm on large FV benchmarks
CRR can be seen as a refinement of Naïve
  Always hold a resolution refutation of the current unsatisfiable core
  Check whether an initial clause I can be excluded by invoking a SAT solver on both
    the remaining initial clauses, except I (like Naïve)
    the conflict clauses that did not require I for their derivation
  If I can be excluded, a new resolution refutation, not containing I, is constructed
Complete Resolution Refutation (CRR) Algorithm: Resolution Refutation
A resolution refutation is a directed acyclic graph (dag) R: R( In ∪ Co , E )
  In: initial clauses, the sources of R
  Co: conflict clauses, including the empty clause, which is the only sink of R
  E: edges, the resolution relations between clauses
Complete Resolution Refutation (CRR) Algorithm: Definitions
Re(R, I) / ReE(R, I) / ReG(R, I): the vertices / edges / sub-graph reachable from I in R
UnRe(R, I): the vertices unreachable from I in R
A resolution refutation that contains only clauses connected to the empty clause is non-redundant
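These definitions reduce to forward reachability in the dag. A minimal sketch, assuming the refutation is given as a list of (antecedent, resolvent) edges over clause names; the names, the example edges and the helper functions are hypothetical. Re(R, I) is taken to include I itself, so that UnRe(R, I) excludes I.

```python
def reachable(edges, start):
    """Return (Re, ReE): vertices and edges reachable from `start`, including `start`."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)
    re_vertices, re_edges, stack = {start}, set(), [start]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            re_edges.add((u, v))
            if v not in re_vertices:
                re_vertices.add(v)
                stack.append(v)
    return re_vertices, re_edges

def unreachable(vertices, edges, start):
    """UnRe: all vertices of the refutation that are not reachable from `start`."""
    return set(vertices) - reachable(edges, start)[0]

# Hypothetical refutation of (a+b)(¬b+c)(¬c)(¬a+c), with I1..I4 in that order.
vertices = ["I1", "I2", "I3", "I4", "C1", "C2", "C3", "EMPTY"]
edges = [("I2", "C1"), ("I3", "C1"), ("I4", "C2"), ("I3", "C2"),
         ("I1", "C3"), ("C1", "C3"), ("C3", "EMPTY"), ("C2", "EMPTY")]

print(reachable(edges, "I1"))
print(sorted(unreachable(vertices, edges, "I1")))
```

ReG(R, I) is then the sub-graph formed by the returned vertex and edge sets.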
CRR by Example
[Figure sequence: a resolution refutation drawn as a dag over the variables a, b, c, d, with the initial clauses on the right; the literals of the individual clauses are not recoverable from the extracted text. The steps shown are listed below.]
Initial clauses are on the right: I1 to I8.
Build a non-redundant resolution refutation. One initial clause is dropped: the refutation contains I2 to I8 and the conflict clauses C2 to C6.
Consider clause I8 for removal.
Invoke the SAT solver on I’ = UnRe(I8). The solver doesn’t know about the resolution relations between these clauses; it receives them as plain input clauses I’1 to I’8.
The instance is unsatisfiable. The solver derives the conflict clauses C’1, C’2 and C’3.
A new refutation R’ is composed and ReG(I8) is dropped. R’ contains I2 to I7 and C3, C5, C7, C8, C9.
Make R’ non-redundant. The remaining conflict clauses are C3, C5, C7 and C8.
Consider I7 for removal. UnRe(I7) is satisfiable with a=b=c=d=0.
I7 is marked as belonging to a MUC. The refutation is not changed.
Every other initial clause also belongs to the MUC: I2 to I7 are all marked.
Complete Resolution Refutation (CRR) Algorithm
1. Build a resolution refutation R(In ∪ Co, E) using a SAT solver
2. Reduce R(In ∪ Co, E) to be non-redundant
3. While an unmarked clause exists in In
   1. I ← PickUnmarkedClause(In)
   2. Invoke a SAT solver on UnRe(R, I)
   3. If UnRe(R, I) is satisfiable then
      1. Mark I as a MUC member
   4. else
      1. Let R’(In’ ∪ Co’, E’) be the resolution refutation built by the solver
      2. In ← In \ {I}; Co ← (Co ∪ Co’) \ Re(R, I); E ← (E ∪ E’) \ ReE(R, I)
      3. Reduce R(In ∪ Co, E) to be non-redundant
4. Return In
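The loop above can be exercised end to end on toy inputs. The sketch below is not the authors' implementation: it replaces the SAT solver with naive saturation under resolution (the function `refute`), stores the refutation as a map from each derived clause to its two antecedents, and identifies clauses by their literal sets. All function names, the signed-integer literal encoding and the clause-picking rule are assumptions made for illustration.

```python
from itertools import combinations

EMPTY = frozenset()

def resolve(c1, c2):
    """All non-tautological resolvents of two clauses (frozensets of signed ints)."""
    out = []
    for lit in c1:
        if -lit in c2:
            r = (c1 - {lit}) | (c2 - {-lit})
            if not any(-x in r for x in r):
                out.append(frozenset(r))
    return out

def refute(clauses):
    """Toy stand-in for a SAT solver: saturate under resolution.
    Returns {derived clause: (antecedent, antecedent)} if the empty clause
    is derived, or None if the clause set is satisfiable."""
    known, parents, changed = set(clauses), {}, True
    while changed:
        changed = False
        for c1, c2 in combinations(list(known), 2):
            for r in resolve(c1, c2):
                if r not in known:
                    known.add(r)
                    parents[r] = (c1, c2)
                    changed = True
                    if r == EMPTY:
                        return parents
    return None

def reachable_from(clause, parents):
    """Re(R, I): the clause itself plus every clause derived from it."""
    reach, changed = {clause}, True
    while changed:
        changed = False
        for c, (p1, p2) in parents.items():
            if c not in reach and (p1 in reach or p2 in reach):
                reach.add(c)
                changed = True
    return reach

def prune(initial, parents):
    """Make the refutation non-redundant: keep only clauses connected to the empty clause."""
    keep, stack = set(), [EMPTY]
    while stack:
        c = stack.pop()
        if c not in keep:
            keep.add(c)
            stack.extend(parents.get(c, ()))
    return initial & keep, {c: p for c, p in parents.items() if c in keep}

def crr(formula):
    initial = set(formula)
    parents = refute(initial)
    if parents is None:
        raise ValueError("formula is satisfiable")
    initial, parents = prune(initial, parents)
    marked = set()
    while initial - marked:
        cand = min(initial - marked, key=sorted)      # arbitrary deterministic pick
        reach = reachable_from(cand, parents)
        unre = (initial | set(parents)) - reach       # UnRe(R, I)
        new = refute(unre)
        if new is None:
            marked.add(cand)                          # I belongs to the MUC
        else:
            parents = {c: p for c, p in parents.items() if c not in reach}
            parents.update(new)                       # merge the new refutation
            initial, parents = prune(initial - {cand}, parents)
    return initial

F = [frozenset(c) for c in [(1, 2), (-2, 3), (-3,), (-1, 3), (2, 3)]]
print(sorted(sorted(c) for c in crr(F)))
```

Saturation is exponential in general, so this only illustrates the bookkeeping: conflict clauses outside Re(R, I) are handed to the solver alongside the remaining initial clauses, and pruning after each successful removal can drop several initial clauses at once.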
CRR vs. Naïve
CRR reuses all relevant conflict clauses
  No need to re-derive important lemmas
CRR may remove a number of initial clauses simultaneously
  While reducing the resolution refutation to be non-redundant (at each stage of the algorithm)
CRR: More Features
CRR can be stopped at any time after the first resolution refutation is constructed
  Accepts time thresholds
There is room for improvement
  Work on the heuristic for picking clauses
  Hold the resolution refutation in memory, rather than on disk
  Resolution-Refutation-based Pruning (next)
Resolution-Refutation-based Pruning
For each I, speed up the check of whether I can be removed by using a certain property of ReG(I) to cut off the search space of the SAT solver invoked on UnRe(I)
RRP: Definitions
An assignment σ falsifies a clause I if every literal of I is 0 under σ
  σ = {a=0; b=0; c=1} falsifies I = ( a + b + ¬c )
We define an i-path in a resolution refutation to be a directed path starting with an initial clause and ending with the empty clause
An assignment falsifies an i-path if it falsifies every clause in the i-path
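As a small illustration of these two definitions, the sketch below represents a clause as a list of (variable, is_positive) pairs and an assignment as a dictionary; this representation and the function names are assumptions. A literal over an unassigned variable is not counted as falsified.

```python
def falsifies(sigma, clause):
    """True if every literal of `clause` evaluates to 0 under `sigma`."""
    return all(var in sigma and bool(sigma[var]) != positive
               for var, positive in clause)

def falsifies_path(sigma, path):
    """True if `sigma` falsifies every clause on the path."""
    return all(falsifies(sigma, clause) for clause in path)

sigma = {"a": 0, "b": 0, "c": 1}
clause = [("a", True), ("b", True), ("c", False)]   # ( a + b + ¬c )
print(falsifies(sigma, clause))

# The empty clause has no literals, so every assignment falsifies it.
print(falsifies_path(sigma, [clause, []]))
```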
RRP: Main Theorem
Theorem: Let R(In ∪ Co, E) be a resolution refutation, let I be an initial clause of R, and let σ be an assignment. If σ satisfies UnRe(I), then there exists an i-path, starting with I, that is falsified by σ.
Note: ReG(I) contains every i-path starting with I
RRP: Main Theorem by Example
There is one i-path starting with I7: {I7, C7, C8}
Any assignment σ satisfying UnRe(I7) falsifies the clauses I7, C7 and C8
  It must have {a=0; d=0; b=0}
Otherwise, σ would satisfy a vertex cut in R
  The empty clause is derivable from any vertex cut in R. Contradiction.
[Figure: the refutation from the CRR example (I2 to I7, C3, C5, C7, C8), with UnRe(I7) and the i-path starting with I7 marked.]
RRP: Theorem Application
The SAT solver should check whether UnRe(I) has a model
Every model of UnRe(I) must falsify some i-path in ReG(I)
So restrict the SAT solver to assignments that falsify some i-path in ReG(I)
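The property can be checked exhaustively on a small refutation. The sketch below uses a hypothetical non-redundant refutation of (a+b)(¬b+c)(¬c)(¬a+c) with a=1, b=2, c=3; the clause names, the edge list and the helper functions are assumptions for illustration. For each initial clause it enumerates all assignments and confirms that every model of UnRe(I) falsifies some i-path starting with I.

```python
from itertools import product

# Hypothetical non-redundant refutation; literals are signed integers.
clauses = {
    "I1": (1, 2), "I2": (-2, 3), "I3": (-3,), "I4": (-1, 3),
    "C1": (-2,), "C2": (-1,), "C3": (1,), "EMPTY": (),
}
edges = [("I2", "C1"), ("I3", "C1"), ("I4", "C2"), ("I3", "C2"),
         ("I1", "C3"), ("C1", "C3"), ("C3", "EMPTY"), ("C2", "EMPTY")]

def i_paths(start):
    """All directed paths from `start` to the empty clause."""
    if start == "EMPTY":
        return [["EMPTY"]]
    return [[start] + rest
            for u, v in edges if u == start
            for rest in i_paths(v)]

def lit_true(lit, sigma):
    return sigma[abs(lit)] == (lit > 0)

def satisfies(sigma, names):
    return all(any(lit_true(l, sigma) for l in clauses[n]) for n in names)

def falsifies_path(sigma, path):
    return all(not lit_true(l, sigma) for n in path for l in clauses[n])

def theorem_holds(start):
    paths = i_paths(start)
    # In a non-redundant refutation every clause reaches the empty clause,
    # so the vertices on i-paths from `start` are exactly Re(start).
    unre = set(clauses) - {n for p in paths for n in p}
    for bits in product([False, True], repeat=3):
        sigma = dict(zip((1, 2, 3), bits))
        if satisfies(sigma, unre) and not any(falsifies_path(sigma, p) for p in paths):
            return False
    return True

print({name: theorem_holds(name) for name in ("I1", "I2", "I3", "I4")})
```

For I3 there are two i-paths, and different models of UnRe(I3) falsify different ones, which is why the solver has to consider every i-path in ReG(I) rather than a single fixed path.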
RRP
The decision heuristic first invokes the RRPH function
  RRPH explores ReG(I) in DFS manner
  It always tries to falsify a certain i-path
  If RRPH returns a literal, it is picked as the decision literal; otherwise a normal decision heuristic is invoked
RRPB: a change in the backtracking engine
The currently visited clause D ∈ ReG(I), initialized to I, is maintained by RRPH and RRPB
RRPH: Decision Heuristic
[Figure: state diagram with the states Norm, Sat, False, EoT and EoP. Transition labels, written as condition / action:]
  D is neither satisfied nor falsified / Return the negation of an unassigned literal from D
  D is satisfied
  D is falsified
  D has a parent / D ← Par(D)
  D has no parent
  D has an unvisited child / D ← Child(D)
  All visited / D ← Par(D)
  D has no children
  True / Return ? (on two transitions)
RRPB: Backtracking Engine
On conflict, the solver may need to backtrack in ReG(C) in addition to regular backtracking
Let the backtracking level (in the search space) be bl
Denote by mdl(D) the maximal decision level of D’s literals
If bl < mdl(D)
  Let B be the first predecessor of D such that bl ≥ mdl(B)
  D ← B
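One possible reading of this rule is sketched below. It assumes that the clauses visited by RRPH form a path from I to the current clause D, that "predecessor" means an earlier clause on that path, that mdl ranges over the currently assigned literals only, and that the walk stops at I if no predecessor qualifies. None of these details are stated on the slide, and the names `mdl`, `rrpb_backtrack`, `path` and `level` are hypothetical.

```python
def mdl(clause, level):
    """Maximal decision level among the assigned literals of `clause` (0 if none)."""
    return max((level[abs(lit)] for lit in clause if abs(lit) in level), default=0)

def rrpb_backtrack(path, level, bl):
    """Walk back along the visited path until the current clause D has mdl(D) <= bl.

    `path` lists the visited clauses from I (first) to D (last);
    `level` maps each assigned variable to its decision level;
    `bl` is the level the solver backtracks to.
    """
    path = list(path)
    while len(path) > 1 and bl < mdl(path[-1], level):
        path.pop()          # D <- predecessor of D
    return path

level = {1: 1, 2: 2, 3: 3}              # hypothetical variable -> decision level
path = [(1,), (1, 2), (2, 3)]           # hypothetical clauses visited so far
print(rrpb_backtrack(path, level, bl=1))
print(rrpb_backtrack(path, level, bl=2))
```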
Experimental Results
We demonstrate that, on Formal Verification benchmark families:
  Our algorithm runs faster than other algorithms for MUC extraction
  Our algorithm finds smaller cores than the suboptimal algorithms
Experimental Results
We implemented CRR and RRP in a simplified version of the industrial solver Eureka
We used 4 Formal Verification families
  Barrel; Longmult; Fvp-unsat.2.0; Pipe_unsat_1.0
The relative resolution hardness of a resolution refutation R( In ∪ Co , E ) is
  ( | In | + | Co | ) / | In |
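As a quick arithmetic illustration of this ratio, the sketch below computes it from clause counts. The function name and the example counts are hypothetical and do not come from the benchmarks.

```python
def relative_resolution_hardness(num_initial, num_conflict):
    """(|In| + |Co|) / |In| for a refutation with the given clause counts."""
    if num_initial <= 0:
        raise ValueError("a refutation needs at least one initial clause")
    return (num_initial + num_conflict) / num_initial

# Hypothetical toy refutation: 4 initial clauses and 4 conflict clauses
# (counting the empty clause among the conflict clauses).
print(relative_resolution_hardness(4, 4))
```

A value close to 1 means that few conflict clauses were needed relative to the number of initial clauses; larger values indicate a refutation dominated by derived clauses.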
Experimental Results: Instances
Inst        Var     Cls      EC R.R. Hrd.
4pipe       4237    80213    1.4
4p_1_o      4647    74554    1.7
4p_2_o      4941    82207    1.7
4p_3_o      5233    89473    1.6
4p_4_o      5525    96480    1.6
3p_k        2391    27405    1.5
4p_k        5095    79489    1.5
5p_k        5525    189109   1.4
barrel5     1407    5383     1.8
barrel6     2306    8931     1.8
barrel7     3523    13765    1.9
barrel8     5106    20083    1.8
longmult4   1966    6069     2.6
longmult5   2397    7431     3.6
longmult6   2848    8853     5.6
longmult7   3319    10335    14.2
Comparing MUC Algorithms: Comparison by Time
[Chart: time in seconds (timeout is 86400) per instance for CRR-RRP, CRR-plain and EC-fp+Naïve. Instances: 4p, 4p1, 4p2, 4p3, 4p4 (F); 3pk, 4pk, 5pk (P); b5, b6, b7, b8 (B); l4, l5, l6, l7 (L).]
Experimental Results: MUC Algorithms
CRR vs. Naïve
  Plain CRR outperforms Naïve on every benchmark
  CRR+RRP outperforms Naïve on 15/16 benchmarks
  The speed-up is
    usually between 4x and 10x
    as high as 34x (hardest barrel instance)
    as low as 2.5x (hardest longmult instance)
Experimental Results: MUC Algorithms
RRP Impact
  RRP improves the performance on most instances
  The greatest speed-up is ~2.5x
  RRP is usually unhelpful only on the longmult family
Experimental Results: MUC Algorithms
The longmult family case
  Hard for CRR, even harder for RRP
  The reason is relative resolution hardness
    It reaches 14.2 for the hardest longmult instance
    It varies between 1.4 and 1.9 on every instance of the other families
Sizes of cores do not vary much between different MUC algorithms
Experimental Results: Suboptimal Algorithms
Next: Compare CRR and CRR+RRP with sub-optimal algorithms EC and EC-fp
Comparing UC Algorithms: Comparison by Core Size
[Chart: core size (axis from 0 to 50000) per instance for CRR-RRP, CRR-plain, EC and EC-fp. Instances: 4p, 4p1, 4p2, 4p3, 4p4 (F); 3pk, 4pk, 5pk (P); b5, b6, b7, b8 (B); l4, l5, l6, l7 (L).]
Comparing UC Algorithms: Comparison by Time
[Chart: time in seconds (axis from 0 to 20000) per instance for CRR-RRP, CRR-plain, EC and EC-fp. Instances: 4p, 4p1, 4p2, 4p3, 4p4 (F); 3pk, 4pk, 5pk (P); b5, b6, b7, b8 (B); l4, l5, l6, l7 (L).]
Experimental Results: CRR vs. Suboptimal Algorithms
CRR+RRP vs. suboptimal algorithms
  Core sizes
    Average gain over EC is 30%
    Average gain over EC-fp is 11%
  Execution time
    Usually, EC and EC-fp are orders of magnitude faster,
    but CRR+RRP is faster than EC-fp on the two hardest barrel instances
Conclusions
We presented:
  The Complete Resolution Refutation (CRR) algorithm for Minimal Unsatisfiable Core extraction
  Resolution-Refutation-based Pruning (RRP), enhancing CRR
Our algorithm is:
  Faster than existing MUC algorithms by a factor of 6 (or more) on large problems with non-overly hard resolution proofs
  Able to find smaller cores than suboptimal algorithms by 11% on average
Thanks!