Practical Partition-Based Theorem Proving for Large Knowledge Bases
Bill MacCartney (Stanford KSL), Sheila A. McIlraith (Stanford KSL), Eyal Amir (UC Berkeley), Tomas Uribe (SRI)
with thanks to Mark Stickel (SRI)
Dec 21, 2015
8/14/03, Bill MacCartney, Stanford KSL
Motivation
• Goal: to enable automated reasoners to exploit the implicit structure of large knowledge bases
• Reasoners in big KBs face combinatorial explosion
  – Making headway often requires KB-specific manual tuning
• But large commonsense KBs contain structure
  – Loosely coupled clusters of domain knowledge
• Partitioning aims to speed reasoning by:
  – Decomposing the graph structure of the KB into a tree of partitions
  – Propagating results between partitions using message passing
  – Thereby focusing proof search and ignoring the irrelevant
Outline
• Background: partition-based reasoning
  – Algorithms for automatic partitioning of large KBs
  – The MP algorithm for reasoning with partitions
• Experimental evaluation of MP
• Partition-derived ordering (PDO)
  – Automatic alternative to hand-crafted symbol orderings
• MP with focused support (MFS)
  – Enhancing vanilla MP with a smart within-partition strategy
• Combinations of strategies
  – Can outperform set-of-support by 10x or more
The espresso machine theory
(1) ¬ok-pump ∨ ¬on-pump ∨ water
(2) ¬man-fill ∨ water
(3) ¬man-fill ∨ ¬on-pump
(4) man-fill ∨ on-pump
(5) ¬water ∨ ¬ok-boiler ∨ ¬on-boiler ∨ steam
(6) water ∨ ¬steam
(7) on-boiler ∨ ¬steam
(8) ok-boiler ∨ ¬steam
(9) ¬steam ∨ ¬coffee ∨ hot-drink
(10) ¬steam ∨ ¬tea ∨ hot-drink
(11) coffee ∨ tea
A simple KB of propositional logic
(we normally use first-order logic)
Automatic partitioning
Step 1: construct symbol graph
  – Nodes are symbols in the KB
  – Edges connect symbols that appear together in an axiom
  – The symbol graph captures the structure of the KB
[Figure: symbol graph of the espresso machine theory, with nodes ok-pump, on-pump, man-fill, water, ok-boiler, on-boiler, steam, coffee, tea, hot-drink]
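Step 1 can be sketched in a few lines of Python. This is only an illustration: the names `AXIOM_SYMBOLS` and `symbol_graph` are hypothetical, and each axiom of the espresso machine theory is reduced to the set of symbols it mentions, since only co-occurrence matters for the graph.

```python
from itertools import combinations

# Symbols mentioned by axioms (1)-(11) of the espresso machine theory.
# Signs and connectives are omitted: the symbol graph only needs co-occurrence.
AXIOM_SYMBOLS = [
    {"ok-pump", "on-pump", "water"},
    {"man-fill", "water"},
    {"man-fill", "on-pump"},
    {"man-fill", "on-pump"},
    {"water", "ok-boiler", "on-boiler", "steam"},
    {"water", "steam"},
    {"on-boiler", "steam"},
    {"ok-boiler", "steam"},
    {"steam", "coffee", "hot-drink"},
    {"steam", "tea", "hot-drink"},
    {"coffee", "tea"},
]

def symbol_graph(axioms):
    """Return an adjacency map: symbol -> symbols that share an axiom with it."""
    graph = {}
    for syms in axioms:
        for s in syms:
            graph.setdefault(s, set())
        # Every pair of symbols in the same axiom gets an edge.
        for a, b in combinations(sorted(syms), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

GRAPH = symbol_graph(AXIOM_SYMBOLS)
print(sorted(GRAPH["water"]))
```

The neighbours of `water` span both the pump symbols and the boiler symbols, which is what makes it a natural link between two clusters.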
Automatic partitioning
Step 2: construct tree decomposition
  – Each node in the tree decomposition corresponds to a tightly connected cluster of symbols: a partition
  – [Amir 2001] gives an algorithm that approximates the optimal decomposition within a factor of O(log t)
[Figure: tree decomposition of the symbol graph into three clusters, {ok-pump, on-pump, man-fill, water}, {water, ok-boiler, on-boiler, steam}, and {steam, coffee, tea, hot-drink}, joined through the shared symbols water and steam]
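As a rough illustration of how clusters can be obtained from a symbol graph, the sketch below uses the plain greedy min-fill elimination heuristic. It is not the [Amir 2001] algorithm and not the adaptation used in this work; `graph_from_cliques` and `min_fill_bags` are hypothetical names, and the sketch returns only the clusters (bags), not the tree edges between them.

```python
from itertools import combinations

def graph_from_cliques(cliques):
    """Build an adjacency map in which each listed group of symbols is a clique."""
    graph = {}
    for clique in cliques:
        for s in clique:
            graph.setdefault(s, set()).update(set(clique) - {s})
    return graph

def min_fill_bags(graph):
    """Greedy min-fill elimination; returns the maximal bags (clusters)."""
    adj = {v: set(ns) for v, ns in graph.items()}  # copy, because we mutate it
    bags = []
    while adj:
        def fill_in(v):
            # Number of edges that eliminating v would add among its neighbours.
            return sum(1 for a, b in combinations(sorted(adj[v]), 2)
                       if b not in adj[a])
        v = min(sorted(adj), key=fill_in)  # sorted() makes tie-breaking deterministic
        nbrs = adj.pop(v)
        bags.append(frozenset(nbrs | {v}))
        for a, b in combinations(sorted(nbrs), 2):  # turn the neighbours into a clique
            adj[a].add(b)
            adj[b].add(a)
        for n in nbrs:
            adj[n].discard(v)
    # Keep only bags that are not strictly contained in another bag.
    return {b for b in bags if not any(b < other for other in bags)}

# Symbol graph of the espresso machine theory, one clique per axiom.
ESPRESSO = graph_from_cliques([
    ("ok-pump", "on-pump", "water"), ("man-fill", "water"), ("man-fill", "on-pump"),
    ("water", "ok-boiler", "on-boiler", "steam"),
    ("steam", "coffee", "hot-drink"), ("steam", "tea", "hot-drink"), ("coffee", "tea"),
])
BAGS = min_fill_bags(ESPRESSO)
for bag in sorted(BAGS, key=sorted):
    print(sorted(bag))
```

On this graph the heuristic recovers the boiler and drink clusters and splits the pump symbols into two overlapping bags, so its output is somewhat finer than the three-partition decomposition in the figure.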
Automatic partitioning
Step 3: generate partition graph
  – Allocate axioms to partitions according to vocabulary
  – “Link languages” are defined by shared vocabularies
  – Efficient reasoning depends on keeping link vocabularies small
[Figure: partition graph. Partition {ok-pump, on-pump, man-fill, water} holds axioms (1)–(4); partition {water, ok-boiler, on-boiler, steam} holds axioms (5)–(8); partition {steam, coffee, tea, hot-drink} holds axioms (9)–(11). The link languages are {water} and {steam}.]
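A minimal sketch of Step 3, assuming that an axiom is allocated to every partition whose vocabulary covers all of its symbols and that a link language is the intersection of two neighbouring vocabularies. The partition names (`pump`, `boiler`, `drink`) and the functions `allocate` and `link_languages` are hypothetical, introduced only for this example.

```python
# Partition vocabularies and tree edges, as in the espresso machine example.
PARTITIONS = {
    "pump": {"ok-pump", "on-pump", "man-fill", "water"},
    "boiler": {"water", "ok-boiler", "on-boiler", "steam"},
    "drink": {"steam", "coffee", "tea", "hot-drink"},
}
EDGES = [("pump", "boiler"), ("boiler", "drink")]

# Axiom number -> symbols mentioned by that axiom.
AXIOM_SYMBOLS = {
    1: {"ok-pump", "on-pump", "water"},
    2: {"man-fill", "water"},
    3: {"man-fill", "on-pump"},
    4: {"man-fill", "on-pump"},
    5: {"water", "ok-boiler", "on-boiler", "steam"},
    6: {"water", "steam"},
    7: {"on-boiler", "steam"},
    8: {"ok-boiler", "steam"},
    9: {"steam", "coffee", "hot-drink"},
    10: {"steam", "tea", "hot-drink"},
    11: {"coffee", "tea"},
}

def allocate(axioms, partitions):
    """Give each partition the axioms whose symbols all lie in its vocabulary."""
    return {name: sorted(n for n, syms in axioms.items() if syms <= vocab)
            for name, vocab in partitions.items()}

def link_languages(partitions, edges):
    """The link language of an edge is the vocabulary shared by its endpoints."""
    return {(a, b): partitions[a] & partitions[b] for a, b in edges}

ALLOCATION = allocate(AXIOM_SYMBOLS, PARTITIONS)
LINKS = link_languages(PARTITIONS, EDGES)
print(ALLOCATION)
print(LINKS)
```

Under this rule an axiom could land in several partitions when vocabularies overlap heavily; in the espresso example each axiom fits exactly one.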
Reasoning with MP
MP algorithm [Amir & McIlraith 2000]
• Start with a tree-structured partition graph
• Identify the goal partition (based on matching vocabulary)
• Direct edges toward the goal (fixing the outbound link language Li for each partition)
• Concurrently, in each partition:
  – Generate consequences in Li
  – Pass messages in Li toward the goal
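The steps above can be sketched for the propositional case as follows. This is a simplified, sequential simulation rather than the authors' implementation: each partition is saturated by unrestricted resolution, partitions are visited from the one furthest from the goal inward, and clauses lying entirely in the outbound link language are copied to the next partition. Literals are strings with a leading `-` for negation; the clause signs follow the standard clausal reading of the espresso machine theory, which is an assumption here, and all function names are hypothetical.

```python
from itertools import combinations

def neg(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def vocab(clauses):
    """Set of symbols occurring in a collection of clauses."""
    return {lit.lstrip("-") for clause in clauses for lit in clause}

def resolvents(c1, c2):
    """All non-tautological binary resolvents of two clauses."""
    out = set()
    for lit in c1:
        if neg(lit) in c2:
            r = (c1 - {lit}) | (c2 - {neg(lit)})
            if not any(neg(x) in r for x in r):
                out.add(frozenset(r))
    return out

def saturate(clauses):
    """Close a clause set under resolution (exponential in the worst case)."""
    clauses = set(clauses)
    while True:
        new = set()
        for a, b in combinations(clauses, 2):
            new |= resolvents(a, b)
        if new <= clauses:
            return clauses
        clauses |= new

def mp_proves(partitions, parent, order):
    """partitions: name -> clauses; parent: name -> next partition toward the
    goal (the goal has no entry); order: names, furthest from the goal first."""
    kb = {name: set(cs) for name, cs in partitions.items()}
    for name in order:
        kb[name] = saturate(kb[name])
        if name not in parent:                      # reached the goal partition
            return frozenset() in kb[name]          # empty clause = refutation
        link = vocab(kb[name]) & vocab(kb[parent[name]])
        # Pass along only the consequences expressed in the link language.
        kb[parent[name]] |= {c for c in kb[name] if vocab([c]) <= link}
    return False

def clauses(*texts):
    return {frozenset(t.split()) for t in texts}

ESPRESSO = {
    "pump": clauses("-ok-pump -on-pump water", "-man-fill water",
                    "-man-fill -on-pump", "man-fill on-pump"),
    "boiler": clauses("-water -ok-boiler -on-boiler steam", "water -steam",
                      "on-boiler -steam", "ok-boiler -steam"),
    "drink": clauses("-steam -coffee hot-drink", "-steam -tea hot-drink",
                     "coffee tea"),
}
# Negated query "ok-pump and ok-boiler and on-boiler implies hot-drink",
# with each clause placed in the partition that owns its symbol.
QUERY = {"pump": clauses("ok-pump"),
         "boiler": clauses("ok-boiler", "on-boiler"),
         "drink": clauses("-hot-drink")}
PARENT = {"pump": "boiler", "boiler": "drink"}
ORDER = ["pump", "boiler", "drink"]
WITH_QUERY = {name: ESPRESSO[name] | QUERY[name] for name in ESPRESSO}
print(mp_proves(WITH_QUERY, PARENT, ORDER))
```

Only the unit clauses over `water` and `steam` cross partition boundaries in this run, which is the sense in which small link languages keep between-partition deduction cheap.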
MP in action
A simple propositional theory
Query: Q ∧ U ∧ V ⇒ Z ?
Theory vocabulary: {Q R S T U V W X Y Z}
Partition 1 {Q R S T}, Partition 2 {T U V W}, Partition 3 {W X Y Z}; link languages {T} and {W}

Partition 1:
(1) ¬Q ∨ ¬R ∨ T
(2) ¬S ∨ T
(3) ¬S ∨ ¬R
(4) S ∨ R
(12) Q
(16) ¬R ∨ T
(17) S ∨ T
(18) T

Partition 2, after receiving (18) T:
(5) ¬T ∨ ¬U ∨ ¬V ∨ W
(6) T ∨ ¬W
(7) U ∨ ¬W
(8) V ∨ ¬W
(13) U
(14) V
(19) ¬U ∨ ¬V ∨ W
(20) ¬V ∨ W
(21) W

Partition 3, after receiving (21) W:
(9) ¬W ∨ ¬X ∨ Z
(10) X ∨ Y
(11) ¬W ∨ ¬Y ∨ Z
(15) ¬Z
(22) ¬W ∨ Y ∨ Z
(23) ¬W ∨ Z
(24) Z
(25) □

Using partitioning, this query took just 10 resolution steps.
Using set-of-support, the same query can take 28 steps.
Characteristics of MP
• Reasoning is performed locally in each partition
• Relevant results propagate toward the goal partition
• Globally sound & complete, provided each local reasoner is sound & complete for Li-consequence finding [Amir & McIlraith 2000]
• Performance is worst-case exponential within partitions, but linear in tree structure
Callouts:
  – Minimizes between-partition deduction
  – Focuses within-partition deduction
  – Supports parallel processing
  – Different reasoners in different partitions
Experimental Evaluation of MP
• Do “real world” KBs exhibit inherent structure?
  – Do they have good tree decompositions (partition graphs)?
  – Can partition-based reasoning outperform other strategies?
• Experimental testbed
  – Theorem prover: SNARK
  – KB: Cyc
    – A subset on spatial relationships, ~750 axioms, ~150 symbols
    – We’re working on adding SUMO, others
  – Queries from outside source
  – Number of resolution steps used as chief performance metric
  – Normalized to number of steps required using no strategy
8/14/03Bill MacCartney, Stanford KSL 15
Comparison to conventional strategies
• Restriction strategies focus proof search
  – Disallow some resolution steps to speed search
  – Completeness issues are critical
• Set-of-support restriction
  – Place the negated query into a designated “set of support”
  – Allow only resolutions involving a clause from the set of support
  – Add newly derived clauses to the set of support
• Ordering restriction
  – Define a global ordering among predicates
  – Resolve on predicates in order from greatest to least
  – (SNARK provides a default ordering, which is arbitrary)
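The set-of-support restriction can be illustrated with a small propositional refutation loop. This is a sketch under stated assumptions, not SNARK's implementation: `sos_refute` is a hypothetical name, clauses are frozensets of string literals with `-` for negation, the clause signs are the assumed clausal reading of the espresso machine theory, and the step count is simply the number of new resolvents generated, which depends on iteration order.

```python
def neg(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def resolvents(c1, c2):
    """All non-tautological binary resolvents of two clauses."""
    out = set()
    for lit in c1:
        if neg(lit) in c2:
            r = (c1 - {lit}) | (c2 - {neg(lit)})
            if not any(neg(x) in r for x in r):
                out.add(frozenset(r))
    return out

def sos_refute(axioms, negated_query):
    """Return the number of new resolvents generated before the empty clause
    appears, or None if the supported search saturates without finding it."""
    axioms = set(axioms)
    support = set(negated_query)       # the set of support
    agenda = list(negated_query)       # supported clauses not yet processed
    steps = 0
    while agenda:
        given = agenda.pop(0)
        # Every resolution involves `given`, which is in the set of support.
        for other in list(axioms | support):
            for r in resolvents(given, other):
                if r in axioms or r in support:
                    continue
                steps += 1
                if not r:              # empty clause: refutation found
                    return steps
                support.add(r)         # newly derived clauses join the support
                agenda.append(r)
    return None

def clauses(*texts):
    return {frozenset(t.split()) for t in texts}

AXIOMS = clauses(
    "-ok-pump -on-pump water", "-man-fill water", "-man-fill -on-pump",
    "man-fill on-pump", "-water -ok-boiler -on-boiler steam", "water -steam",
    "on-boiler -steam", "ok-boiler -steam", "-steam -coffee hot-drink",
    "-steam -tea hot-drink", "coffee tea",
)
NEGATED_QUERY = clauses("ok-pump", "ok-boiler", "on-boiler", "-hot-drink")
print(sos_refute(AXIOMS, NEGATED_QUERY) is not None)
```

No resolution between two axioms is ever attempted, which is what keeps the search tied to the query.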
8/14/03Bill MacCartney, Stanford KSL 16
Experimental results: “vanilla” MP
[Bar chart: resolution steps as a percentage of the steps needed with no strategy (0% to 100%), for MP, set-of-support, and ordering, on queries q1, v5, p5, p7, vn2, p1, v10, p4, p3]
8/14/03Bill MacCartney, Stanford KSL 17
Outline
• Background: partition-based reasoning Algorithms for automatic partitioning of large KBs The MP algorithm for reasoning with partitions
• Experimental evaluation of MP
• Partition-derived ordering (PDO) Automatic alternative to hand-crafted symbol orderings
• MP with focused support (MFS) Enhancing vanilla MP with a smart within-partition strategy
• Combinations of strategies Can outperform set-of-support by 10x or more
8/14/03Bill MacCartney, Stanford KSL 18
Motivation for PDO
• Ordered resolution can be highly efficient
• Voronkov: best modern resolution provers use ordering to reduce search space
• But success depends on having the right ordering
• Until now, successful orderings have been
  – Laboriously hand-crafted
  – Tailored to a specific KB
  – Poorly understood
• Insight: partitioning can induce a good ordering
8/14/03Bill MacCartney, Stanford KSL 19
How PDO works
• Generate a partition-derived ordering
  1. Direct edges of partition graph toward goal partition
  2. Perform topological sort on partitions
  3. Beginning with partitions furthest from goal, progressively append symbols from each partition to ordering
• Use result as input for ordered resolution
  – (Partition graph can now be discarded)
  – Sound & complete
• PDO roughly simulates MP
[Figure: partition graph with numbered nodes]
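The three steps for generating the ordering can be sketched as below, assuming the partition graph is a connected tree. Distance from the goal (computed by breadth-first search) stands in for the topological sort, since in a tree with edges directed toward the goal any order by decreasing distance is a valid topological order. The function name, the partition names, and the alphabetical tie-breaking are assumptions made for this example.

```python
from collections import deque

def partition_derived_ordering(vocab, edges, goal):
    """vocab: partition name -> set of symbols; edges: undirected tree edges.
    Returns the symbols, starting with the partitions furthest from the goal."""
    nbrs = {p: set() for p in vocab}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    # Breadth-first search from the goal: distance tells us how far each
    # partition is, which implicitly directs every tree edge toward the goal.
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        p = queue.popleft()
        for q in sorted(nbrs[p]):
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    ordering = []
    # Furthest partitions first; ties broken alphabetically for determinism.
    for p in sorted(vocab, key=lambda name: (-dist[name], name)):
        for s in sorted(vocab[p]):
            if s not in ordering:      # shared symbols keep their first position
                ordering.append(s)
    return ordering

PARTITIONS = {
    "pump": {"ok-pump", "on-pump", "man-fill", "water"},
    "boiler": {"water", "ok-boiler", "on-boiler", "steam"},
    "drink": {"steam", "coffee", "tea", "hot-drink"},
}
EDGES = [("pump", "boiler"), ("boiler", "drink")]
ORDERING = partition_derived_ordering(PARTITIONS, EDGES, goal="drink")
print(ORDERING)
```

With `drink` as the goal partition, the pump symbols come first and the drink symbols last, mirroring the direction in which MP would pass messages.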
Experimental results: PDO
[Bar chart: resolution steps as a percentage of the steps needed with no strategy (0% to 100%), for MP, set-of-support, ordering, and PDO, on queries q1, v5, p5, p7, vn2, p1, v10, p4, p3]
MP with focused support (MFS)
• Motivating intuition
  – Only results in the outbound link vocabulary can be propagated
  – So, focus within-partition reasoning on generating such results
• The “focused support” restriction
  – Initialize set S to contain any clause in the partition that includes a symbol in the outbound link language
  – Resolve two clauses only if one is in S and the resolved predicate is not in the outbound link language
  – Add the resolvent to S
• MFS is globally sound & complete [see paper for proof]
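A propositional sketch of the focused support restriction within a single partition follows. It is an illustration only: `focused_support` is a hypothetical name, clauses are frozensets of string literals with `-` for negation, and the pump partition's clause signs are the assumed clausal reading of the espresso machine theory. The function returns the derived clauses that lie entirely in the outbound link language, i.e. the messages the partition could send.

```python
def neg(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def sym(lit):
    return lit.lstrip("-")

def focused_support(partition, link):
    """Saturate one partition under the focused support restriction and
    return the clauses expressed purely in the outbound link language."""
    partition = set(partition)
    # S starts with every clause that mentions an outbound link symbol.
    support = {c for c in partition if any(sym(l) in link for l in c)}
    agenda = list(support)
    while agenda:
        given = agenda.pop()
        for other in list(partition | support):
            for lit in given:
                # Never resolve on a link symbol; need a complementary literal.
                if sym(lit) in link or neg(lit) not in other:
                    continue
                r = frozenset((given - {lit}) | (other - {neg(lit)}))
                if any(neg(x) in r for x in r) or r in support:
                    continue           # skip tautologies and known clauses
                support.add(r)         # the resolvent joins S
                agenda.append(r)
    return {c for c in support if all(sym(l) in link for l in c)}

def clauses(*texts):
    return {frozenset(t.split()) for t in texts}

# Pump partition of the espresso machine theory plus the query fact ok-pump;
# its outbound link language is {water}.
PUMP = clauses("-ok-pump -on-pump water", "-man-fill water",
               "-man-fill -on-pump", "man-fill on-pump", "ok-pump")
MESSAGES = focused_support(PUMP, {"water"})
print(MESSAGES)
```

Because every clause in S carries a link symbol that is never resolved away, work is spent only on clauses that can eventually shrink to a message.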
Experimental results: MFS
[Bar chart: resolution steps as a percentage of the steps needed with no strategy (0% to 100%), for MP, set-of-support, PDO, and MFS, on queries q1, v5, p5, p7, vn2, p1, v10, p4, p3]
Strategy combinations
• Combine MP, PDO, or MFS with set-of-support
  – Maintain a set of support at global level
  – Allow resolution between two clauses only if they are in the same partition and at least one of them is in the support
• Completeness
  – These combinations are in general not complete
  – Incompleteness sometimes revealed in practice
• Performance
  – However, combinations outperform any single strategy
Experimental results: strategy combos
[Bar chart: resolution steps as a percentage of the steps needed with no strategy (0% to 100%), for set-of-support, MP + SOS, PDO + SOS, and MFS + SOS, on queries q1, v5, p5, p7, vn2, p1, v10, p4, p3]
Experimental results: strategy combos
(same results, re-normalized vs. set-of-support)
[Bar chart: resolution steps as a percentage of the steps needed with set-of-support (0% to 100%), for set-of-support, MP + SOS, PDO + SOS, and MFS + SOS, on queries q1, v5, p5, p7, vn2, p1, v10, p4, p3]
Conclusions and Future Work
• Partitioning can speed up reasoning
  – Exploits implicit structure of large commonsense KBs
  – Reasoning becomes significantly more focused and efficient
  – MFS does even better by focusing reasoning within partitions
• Partition-derived ordering is surprisingly effective
  – Especially when combined with set-of-support
  – Automatic alternative to hand-crafted orderings
• Future work
  – Greater diversity of experimental results
    – Obstacle: scarcity of large KBs usable with generic FOL prover
  – Assessing the potential benefit of parallelization
References
Web: www.ksl.stanford.edu/projects/RKF/Partitioning/
Papers
• MacCartney, B., McIlraith, S., Amir, E. and Uribe, T., “Practical Partition-Based Theorem Proving for Large Knowledge Bases,” 18th International Joint Conference on Artificial Intelligence (IJCAI-03), 2003.
• Amir, E. and McIlraith, S., “Partition-Based Logical Reasoning for First-Order and Propositional Theories,” accepted for publication in Artificial Intelligence.
• McIlraith, S. and Amir, E., “Theorem Proving with Structured Theories,” 17th International Joint Conference on Artificial Intelligence (IJCAI-01), 2001.
• Amir, E., “Efficient Approximation for Triangulation of Minimum Treewidth,” 17th Conference on Uncertainty in Artificial Intelligence (UAI ’01), 2001.
• Amir, E. and McIlraith, S., “Solving Satisfiability using Decomposition and the Most Constrained Subproblem,” Proceedings of SAT 2001, 2001.
• Amir, E. and McIlraith, S., “Partition-Based Logical Reasoning,” 7th International Conference on Principles of Knowledge Representation and Reasoning (KR ’2000), 2000.
Thanks!
Results: automatic partitioning
• Partition graph is largely independent of query
  – But edges may need to be redirected
• We’re experimenting with multiple algorithms

                                 Alg 5   Alg 6
Number of partitions               124      40
Max symbols/partition               16      19
Max symbols/link                    14      17
Max axioms/partition                80      95
Max partitions/axiom                25      28
Axioms in multiple partitions      152     152
Queries
hd-q1 If the pump is OK and the boiler is OK and the boiler is on, do we get a hot drink?
cyc-p5 If A and B are inside C, can C be inside A?
cyc-p7 If A and B are part of C and C is at D, where is A?
cyc-p1 Suppose that A is touching B and B is inside C and C is at D. Is A at D?
cyc-v5 A has parts B, C, and D. B has parts E and F. Is F near A?
cyc-p3 If C is between A and B, and both A and B are inside D, and D is at E, is C at E?
cyc-p4 If C is between A and B, and both A and B are at D, is C also at D?
MP in action
Query: If the pump is OK and the boiler is OK and the boiler is on, do we get a hot drink?

Partition 1 (outbound link language {water}):
(1) ¬ok-pump ∨ ¬on-pump ∨ water
(2) ¬man-fill ∨ water
(3) ¬man-fill ∨ ¬on-pump
(4) man-fill ∨ on-pump
(12) ok-pump
(16) ¬on-pump ∨ water
(17) man-fill ∨ water
(18) water

Partition 2 (outbound link language {steam}), after receiving (18) water:
(5) ¬water ∨ ¬ok-boiler ∨ ¬on-boiler ∨ steam
(6) water ∨ ¬steam
(7) on-boiler ∨ ¬steam
(8) ok-boiler ∨ ¬steam
(13) ok-boiler
(14) on-boiler
(19) ¬ok-boiler ∨ ¬on-boiler ∨ steam
(20) steam

Partition 3 (goal partition), after receiving (20) steam:
(9) ¬steam ∨ ¬coffee ∨ hot-drink
(10) ¬steam ∨ ¬tea ∨ hot-drink
(11) coffee ∨ tea
(15) ¬hot-drink
(21) ¬steam ∨ tea ∨ hot-drink
(22) ¬steam ∨ hot-drink
(23) hot-drink
(24) □

Using set-of-support, SNARK took 28 steps to prove this.
Using partitioning, SNARK took just 11 steps.
8/14/03Bill MacCartney, Stanford KSL 36
Ongoing research
• Testing on more KBs
  – Finding good test data is a real challenge
• Characterizing the queries for which MP and its extensions work especially well
• Assessing the potential benefit of parallelization
  – Current implementation is serial
  – But reasoning within partitions can happen concurrently
• Distributed implementations
  – Demonstrating integration of heterogeneous reasoners
8/14/03Bill MacCartney, Stanford KSL 37
Recap: automatic partitioning
• Begin with a KB in PL or FOL
• Construct symbol graph
  – Edges join symbols that appear together in an axiom
• Apply tree decomposition algorithm
  – We use an adaptation of min-fill
• Partition axioms correspondingly
  – Each partition has its own vocabulary
  – “Link languages” are defined by shared vocabulary
Efficient reasoning depends on keeping partition sizes and link sizes small