The TM System for Repairing Non-Theorems
Alison Pease – University of Edinburgh
Simon Colton – Imperial College, London
Infeasible but Illustrative…
• Child interested in ATP
  – Says: “All prime numbers are odd”
  – ATP replies: “No. Go away.”
• Much more intelligent to say:
  – “You’re not quite right. In fact, all primes except two are odd”
Possibly More Feasible
• 1st year maths student:
  – “All groups are Abelian”
• ATP:
  – “No”
• Model Generator/Constraint Solver:
  – “Here’s a non-Abelian group, you idiot”
• Clever reasoning system:
  – “No, but self inverse groups are Abelian and have you looked at cyclic groups?”
Inspiration from Imre Lakatos
• Philosophy of maths
  – Fallibilistic approach, theory is fluid
• Important book:
  – Proofs and Refutations
• Two strands
  – Methods for dealing with counterexamples
  – Social aspect to theory formation process
• Running example
  – Euler’s theorem
Motivations for TM Project
1. Implement
  • Lakatos’s philosophy of maths
2. Integrate
  • Reasoning systems
3. Improve ATP systems
  • To be more robust/flexible
  • Enable more organic growing of theories
4. Show the HR system working in ATP
  • Shown effective for ML and CSPs
  • Theorem proving seen as a starting point for a discovery session
PhD Project of Alison Pease
• Aims to (and achieves)
  – The automation of
    • Lakatos’s methods for handling counterexamples
    • The social aspect of theory formation
• Perspectives:
  – Computational philosophy
  – Scientific discovery
  – Improvement of AI techniques
    • Automated Theory Formation, Automated Reasoning
Spin-off from Alison’s Work: The TM System
• A system for handling non-theorems
  – By modifying them into theorems
• Using methods inspired by Lakatos
  – Less interested here in the social aspect
• TM is a wrapper for 3rd party software
  – Otter (ATP), MACE (Model generator)
  – HR (Machine learning – see later)
• Used so far only in algebraic domains
Some of Lakatos’s Methods
• Counterexample barring:
  – Alter conjecture to explicitly exclude each counterexample
    • Primes except 2 are odd
• Piecemeal exclusion:
  – Exclude an entire class of examples
    • Primes except powers of 2 are odd
• Strategic withdrawal:
  – Specialise to a subset with no counters
    • Mersenne primes are odd
• See paper for formal description of these
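The three methods can be tried out on the slide's own running conjecture, "all primes are odd". The following is only a minimal sketch, assuming a finite search range (`LIMIT`) and hypothetical helper names (`is_power_of_two`, `is_mersenne`); it checks the modified conjectures empirically and is not how TM itself operates.

```python
def is_prime(n):
    """Trial-division primality test, adequate for a small range."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_odd(n):
    return n % 2 == 1

def is_power_of_two(n):
    return n >= 1 and n & (n - 1) == 0

def is_mersenne(n):
    # A Mersenne number has the form 2^k - 1.
    return is_power_of_two(n + 1)

LIMIT = 200  # assumed finite range for the empirical check
primes = [n for n in range(2, LIMIT) if is_prime(n)]

# Original conjecture: "all primes are odd" has exactly one counterexample here.
counterexamples = [p for p in primes if not is_odd(p)]

# Counterexample barring: exclude each counterexample explicitly.
barring_holds = all(is_odd(p) for p in primes if p not in counterexamples)

# Piecemeal exclusion: exclude a whole class (powers of 2) containing the counterexample.
piecemeal_holds = all(is_odd(p) for p in primes if not is_power_of_two(p))

# Strategic withdrawal: retreat to a subset (Mersenne primes) with no counterexamples.
mersenne_primes = [p for p in primes if is_mersenne(p)]
withdrawal_holds = all(is_odd(p) for p in mersenne_primes)

print(counterexamples, barring_holds, piecemeal_holds, withdrawal_holds)
```

Each modified conjecture holds on the range tested; a real system would still need a proof, which is the role the slides assign to Otter.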
The TM System: Input & Output
• Input
  – Conjectures of the form A → C
    • A are axioms, C is conjecture statement
  – Given in Otter format
    • Axioms first, last line is conjecture
• Output
  – Proof of the original if true, or:
  – Modified theorems
    • Of the form (A ∧ M) → C
    • Which are proved and probably not obvious
Our Inspiring Example
• Input non-theorem
    all a b c ((a*b)*c = a*(b*c)).
    exists id (all a (a*id = id*a = a)).
    all a exists b (a * b = b * a = id).
    -(all a b (a * b = b * a)).
• Output modified theorem
    all a b c ((a*b)*c = a*(b*c)).
    exists id (all a (a*id = id*a = a)).
    all a exists b (a * b = b * a = id).
    all a (a * a = id).
    -(all a b (a * b = b * a)).
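To make the example concrete, the added axiom `all a (a * a = id)` can be checked against small Cayley tables. This is a sketch under stated assumptions: the tables and helper names (`is_abelian`, `is_self_inverse`) are illustrative and have nothing to do with Otter's actual proof search.

```python
from itertools import permutations

def is_abelian(table):
    """True if the Cayley table is symmetric, i.e. a*b = b*a for all a, b."""
    n = len(table)
    return all(table[a][b] == table[b][a] for a in range(n) for b in range(n))

def is_self_inverse(table, identity):
    """True if a*a = id for every element a."""
    return all(table[a][a] == identity for a in range(len(table)))

# Klein four-group: elements 0..3 under bitwise XOR, identity 0.
klein = [[a ^ b for b in range(4)] for a in range(4)]

# Symmetric group S3: permutations of (0, 1, 2) under composition.
perms = list(permutations(range(3)))
index = {p: i for i, p in enumerate(perms)}
s3 = [[index[tuple(p[q[i]] for i in range(3))] for q in perms] for p in perms]
s3_identity = index[(0, 1, 2)]

# The Klein four-group satisfies the extra axiom and is Abelian.
print(is_self_inverse(klein, 0), is_abelian(klein))
# S3 falsifies the original conjecture, and it also fails the extra axiom.
print(is_self_inverse(s3, s3_identity), is_abelian(s3))
```

S3 is a counterexample to "all groups are Abelian", and the specialisation to self-inverse groups excludes it.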
The TM System: Overview
• Five stages:
  – Preliminary checks
    • Using Otter
  – Forming supporting and falsifying models
    • Using MACE
  – Forming a theory
    • Using HR
  – Extracting modifications and proving them
    • TM does this using Otter
  – Flagging possibly obvious modifications
    • TM does this
Stage 1: Preliminary Checks
• Otter is used to attempt to prove
  – (i) A → C
    • No modification required
  – (ii) A → ¬C
    • No specialisation will help
  – (iii) A (Triv C)
    • True only for trivial algebras
  – (iv) A (¬Triv C)
    • True only for non-trivial algebras
• Last two are inspired by Lakatos’s counterexample barring methods
Stage 2: Model Generation
• Falsifying examples generated
  – MACE given Axioms + ¬C
• Supporting examples generated
  – MACE given Axioms + C
• 10 seconds allowed
  – For each size 1 to 8
  – User can alter these settings
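The idea behind this stage can be illustrated with a brute-force finite model search. The sketch below is not MACE and does not use its interface. It assumes a toy problem (axiom: associativity; conjecture: commutativity), enumerates every labelled operation table up to a small size, and has no time limit. The names `all_tables` and `find_models` are hypothetical.

```python
from itertools import product

def all_tables(n):
    """Yield every binary operation table on the elements 0..n-1."""
    for flat in product(range(n), repeat=n * n):
        yield [list(flat[i * n:(i + 1) * n]) for i in range(n)]

def associative(t):
    n = len(t)
    return all(t[t[a][b]][c] == t[a][t[b][c]]
               for a in range(n) for b in range(n) for c in range(n))

def commutative(t):
    n = len(t)
    return all(t[a][b] == t[b][a] for a in range(n) for b in range(n))

def find_models(axioms, conjecture, max_size):
    """Split the models of the axioms into those satisfying C and those satisfying not-C."""
    supporting, falsifying = [], []
    for n in range(1, max_size + 1):
        for t in all_tables(n):
            if all(axiom(t) for axiom in axioms):
                (supporting if conjecture(t) else falsifying).append(t)
    return supporting, falsifying

supporting, falsifying = find_models([associative], commutative, max_size=2)
print(len(supporting), "supporting,", len(falsifying), "falsifying")
print(falsifying)
```

Exhaustive enumeration grows as n^(n²), so it is only viable for very small sizes; a dedicated model generator is needed for the sizes 1 to 8 mentioned above.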
Stage 3: Theory Formation
• Forming specialisations
  – Done inductively (not math. induction)
• Predictive learning task
  – Positive and negative examples of a concept
    • Learn a definition for positives
  – We want plenty of answers
    • So we want a descriptive rather than predictive system
• TM uses the HR program to form a theory
  – Using concepts from axioms as background
  – And examples as objects of interest
The HR Program (since 1997)
• Descriptive induction program
  – Works mostly in mathematics domain
  – Also, bioinformatics, vision, music recently
• Forms a scientific theory
  – Given a small amount of knowledge
  – E.g., how to divide two numbers, ring axioms
• Theories contain
  – Examples, concepts, conjectures, proofs
• Main features:
  – Production rules, measures of interestingness
  – Empirical conjecture making
  – Using reasoning programs (Otter, MACE, …)
Stage 4: Formation of Modifications
• HR’s theories contain many specialisations
  – E.g., self-inverse, idempotent, Abelian
• Some specialisations are true of
  – A subset of the supporting examples
    • But no falsifying examples
  – Such specialisations are added as an axiom
    • (Axioms + Specialisation) → Conjecture
  – This is strategic withdrawal
    • And also piecemeal exclusion (HR’s negate rule)
• Otter is used to prove each modified theorem
  – Time allowed varied by the user
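The selection rule on this slide (true of some supporting examples, true of no falsifying example) can be sketched as a simple filter. Everything here is an illustrative assumption: the example groups, the three hand-written concepts, and the name `candidates` stand in for what MACE and HR would actually supply.

```python
from itertools import permutations

def cyclic(n):
    """Cayley table of the cyclic group Z_n under addition mod n."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]

def identity(table):
    n = len(table)
    return next(e for e in range(n)
                if all(table[e][a] == a == table[a][e] for a in range(n)))

# Supporting examples for "all groups are Abelian": some Abelian groups.
klein = [[a ^ b for b in range(4)] for a in range(4)]
supporting = [cyclic(1), cyclic(2), cyclic(3), klein]

# Falsifying example: the non-Abelian group S3.
perms = list(permutations(range(3)))
index = {p: i for i, p in enumerate(perms)}
s3 = [[index[tuple(p[q[i]] for i in range(3))] for q in perms] for p in perms]
falsifying = [s3]

# Hand-written stand-ins for concepts a theory formation program might invent.
concepts = {
    "self_inverse": lambda t: all(t[a][a] == identity(t) for a in range(len(t))),
    "idempotent": lambda t: all(t[a][a] == a for a in range(len(t))),
    "even_order": lambda t: len(t) % 2 == 0,
}

# Keep a concept if it covers some supporting example and no falsifying one.
candidates = [name for name, holds in concepts.items()
              if any(holds(t) for t in supporting)
              and not any(holds(t) for t in falsifying)]
print(candidates)
```

A surviving concept is only empirically plausible, since it was tested on a handful of examples; this is why each modified theorem is then passed to Otter. Note that `idempotent` survives the filter here although it holds only for the trivial group, which is the kind of result Stage 5 is meant to flag.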
Stage 5: Identifying Potentially Dull Results
• It’s quite easy to modify a theorem
  – And make it trivially true
• Case 1:
  – Specialisation is the trivial algebra
    • E.g., all trivial algebras are Abelian
    • So TM checks whether A (M Triv)
  – Flags these as probably uninteresting
• Case 2:
  – Concept is re-definition of conjecture
    • E.g., all Abelian groups are Abelian
    • So TM checks whether
      – (a) M C (b) M C (c) A (M C)
  – Flags these as probably uninteresting
Experiments
• Difficult to get hold of suitable non-theorems
  – In the form A → C
• TPTP library
  – Most non-theorems are satisfiable axioms
  – Others show that 2 sets of axioms are not equivalent
  – Looked in GRP, RNG, FLD, COL
• Found only 9 suitable examples
  – Please add your non-theorems to the library!!!
• Also produced 89 artificial non-theorems
  – By taking TPTP theorems and changing:
    • Axioms, variables, quantifiers, bracketing
Experimental Setup(s)
• Otter
  – 10 seconds on every run
• MACE
  – 10 seconds on every run, size 1 to 8
• Preliminary tests showed that
  – Altering Otter and MACE settings had little effect
• HR settings altered
  – Theory formation steps (1000 & 3000)
  – Allowed to use equivalence conjs (& not)
Some Artificial Examples
• Derived from GRP001:
  – Self inverse groups are Abelian
    • Removed inverse and associative axioms
  – HR re-invented Abelian and TM discarded it
• Derived from GRP011-4
  – Left cancellation law
    • Identity and inverse axioms removed
  – Five cautioned modifications generated
    • Including one of the form M C
    • ∀ x y (x * (x * y) = y) implies left cancellation
  – True without mention of associativity (interesting)
Real Examples from TPTP
• TM successfully modified 7 of 9
  – 3 out of 5 in COL (new domain)
• Nice example
  – First non-theorem from GRP is GRP024-4
  – comm(x,y) = x*y*x⁻¹*y⁻¹
  – comm is associative iff all commutators are in the centre of the group
  – MACE found no counters
    • But found four groups supporting this
  – TM found that this is true for
    • Self inverse groups (∀ a (a * a = id))
My Favourite Example
• RNG031-6
• In rings, the following property holds:
    ∀ w x ((((w*w)*x)*(w*w)) = id)
  – Has some history: JAR paper about them
• MACE: 7 supporting, 6 falsifying
• HR: a single specialisation was a pos sub:
  – ¬(∃ b, c (b*b=c ∧ ¬(b+b=c)))
• Tidied up:
  – In Rings,
    • If (∀ b (b*b = b+b)) then (∀ w x ((((w*w)*x)*(w*w)) = id))
    • Nice symmetry to it
• Otter proved this
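The "tidied up" step rewrites HR's specialisation into a simpler equivalent condition. Reading the specialisation as "there are no b, c with b*b = c and b+b ≠ c" (an assumption about the symbols on the slide), the equivalence with "b*b = b+b for all b" can be sanity-checked by brute force over every pair of operation tables on a two-element set. The function names are hypothetical, and the tables are arbitrary operations rather than actual rings, since the equivalence is purely logical.

```python
from itertools import product

def tables(n):
    """Yield every binary operation table on the elements 0..n-1."""
    for flat in product(range(n), repeat=n * n):
        yield [list(flat[i * n:(i + 1) * n]) for i in range(n)]

def hr_form(add, mul):
    """No b, c exist with b*b = c and b+b != c."""
    n = len(add)
    return not any(mul[b][b] == c and add[b][b] != c
                   for b in range(n) for c in range(n))

def tidy_form(add, mul):
    """For all b, b*b = b+b."""
    return all(mul[b][b] == add[b][b] for b in range(len(add)))

# Compare the two formulations on all 16 x 16 pairs of tables of size 2.
agree = all(hr_form(add, mul) == tidy_form(add, mul)
            for add in tables(2) for mul in tables(2))
print(agree)
```

The check only confirms that the two forms agree on these small tables; the equivalence itself follows by pushing the negation through the quantifier.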
Conclusions & Further Work
• We have shown that ATP can be flexible
  – Required induction, deduction and calculation
    • Integration is so obviously the right direction for automated reasoning
  – Demonstrated the effectiveness of TM
    • On a set of problems from algebra
• Future work:
  – Possibly apply to verification tasks
  – “Crack open the conjectures”
    • E.g., alter the LHS or RHS to fix it
  – Use Progol rather than HR for discrimination
    • Will be quicker, but produce fewer modifications