Logic for Computer Sciencesak/courses/ilcs/ilcs.pp4.pdf · 2015-07-28 · 4.Objectivity in Logic 5.Formal Logic 6.Formal Logic: Applications 7.Form and Content 8.Facets of Mathematical

HOME PAGE JJ J I IILCS

GO BACK FULL SCREEN CLOSE 1 OF 788 QUIT

Logic for Computer Sciencehttp://www.cse.iitd.ac.in/ sak/courses/ilcs/2015-16.index.html

S. Arun-KumarDepartment of Computer Science and Engineering

I. I. T. Delhi, Hauz Khas, New Delhi 110 016.

July 28, 2015

http://www.cse.iitd.ac.in/~sak

http://www.cse.iitd.ac.in/~sak/courses/ilcs/2015-16.index.html



Contents

0 Background Preliminaries 25

0.1 Motivation for the Study of Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

0.2 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

0.3 Relations and Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

0.4 Operations on Binary Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

0.5 Ordering Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

0.6 Partial Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

0.7 Well orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

0.8 Infinite Sets: Countability and Uncountability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

0.9 Induction Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

0.10 Mathematical Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

0.11 Complete Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

0.12 Structural Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

0.13 Simultaneous Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

0.14 Well-ordered Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110




1 Lecture 1: Introduction 116

2 Lecture 2: Propositional Logic Syntax 131

2.1 Propositions in Natural Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

2.2 Translation of Natural Language Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

2.3 Associativity and Precedence of Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

2.4 Rooted Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

3 Lecture 3: Semantics of Propositional Logic 174

4 Lecture 4: Logical and Algebraic Concepts 187

5 Lecture 5: Identities and Normal Forms 198

6 Lecture 6: Tautology Checking 215

7 Lecture 7: Propositional Unsatisfiability 255

7.1 Space Complexity of Propositional Resolution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266

7.2 Time Complexity of Propositional Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

8 Lecture 8: Analytic Tableaux 276




9 Lecture 9: Consistency & Completeness 287

10 Lecture 10: The Compactness Theorem 301

11 Lecture 11: Maximally Consistent Sets 311

12 Lecture 12: Formal Theories 327

13 Lecture 13: Proof Theory: Hilbert-style 342

14 Lecture 14: Derived Rules 357

15 Lecture 15: The Hilbert System: Soundness 380

16 Lecture 16: The Hilbert System: Completeness 391

17 Lecture 17: Introduction to Predicate Logic 403

18 Lecture 18: The Semantics of Predicate Logic 426

19 Lecture 19: Substitutions 442

20 Lecture 20: Models, Satisfiability and Validity 457




20.1 Some Model Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474

21 Lecture 21: Structures and Substructures 477

22 Lecture 22: Predicate Logic: Proof Theory 494

23 Lecture 23: Predicate Logic: Proof Theory (Contd.) 516

24 Lecture 24: Existential Quantification 530

25 Lecture 25: Normal Forms 547

26 Lecture 26: Skolemization 568

27 Lecture 27: Substitutions and Instantiations 590

28 Lecture 28: Unification 602

29 Lecture 29: Resolution in FOL 635

30 Lecture 30: More on Resolution in FOL 648

31 Lecture 31: Resolution: Soundness and Completeness 660




32 Lecture 32: Resolution and Tableaux 679

33 Lecture 33: Completeness of Tableaux Method 688

34 Lecture 34: Completeness of the Hilbert System 697

34.1 Model-theoretic and Proof-theoretic Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700

35 Lecture 35: First-Order Theories 711

36 Lecture 36: Towards Logic Programming 729

37 Lecture 37: Verification of Imperative Programs 745

38 Lecture 38: Verification of WHILE Programs 757

39 Lecture 39: Concluding Remarks 775

40 References 787




List of Slides

• Lecture 1: Introduction1. What is Logic?2. Reasoning, Truth and Validity3. Examples4. Objectivity in Logic5. Formal Logic6. Formal Logic: Applications7. Form and Content8. Facets of Mathematical Logic9. Logic and Computer Science

• Lecture 2: Propositional Logic Syntax1. Truth and Falsehood: 12. Truth and Falsehood: 23. Extending the Boolean Algebra4. Table of Truth & Falsehood5. Sums & Products6. Propositional Logic: Syntax7. Propositional Logic: Syntax - 28. Natural Language equivalents9. Some Remarks




10. Associativity and Precedence11. Syntactic Identity12. Abstract Syntax Trees13. Subformulae14. Atoms in a Formula15. Degree of a Formula16. Size of a Formula17. Height of a Formula

• Lecture 3: Semantics of Propositional Logic1. Semantics of Propositional Logic: 12. Semantics of Propositional Logic: 23. Models and Satisfiability4. Example: Abstract Syntax trees5. Tautology, Contradiction, Contingent

• Lecture 4: Logical and Algebraic Concepts1. Logical Consequence: 12. Logical Consequence: 23. Other Theorems4. Logical Implication5. Implication & Equivalence6. Logical Equivalence as a Congruence




• Lecture 5: Identities and Normal Forms1. Adequacy2. Adequacy: Examples3. Functional Completeness4. Duality5. Principle of Duality6. Negation Normal Forms: 17. Negation Normal Forms: 28. Conjunctive Normal Forms9. CNF

• Lecture 6: Tautology Checking1. Arguments2. Arguments: 23. Validity & Falsification4. Translation into propositional Logic5. Atoms in Argument6. The Representation7. Propositional Rendering8. The Strategy9. Checking Tautology

10. Computing the CNF11. Falsifying CNF




• Lecture 7: Propositional Unsatisfiability1. Tautology Checking2. CNFs: Set of Sets of Literals3. Propositional Resolution4. Clean-up5. The Resolution Method6. The Algorithm7. Resolution Examples: Biconditional8. Resolution Examples: Exclusive-Or9. Resolution Refutation: 1

10. Resolution Refutation: 211. Resolvent as Logical Consequence12. Logical Consequence by Refutation

• Lecture 8: Analytic Tableaux1. Against Resolution2. The Analytic Tableau Method3. Basic Tableaux Facts4. Tableaux Rules5. Structure of the Rules6. Tableaux7. Slim Tableaux

• Lecture 9: Consistency & Completeness




1. Tableaux Rules: Restructuring2. Tableaux Rules: 23. Tableau Proofs4. Consistency5. Unsatisfiability6. Hintikka Sets7. Hintikka’s Lemma8. Tableaux and Hintikka sets9. Completeness

• Lecture 10: The Compactness Theorem1. Satisfiability of Infinite Sets2. The Compactness Theorem3. Inconsistency4. Consequences of Compactness

• Lecture 11: Maximally Consistent Sets1. Consistent Sets2. Properties of Finite Character: 13. Properties of Finite Character: 24. Properties of Finite Character: 35. Maximally Consistent Sets6. Lindenbaum’s Theorem




• Lecture 12: Formal Theories1. Introduction to Reasoning2. Proof Systems: 13. Requirements of Proof Systems4. Proof Systems: Desiderata5. Formal Theories6. Formal Language7. Axioms and Inference Rules8. Axiomatic Theories9. Syntax and Decidability

10. A Hilbert-style Proof System11. Rule Patterns

• Lecture 13: Proof Theory: Hilbert-style1. More About Formal Theories2. An Example Proof3. Formal Proofs4. Provability and Formal Proofs5. The Deduction Theorem6. About Formal Proofs

• Lecture 14: Derived Rules1. Simplifying Proofs




2. Derived Rules3. The Sequent Form4. Proof trees in sequent form5. Transitivity of Conditional6. Derived Double Negation Rules7. Derived Operators8. Rules for Derived Operators

• Lecture 15: The Hilbert System: Soundness1. Formal Theory: Issues2. Formal Theory: Incompleteness3. Soundness of Formal Theories4. Soundness of the Hilbert System5. Soundness of the Hilbert System

• Lecture 16: The Hilbert System: Completeness1. Towards Completeness2. Towards Truth-tables3. The Truth-table Lemma4. The Completeness Theorem

• Lecture 17: Introduction to Predicate Logic1. Predicate Logic: Introduction-12. Predicate Logic: Introduction-2




3. Predicate Logic: Introduction-34. Predicate Logic: Symbols5. Predicate Logic: Signatures6. Predicate Logic: Syntax of Terms7. Predicate Logic: Syntax of Formulae8. Precedence Conventions9. Predicates: Abstract Syntax Trees

10. Subterms11. Variables in a Term12. Bound Variables And Scope13. Bound Variables And Scope: Example14. Scope Trees15. Free Variables16. Bound Variables17. Closure

• Lecture 18: The Semantics of Predicate Logic1. Structures2. Notes on Structures3. Infix Convention4. Expansions and Reducts5. Valuations and Interpretations6. Evaluating Terms




7. Coincidence Lemma for Terms8. Variants9. Variant Notation

10. Semantics of Formulae11. Notes on the Semantics

• Lecture 19: Substitutions1. Coincidence Lemma for Formulae2. Substitutions3. Instantiation of Terms4. The Substitution Lemma for Terms5. Admissibility6. Instantiations of Formulae7. The Substitution Lemma for Formulae

• Lecture 20: Models, Satisfiability and Validity1. Satisfiability2. Models and Consistency3. Examples of Models:14. Examples of Models:25. Examples of Models:36. Logical Consequence7. Validity8. Validity of Sets of Formulae




9. Negations of Semantical Concepts

• Lecture 21: Structures and Substructures1. Satisfiability and Expansions2. Distinguishability3. Evaluations under Different Structures4. Isomorphic Structures5. The Isomorphism Lemma6. Substructures7. Substructure Examples8. Quantifier-free Formulae9. Lemma on Quantifier-free Formulae

10. Universal and Existential Formulae11. The Substructure Lemma

• Lecture 22: Predicate Logic: Proof Theory1. Proof Theory: First-Order Logic2. Proof Rules: Hilbert-Style3. The Mortality of Socrates4. The Mortality of the Greeks5. Faulty Proof:26. A Correct Proof7. The Sequent Forms8. The Case of Equality




9. Semantics of Equality10. Axioms for Equality11. Symmetry and Transitivity12. Symmetry of Equality13. Transitivity of Equality

• Lecture 23: Predicate Logic: Proof Theory (Contd.)1. Alpha Conversion2. The Deduction Theorem for Predicate Calculus3. Useful Corollaries4. Soundness of Predicate Calculus5. Soundness of The Hilbert System

• Lecture 24: Existential Quantification1. Existential Generalisation2. Existential Elimination3. Remarks on Existential Elimination4. Restrictions on Existential Elimination5. Equivalence of Proofs

• Lecture 25: Normal Forms1. Natural Deduction: 62. Moving Quantifiers3. Quantifier Movement




4. More on Quantifier Movement5. Prenex Normal Forms6. The Prenex Normal Form Theorem7. Prenex Conjunctive Normal Form8. The Herbrand Algebra9. Terms in a Herbrand Algebra

10. Herbrand Interpretations11. Herbrand Models12. Ground Quantifier-free Formulae

• Lecture 26: Skolemization1. Skolemization2. Skolem Normal Forms3. SCNF4. Ground Instance5. Herbrand’s Theorem6. The Herbrand Tree of Interpretations7. Compactness of Sets of Ground Formulae8. Compactness of Closed Formulae9. The Lowenheim-Skolem Theorem

• Lecture 27: Substitutions and Instantiations1. Substitutions Revisited2. Some Simple Facts




3. Ground Substitutions4. Composition of Substitutions5. Substitutions: A Monoid

• Lecture 28: Unification1. Unifiability2. Unification Examples:13. Unification Examples:24. Generality of Unifiers5. Generality: Facts6. Most General Unifiers7. More on Positions8. Disagreement Set9. Example: Disagreement 1

10. Example: Occurs Check11. Example: Disagreement 312. Example: Disagreement 413. Disagreement and Unifiability14. The Unification Theorem

• Lecture 29: Resolution in FOL1. Recapitulation2. SCNFs and Models3. SCNFs and Unsatisfiability




4. Representing SCNFs5. Clauses: Terminology6. Clauses: Ground Instances7. Facts about Clauses8. Clauses: Models9. Clauses: Herbrand’s Theorem

10. Resolution in FOL

• Lecture 30: More on Resolution in FOL1. Standardizing Variables Apart2. Factoring3. Example: 14. Example: 25. Refutation

• Lecture 31: Resolution: Soundness and Completeness1. Soundness of FOL Resolution2. Ground Clauses3. The Lifting Lemma4. Lifting Lemma: Figure5. Completeness of Resolution Refutation: 16. Completeness of Resolution Refutation: 27. Completeness of Resolution Refutation: 3




• Lecture 32: Resolution and Tableaux1. FOL: Tableaux2. FOL: Tableaux Rules3. FOL Tableaux: Example 14. First-Order Tableaux5. FOL Tableaux: Example 2

• Lecture 33: Completeness of Tableaux Method1. First-order Hintikka Sets2. Hintikka’s Lemma for FOL3. First-order tableaux and Hintikka sets4. Soundness of First-order Tableaux5. Completeness of First-order Tableaux

• Lecture 34: Completeness of the Hilbert System1. Deductive Consistency2. Models of Deductively Consistent Sets3. Deductive Completeness4. The Completeness Theorem

• Lecture 35: First-Order Theories1. (Simple) Directed Graphs2. (Simple) Undirected Graphs3. Irreflexive Partial Orderings




4. Irreflexive Linear Orderings5. (Reflexive) Preorders6. (Reflexive) Partial Orderings7. (Reflexive) Linear Orderings8. Equivalence Relations9. Peano’s Postulates

10. The Theory of The Naturals11. Notes and Explanations12. Finite Models of Arithmetic13. A Non-standard Model of Arithmetic14. Z-Chains

• Lecture 36: Towards Logic Programming1. Reversing the Arrow2. Horn Clauses3. Goal clauses4. Logic Programs5. Sorting in Logic6. Prolog: Selection Sort7. Prolog: Merge Sort8. Prolog: Quick Sort9. Prolog: SEND+MORE=MONEY

10. Prolog: Naturals




• Lecture 37: Verification of Imperative Programs1. The WHILE Programming Language2. Programs As State Transformers3. The Semantics of WHILE4. Programs As Predicate Transformers5. Correctness Assertions6. Total Correctness of Programs7. Examples: Factorial 18. Examples: Factorial 2

• Lecture 38: Verification of WHILE Programs1. Proof Rule: Epsilon2. Proof Rule: Assignment3. Proof Rule: Composition4. Proof Rule: The Conditional5. Proof Rule: The While Loop6. The Consequence Rule7. Proof Rules for Partial Correctness8. Example: Factorial 19. Towards Total Correctness

10. Termination and Total Correctness11. Example: Factorial 212. Notes on Example: Factorial




13. Example: Factorial 2 Made Complete14. An Open Problem: Collatz

• Lecture 39: Concluding Remarks1. Summary2. The Limitations of Predicate Logic3. Sortedness4. Many-Sorted Logic: Symbols5. Many-Sorted Signatures6. Many-Sorted Signature: Terms7. Many-Sorted Predicate Logic8. Reductions




0. Background Preliminaries

al-go-rism n. [ME algorsme<OFr.< Med.Lat. algorismus, after Muhammad ibn-Musa Al-Kharzimi (780-850?).] The Arabic system of numeration: DECIMAL SYSTEM.

al-go-rithm n [Var. of ALGORISM.] Math. A mathematical rule or procedure for solving aproblem.4word history: Algorithm originated as a varaiant spelling of algorism. The spelling wasprobably influenced by the word aruthmetic or its Greek source arithm, ”number”. With thedevelopment of sophisticated mechanical computing devices in the 20th century, however,algorithm was adopted as a convenient word for a recursive mathematical procedure, thecomputer’s stock in trade. Algorithm has ceased to be used as a variant form of the olderword.

Webster’s II New Riverside University Dictionary 1984.

0.1. Motivation for the Study of Logic

In the early years of this century symbolic or formal logic became quite popular with philosophersand mathematicicans because they were interested in the concept of what constitutes a correct proof




in mathematics. Over the centuries mathematicians had pronounced various mathematical proofsas correct which were later disproved by other mathematicians. The whole concept of logic thenhinged upon what is a correct argument as opposed to a wrong (or faulty) one. This has been amplyillustrated by the number of so-called proofs that have come up for Euclid’s parallel postulate and forFermat’s last theorem. There have invariably been “bugs” (a term popularised by computer scientistsfor the faults in a program) which were often very hard to detect and it was necessary thereforeto find infallible methods of proof. For centuries (dating back at least to Plato and Aristotle) norigorous formulation was attempted to capture the notion of a correct argument which would guidethe development of all mathematics.

The early logicians of the nineteenth and twentieth centuries hoped to establish formal logic as afoundation for mathematics, though that never really happened. But mathematics does rest on onefirm foundation, namely set theory. But Set theory itself has been expressed in first order logic.What really needed to be answered were questions relating to the automation or mechanizabilityof proofs. These questions are very relevant and important for the development of present-daycomputer science and form the basis of many developments in automatic theorem proving. DavidHilbert asked the important question, as to whether all mathematics, if reduced to statements ofsymbolic logic, can be derived by a machine. Can the act of constructing a proof be reduced to themanipulation of statements in symbolic logic? Logic enabled mathematicians to point out why an




alleged proof is wrong, or where in the proof, the reasoning has been faulty. A large part of the creditfor this achievement must go to the fact that by symbolising arguments rather than writing them outin some natural language (which is fraught with ambiguity), checking the correctness of a proofbecomes a much more viable task. Of course, trying to symbolise the whole of mathematics couldbe disastrous as then it would become quite impossible to even read and understand mathematics,since what is presented usually as a one page proof could run into several pages. But at least inprinciple it can be done.

Since the latter half of the twentieth century logic has been used in computer science for variouspurposes ranging from program specification and verification to theorem-proving. Initially its usewas restricted to merely specifying programs and reasoning about their implementations. This isexemplified in the some fairly elegant research on the development of correct programs using first-order logic in such calculi such as the weakest-precondition calculus of Dijkstra. A method calledHoare Logic which combines first-order logic sentences and program phrases into a specificationand reasoning mechanism is also quite useful in the development of small programs. Logic in thisform has also been used to specify the meanings of some programming languages, notably Pascal.

The close link between logic as a formal system and computer-based theorem proving is provingto be very useful especially where there are a large number of cases (following certain patterns)




to be analysed and where quite often there are routine proof techniques available which are moreeasily and accurately performed by therorem-provers than by humans. The case of the four-colourtheorem which until fairly recently remained a unproved conjecture is an instance of how humaningenuity and creativity may be used to divide up proof into a few thousand cases and where ma-chines may be used to perform routine checks on the individual cases. Another use of computersin theorem-proving or model-checking is the verification of the design of large circuits before achip is fabricated. Analysing circuits with a billion transistors in them is at best error-prone and atworst a drudgery that few humans would like to do. Such analysis and results are best performed bymachines using theorem proving techniques or model-checking techniques.

A powerful programming paradigm called declarative programming has evolved since the late sev-enties and has found several applications in computer science and artificial intelligence. Most pro-grammers using this logical paradigm use a language called Prolog which is an implemented formof logic1. More recently computer scientists are working on a form of logic called constraint logicprogramming.

In the rest of this chapter we will discuss sets, relations, functions. Though most of these topics arecovered in the high school curriculum this section also establishes the notational conventions that

1actually a subset of logic called Horn-clause logic




will be used throughout. Even a confident reader may wish to browse this section to get familiarwith the notation.

0.2. Sets

A set is a collection of distinct objects. The class of CS253 is a set. So is the group of all first yearstudents at IITD. We will use the notation a, b, c to denote the collection of the objects a, b andc. The elements in a set are not ordered in any fashion. Thus the set a, b, c is the same as the setb, a, c. Further, repetitions of elements in a set do not change it in any way. Two sets are equal ifthey contain exactly the same elements. Hence the sets a, b, c, a, b, c, a, b, a, c, c, b, a, c areall equal.

We can describe a set either by enumerating all the elements of the set or by stating the proper-ties that uniquely characterize the elements of the set. Thus, the set of all even positive integersnot larger than 10 can be described either as S = 2, 4, 6, 8, 10 or, equivalently, as S = x |x is an even positive integer not larger than 10

A set can have another set as one of its elements. For example, the set A = a, b, c, d containstwo elements a, b, c and d; and the first element is itself a set. We will use the notation x ∈ S to




denote that x is an element of (or belongs to) the set S.

A set A is a subset of another set B, denoted as A ⊆ B, if x ∈ B whenever x ∈ A.

An empty set is one which contains no elements and we will denote it with the symbol ∅. Forexample, let S be the set of all students who fail this course. S might turn out to be empty (hopefully;if everybody studies hard). By definition, the empty set ∅ is a subset of all sets. We will also assumean Universe of discourse U, and every set that we will consider is a subset of U. Thus we have

1. ∅ ⊆ A for any set A

2. A ⊆ U for any set A

The union of two sets A and B, denoted A ∪ B, is the set whose elements are exactly the elementsof either A or B (or both). The intersection of two sets A and B, denoted A ∩ B, is the set whoseelements are exactly the elements that belong to both A and B. The difference of B from A, denotedA− B, is the set of all elements of A that do not belong to B. The complement of A, denoted ∼ A

is the difference of A from the universe U. Thus, we have

1. A ∪B = x | (x ∈ A) or (x ∈ B)2. A ∩B = x | (x ∈ A) and (x ∈ B)




3. A−B = x | (x ∈ A) and (x 6∈ B)4. ∼ A = U− A

We also have the following named identities that hold for all sets A, B and C.

Basic properties of set union.

1. (A ∪B) ∪ C = A ∪ (B ∪ C) Associativity

2. A ∪ φ = A Identity

3. A ∪ U = U Zero

4. A ∪B = B ∪ A Commutativity

5. A ∪ A = A Idempotence

Basic properties of set intersection

1. (A ∩B) ∩ C = A ∩ (B ∩ C) Associativity

2. A ∩ U = A Identity

3. A ∩ φ = φ Zero




4. A ∩B = B ∩ A Commutativity

5. A ∩ A = A Idempotence

Other properties

1. A ∩ (B ∪ C) = (A ∩B) ∪ (A ∩ C) Distributivity of ∩ over ∪

2. A ∪ (B ∩ C) = (A ∪B) ∩ (A ∪ C) Distributivity of ∪ over ∩

3. ∼ (A ∪B) =∼ A∩ ∼ B De Morgan’s law ∼ ∪

4. ∼ (A ∩B) =∼ A∪ ∼ B De Morgan’s law ∼ ∩

5. A ∩ (∼ A ∪B) = A ∩B Absorption ∪

6. A ∪ (∼ A ∩B) = A ∪B Absorption ∩

The reader is encouraged to come up with properties of set difference and the complementationoperations.

We will use the following notation to denote some standard sets:

The empty set: ∅




The Universe: U

The set of Natural Numbers: N = 0, 1, 2, . . .. We will include 0 in the set of Natural numbers.After all, it is quite natural to score a 0 in an examination!

The set of positive integers: P = 1, 2, 3, . . .

The two-element set: 2 = 0, 1. More generally for any natural number nwe let n = 0, 1, . . . , n−1 the set of all naturals less than n. By convention ∅ is the set of all naturals less than 0.

The set of integers: Z = . . . ,−2,−1, 0, 1, 2, . . .

The set of rational numbers: Q

The set of real numbers: R

The Boolean set: B = false, true

The Powerset of a set A: 2A is the set of all subsets of the set A.




0.3. Relations and Functions

The Cartesian product of two sets A and B, denoted by A×B, is the set of all ordered pairs (a, b)

such that a ∈ A and b ∈ B. Thus,

A×B = (a, b) | a ∈ A and b ∈ B

Given another set C we may form the following different kinds of cartesian products (which are notat all the same!).

(A×B)× C = ((a, b), c) | a ∈ A, b ∈ B and c ∈ C

A× (B × C) = (a, (b, c)) | a ∈ A, b ∈ B and c ∈ C

A×B × C = (a, b, c) | a ∈ A, b ∈ B and c ∈ C

The last cartesian product gives the construction of tuples. Elements of the set A1 × A2 × · · · × An

for given sets A1, A2, . . . , An are called ordered n-tuples.




An is the set of all ordered n-tuples (a1, a2, . . . , an) such that ai ∈ A for all i. i.e.,

An = A× A× · · · × A︸︷︷︸n times

A binary relation R from A to B is a subset of A × B. It is a characterization of the intuitivenotion that some of the elements of A are related to some of the elements of B. We also use the infixnotation aRb to mean (a, b) ∈ R. When A and B are the same set, we say R is a binary relationon A. Familiar binary relations from N to N are =, 6=, <, ≤, >, ≥. Thus the elements of theset (0, 0), (0, 1), (0, 2), . . . , (1, 1), (1, 2), . . . are all members of the relation ≤ which is a subset ofN× N.

In general, an n-ary relation among the sets A1, A2, . . . , An is a subset of the set A1×A2×· · ·×An.

Definition 0.1 LetR ⊆ A×B be a binary relation from A to B. Then

1. For any set A′ ⊆ A the image of A′ underR is the set defined by

R(A′) = b ∈ B | aRb for some a ∈ A′

2. For every subset B′ ⊆ B the pre-image of B′ underR is the set defined by

R−1(B′) = a ∈ A | aRb for some b ∈ B′




3. R is onto (or surjective) with respect to A and B ifR(A) = B.

4. R is total with respect to A and B ifR−1(B) = A.

5. R is one-to-one (or injective) with respect to A and B if for every b ∈ B there is at most onea ∈ A such that (a, b) ∈ R.

6. R is a partial function from A to B, usually denoted R : A B, if for every a ∈ A there isat most one b ∈ B such that (a, b) ∈ R. R is a total function from A to B, usually denotedR : A −→ B ifR is a partial function from A to B and is total. Notice that every total functionis also a partial function. A is called the domain and B the co-domain. The range of thefunction isR(A).

7. R is a one-to-one correspondence (or bijection) if it is an injective and surjective total function.

Notes.

1. Given a (partial or total) function f : A B, the binary relation it corresponds to is called thegraph of the function and graph(f ) = (a, b) ∈ A×B | f (a) = b.

2. A binary relationR ⊆ A×B may also be thought of as a total functionR : 2A −→ 2B. Likewise

R−1 the converse of the relation R, may be thought of as a total function R−1 : 2B −→ 2A (c.f.




parts 1 and 2 of definition 0.1 where relation symbol has been “overloaded”).

3. Similarly every partial function f : A B may be “overloaded” to mean the total functionf : 2A −→ 2

B, which yields the image of A′ for each A′ ⊆ A. Likewise even though theconverse (see part 2 of definition 0.6) of graph(f ) may not be a function, the total inversefunction f−1 : 2B −→ 2

A is well defined and for each B′ ⊆ B, yields the pre-image of B′.

Notation. Let f be a total function from set A to set B. Then

• f : A1-1−→ B will denote that f is injective,

• f : A−−→onto

B will denote that f is surjective, and

• f : A1-1−−→

ontoB will denote that f is bijective,

Example 0.2 The following are some examples of familiar binary relations along with their proper-ties.

1. The ≤ relation on N is a relation from N to N which is total and onto. That is, both the imageand pre-image of ≤ under N are N itself. What are image and the pre-image respectively of therelation <?




2. The binary relation which associates key sequences from a computer keyboard with their respec-tive 8-bit ASCII codes is an example of a relation which is total and injective.

3. The binary relation which associates 7-bit ASCII codes with their corresponding ASCII charac-ters is a bijection.

The figures 1, 2, 3, 4 and 5 respectively illustrate the concepts of partial, injective, surjective, bijec-tive and inverse of a bijective function on finite sets. The directed arrows go from elements in thedomain to their images in the codomain.

We may equivalently define partial and total functions as follows.

Definition 0.3 A function (or a total function) f from A to B is a binary relation f ⊆ A×B suchthat for every element a ∈ A there is a unique element b ∈ B so that (a, b) ∈ f (usually denotedf (a) = b and sometimes f : a 7→ b). We will use the notationR : A→ B to denote a functionR fromA to B. The set A is called the domain of the function R and the set B is called the co-domain ofthe function R. The range of a function R : A→ B is the set b ∈ B | for some a ∈ A, R(a) = b.A partial function f from A to B, denoted f : A B is a total function from some subset of A tothe set B. Clearly every total function is also a partial function.




b

a

cd

ef

g

x

y

z

u

v

A B

p

Figure 1: A partial function (Why is it partial?)

The word “function” unless otherwise specified is taken to mean a “total function”. Some familiarexamples of partial and total functions are

1. + and × (addition and multiplication) on the natural numbers are total functions of the typef : N× N→ N

2. − (subtraction) on the natural numbers is a partial function of the type f : N× N N.

3. div and mod are total functions of the type f : N× P→ N. If a = q ∗ b+ r such that 0 ≤ r < b




b

a

c

e

g

x

y

u

v

A B

f

d

wz

t

Figure 2: An injective function (Why is it injective?)

and a, b, q, r ∈ N then the functions div and mod are defined as div(a, b) = q and mod(a, b) = r.We will often write these binary functions as a ∗ b, a div b, a mod b etc. Note that div and modare also partial functions of the type f : N× N N.

4. The binary relations =, 6=, <, ≤, >, ≥ may also be thought of as functions of the typef : N× N→ B where B = false, true.

Definition 0.4 Given a set A, a finite sequence of length n ≥ 0 of elements from A, denoted ~a, is




b

a

cd

e

g

x

y

z

u

v

A B

g

Figure 3: A surjective function (Why is it surjective?)

a (total) function of the type ~a : 1, 2, . . . , n → A. We normally denote such a sequence of lengthn by [a1, a2, . . . , an]. Alternatively, ~a may be regarded as a total function from 0, . . . , n − 1 to Aand may be denoted by [a0, a2, . . . , an−1]. The empty sequence, denoted [], is also such a function[] : ∅ → A and denotes a sequence of length 0.

It is very common in computer science to distinguish between the notion of a sequence and that of astring or a word.




b

a

c

e

g

x

y

u

v

A B

d

wz

h

Figure 4: An bijective function (Why is it bijective?)

Definition 0.5 An alphabet is a finite set of symbols also called letters. Any finite sequence ofletters from an alphabet is called a string or a word. A string of length n ∈ N is usually writtena1a2 . . . an or “a1a2 . . . an”, where each ai ∈ A, 1 ≤ i ≤ n. The unique empty string (of length 0) isusually denoted ε and the operation of juxtaposing two strings s and t to form a new string is called(con)catenation.

It is quite clear that there exists a simple bijection from the set An (which is the set of all n-tuples ofelements from the set A) and the set of all sequences of length n of elements from A. We will often




b

a

c

e

g

x

y

u

v

A B

d

wz

h−1

Figure 5: The inverse of the bijective function in Fig 4(Is it bijective?)

identify the two as being the same set even though they are actually different by definition2. The setof all finite sequences of elements from A is denoted A∗, where

A∗ =⋃n≥0

An

2In a programming language like ML, the difference is evident from the notation and the constructor operations for tuples and lists




The set of all non-empty sequences of elements from A is denoted A+ and is defined as

A+ =⋃n>0

An

An infinite sequence of elements from A is a total function from P to A. The set of all such infinitesequences is denoted Aω.

0.4. Operations on Binary Relations

Definition 0.6

1. Given a set A, the identity relation over A, denoted IA, is the set (a, a) | a ∈ A.2. Given a binary relation R from A to B, the converse of R, denoted R−1 is the relation from B

to A defined asR−1 = (b, a) | (a, b) ∈ R.3. Given binary relations R ⊆ A × B and S ⊆ B × C, the composition of R with S is denotedR;S and defined asR;S = (a, c) | aRb and bSc, for some b ∈ B.

Note that unlike in the case of functions (where for any function f : A −→ B its inverse f−1 :

B −→ A may not always be defined), the converse of a relation is always defined. Given functions




(whether partial or total) f : A B and g : B C, their composition is the function gf : A C

defined simply as the relational composition graph(f ); graph(g). Hence (g f )(a) = g(f (a)).

The following is an important theorem with various applications in section 0.8.

Theorem 0.7 (Schroeder-Bernstein Theorem) Let A and B be sets and let f : A1-1−→ B and

g : B1-1−→ A be injective functions. Then there exists a bijection between A and B.

Proof: Since f and g are both injective (1-1), they are both total functions, but their inverses maynot be total. By injectivity, for any a ∈ A, f (a) = b implies that b cannot be the image under f ofany other member of A. Likewise for any b ∈ B, g(b) ∈ A and for every other b′ ∈ B we haveg(b′) 6= g(b). Hence f−1 : B A and g−1 : A B are both partial functions.

For any a0 ∈ Awe define the origin of a0 as a0 itself if g−1(a0) is undefined i.e. if a0 is not the imageof any b ∈ B under g. (Likewise for any b0 ∈ B, the origin of b0 is b0 itself if f−1(b0) is undefined).Otherwise g−1(a0) = b1 for a unique b1 ∈ B. Now consider the maximal (possibly infinite) sequence




of elementsa0 ∈ A,

g−1(a0) = b1 ∈ B,

f−1(b1) = a2 ∈ A,

g−1(a2) = b3 ∈ B,... ... ... ,

f−1(b2k−1) = a2k ∈ A,

g−1(a2k) = b2k+1 ∈ B,... ... ...

such that for each k > 0, a2k = f−1(b2k−1) and b2k+1 = g−1(b2k). We then have the following casesfor each a0 ∈ A.

• Case AA. a2m is the origin of a0 for some m ≥ 0. That is, the sequence a0, b1, a2, b3, . . . , a2m isfinite and g−1(a2m) is undefined. In this case a2m is the origin of a0 and a0 ∈ AA.

• Case AB. b2m+1 is the origin of a0 for some m ≥ 0. That is, the sequence a0, b1, a2, b3, . . . , a2m, b2m+1

is finite and f−1(b2m+1) is undefined. Then b2m+1 is the origin of a0 and a0 ∈ AB.

• Case AU . The origin of a0 is undefined. That is, the sequence a0, b1, a2, b3, . . . , a2m, b2m+1, . . . isinfinite. Then a0 ∈ AU .




Hence A may be partitioned into three (mutually disjoint) sets AA, AB, AU depending upon theorigins of the elements of A. (Analogously, B may be partitioned into BA, BB and BU ).

Now we may define the total function h : A −→ B such that

h(a) =

f (a) if a ∈ AA ∪ AU

g−1(a) if a ∈ AB‘

Claim. h : A1-1−−→

ontoB i.e. h is a bijection from A to B.

` The proof of the claim is easy and is left to the interested reader. In fact, we may showthat the following hold

h(AU) = BU , h−1(BU) = AU

h(AA) = BA , h−1(BA) = AB

h(AB) = BB , h−1(BB) = AB

a

QED




0.5. Ordering Relations

We may define the n-fold composition of a relationR on a set A by induction as follows

R0 = IARn+1 = Rn;R

We may combine these n-fold compositions to yield the reflexive-transitive closure of R, denotedR, as the relation

R∗ =⋃n≥0

Rn

Sometimes it is also useful to consider merely the transitive closureR+ ofR which is defined as

R+ =⋃n>0

Rn

Definition 0.8 A binary relationR on a set A is

1. reflexive if and only if IA ⊆ R;

2. irreflexive if and only if IA ∩R = ∅;




3. symmetric if and only ifR = R−1;

4. asymmetric if and only ifR∩R−1 = ∅;

5. antisymmetric if and only if for all a and b, a 6= b, (a, b) ∈ R implies (b, a) 6∈ R.3

6. transitive if and only if for all a, b, c ∈ A, (a, b), (b, c) ∈ R implies (a, c) ∈ R.

7. connected if and only if for all a, b ∈ A, if a 6= b then aRb or bRa.

Given any relationR on a set A, it is easy to see thatR∗ is both reflexive and transitive.

Example 0.9

1. The edge relation on an undirected graph is an example of a symmetric relation.

2. In any directed acyclic graph the edge relation is asymmetric.

3. Consider the reachability relation on a directed graph defined as: A pair of vertices (A,B) is inthe reachability relation, if either A = B or there exists a vertex C such that both (A,C) and(C,B) are in the reachability relation. The reachability relation is the reflexive transitive closureof the edge relation.

3An equivalent definition used in most books is: R is antisymmetric if and only if (a, b), (b, a) ∈ R implies a = b.




4. The reachability relation on directed graphs is also an example of a relation that need not beeither symmetric or asymmetric. The relation need not be antisymmetric either.

0.6. Partial Orders

Definition 0.10 A binary relationR on a set A is

1. a preorder if it is reflexive and transitive;

2. a strict preorder if it is irreflexive and transitive;

3. a partial order if is an antisymmetric preorder;

4. a strict partial order if it is irreflexive, asymmetric and transitive;

5. a linear order4 if it is a connected partial order;

6. a strict linear order if it is connected, irreflexive and transitive;

7. an equivalence if it is reflexive, symmetric and transitive.4also called total order




Definition 0.11 A partially ordered set, or poset 〈A,≤〉 consists of a set A together with a partialorder relation ≤ on A.

Fact 0.12 If 〈A,≤〉 is a poset, then so is 〈A,≥〉 where ≥=≤−1.

Notation. Given a poset 〈A,≤〉 and a, b ∈ A, we sometimes write

• b ≥ a to mean a ≤ b,

• a < b to mean a ≤ b and a 6= b and

• b > a to mean a < b.

Fact 0.13 If 〈A,≤〉 is a poset, then < and > are strict partial orders.

For any set A (empty or non-empty) we have that 〈2A,⊆〉 is also a poset and in fact, the partialordering relation ≤ on A can be characterised (upto isomorphism) by the subset relation.

Definition 0.14 Given two posets 〈A,≤A〉 and 〈B,≤B〉, a function f : A −→ B is said to be order-preserving if and only if for all a, a′ ∈ A, a ≤A a′ implies f (a) ≤B f (a′). The two posets are said




to be (order-) isomorphic if there exists an order-preserving bijection between them. We denote thisfact by 〈A,≤A〉 ∼= 〈B,≤B〉.

Notice that if f : A −→ B is a bijection then so is f−1 : B −→ A.

Lemma 0.15 For each poset 〈A,≤〉 there exists a set A ⊆ 2A such that 〈A,≤〉 ∼= 〈A ,⊆〉.

Proof: For each x ∈ A let Ax = a ∈ A | a ≤ x. Define the set A = Ax | x ∈ A ⊆ 2A and

the function f : A −→ A such that f (x) = Ax for each x ∈ A. It is easy to see that f is bijectiveand order-preserving i.e. for all x, y ∈ A, x ≤ y if and only if Ax ⊆ Ay. QED

0.7. Well orders

We discuss well-orders since an mportat induction principle depends upon the notion of a well-ordering and generalises the principle of mathematical induction.

Definition 0.16 Let 〈A,≤〉 be a poset and B ⊆ A. An element b ∈ B is said to be minimal if thereexists no a ∈ B such that a < b. A poset 〈A,≤〉 is called well-founded if every nonempty subset ofA has a minimal element. Equivalently we say that ≤ on A is a well-founded.




Lemma 0.17 A poset 〈A,≤〉 is well-founded if and only if there is no subset ai ∈ A | i ≥ 0 suchthat ai > ai+1 for all i ≥ 0.

Proof:

(⇒). Assume 〈A,≤〉 is well-founded and there is a subset A′ = ai ∈ A | i ≥ 0, ai > ai+1 ⊆ A.Clearly A′ contains no minimal element, which is a contradiction.

(⇐). Assume there is no subset A′ = ai ∈ A | i ≥ 0, ai > ai+1 ⊆ A. If 〈A,≤〉 is notwell-founded, there exists a nonempty subset B ⊆ A which has no minimal element. Considerany b0 ∈ B. Since b0 is not minimal there exists b1 ∈ B such that b0 > b1. Again b1 is notminimal, so there must be a b2 ∈ B with b1 > b2. Proceeding in this fashion we find that for eachbi < · · · < b1 < b0, there exists a bi+1 ∈ B such that bi+1 < bi < · · · < b1 < b0. We may thusconstruct a set B′ = bi ∈ B | i ≥ 0, bi > bi+1 ⊆ B ⊆ A which contradicts the assumptionthat there is no such subset.

QED

The set B′ = bi ∈ B | i ≥ 0, bi > bi+1 is an example of an “infinite descending chain”

· · · < bi+1 < bi < · · · < b1 < b0




We usually say that a well-founded set has no “infinite descending chain”.

0.8. Infinite Sets: Countability and Uncountability

Definition 0.18 A set A is finite if it can be placed in bijection with a set n = 0, . . . , n − 1 forsome n ∈ N.

The above definition embodies the usual notion of counting. In particular note that the empty set ∅is finite since it can be placed in bijection with itself.

Definition 0.19 A set A is called infinite if there exists a bijection between A and some propersubset of itself.

This definition begs the question, “If a set is not infinite, then is it necessarily finite?”. It turns outthat indeed it is. Further it is also true that if a set is not finite then it can be placed in bijection witha proper subset of itself. But rigorous proofs of these statements are beyond the scope of this courseand hence we shall not pursue them.

Example 0.20 We give appropriate bijections to show that various sets are infinite. In each case,




note that the codomain of the bijection is a proper subset of the domain.

1. The set N of natural numbers is infinite because we can define the 1-1 correspondence p :

N 1-1−−→onto

P, with p(m)df= m + 1.

2. The set E of even natural numbers is infinite because we have the bijection e : E1-1−−→

ontoF where

F is the set of all multiples of 4.

3. The set of odd natural numbers is infinite. (Why?)

4. The set Z of integers is infinite because we have the following bijection z : Z 1-1−−→onto

N by whichthe negative integers have unique images among the odd numbers and the non-negative integershave unique images among the even numbers. More specifically,

z(m) =

2m if m ∈ N−2m− 1 otherwise

Example 0.21 The set R of reals is infinite. We outline the proof by considering the nonempty openinterval (a, b) = p | a < p < b and use figure 6 as a guide to understand the mapping.

Take any line-segment AB of length b − a 6= 0 and “bend” it into the semi-circle_

A′B′ and placeit tangent to the x-axis at the point (0, 0) (as shown in the figure). The bijection between the points




0

P’

P’’

−1 1p"

(0,r)A’ C B’

x

y

Figure 6: Bijection between the arc A′B′ and the real line

on the semi-circle and the real numbers p, a < p < b is “obvious”. This semicircle has a radius

r =b− aπ

. The centre C of this semi-circle is then located at the point (0, r) on the 2-dimensionalplane.

Consider an arbitrary point P ′ on the semi-circle, which corresponds to a real number p, a < p < b.The ray

−−→CP ′ intersects the x-axis at some point P ′′ which has the coordinates (p′′, 0). Since A′ 6=

P ′ 6= B′, the ray cannot be parallel to the x-axis). Similarly from every point P ′′ on the x-axisthere exists a unique point P ′ on the semi-circle such that C, P ′ and P ′′ are collinear. Each pointP ′ such that A′ 6= P ′ 6= B′ on this semi-circle corresponds exactly to a unique real number p inthe open interval (a, b) and vice-versa. Hence there exists a 1-1 correspondence between the pointson the semicircle (excluding the end-points of the semi-circle) and those on the x-axis. Let p′′ be




the x-coordinate of the point P ′′. Since the composition of bijections is a bijection (see exercises),we may compose all these bijections to obtain a 1-1 correspondence between each p in the interval(a, b) and the real numbers.

Definition 0.22 An infinite set is said to be countable (or countably infinite) if it can be placed inbijection with the set P Otherwise, it is said to be uncountable.

The above definition essentially says that a countably infinite set may be enumerated by selecting aunique “first element”, a unique “second” element and so on. Countability of an infinite set thereforeimplies that for any positive integer n, it should be possible to obtain the unique designated n-thelement fromt he set and also for any element in the set, it should be possible to obtain its positionin the enumeration.

Fact 0.23 The following are easy to prove.

1. An infinite set A is countable if and only if there is a bijection between A and N.

2. Every infinite subset of N is countable.

3. If A is a finite set and B is a countable set, then A ∪B is countable.




4. If A and B are countable sets, then A ∪B is also countable.

Theorem 0.24 N2 is a countable set.

Proof:

We show thatN2 is countably infinite by devising a way to order the elements ofN2 which guaranteesthat there is indeed a 1-1 correspondence. For instance, an obvious ordering such as

(0, 0) (0, 1) (0, 2) (0, 3) . . .

(1, 0) (1, 1) (1, 2) (1, 3) . . .

(2, 0) (2, 1) (2, 2) (2, 3) . . .... . . . . . .

is not a 1-1 correspondence because we cannot answer the following questions with (unique) an-swers.

1. What is the n-th element in the ordering?

2. What is the position in the ordering of the pair (a, b) for arbitrary naturals a and b?




0 1

2

3

4

5

6

7

8

9n

y

x

(a,b)

(a−1, b+1)

(a, b+1)

D

D

D

D

D

D

0

1

2

3

4

i

Figure 7: Counting “lattice-points” on the “diagonals”




So it is necessary to construct a more rigorous and ingenious device to ensure a bijection. So weconsider the ordering implicitly defined in figure 7. By traversing the blue rays

−→D0,−→D1,−→D2, . . . in

order, we get an obvious ordering on the elements of N2. However it should be possible to giveunique answers to the above questions.

Claim f : N2 −→ N defined by f (a, b) =(a + b)(a + b + 1) + 2b

2is the required bijection.

Proof outline: The function f defines essentially the traversal of the rays−→D0,−→D1,−→D2, . . . in order

as we shall prove. It is easy to verify that−→D0 contains only the pair (0, 0) and f (0, 0) = 0. Now

consider any pair (a, b) 6= (0, 0). If (a, b) lies on the ray−→Di, then it is clear that i = a + b. Now

consider all the pairs that lie on the rays−→D0,−→D1, . . . ,

−−→Di−1

5

The number of such pairs is given by the “triangular number”

i + (i− 1) + (i− 2) + · · · + 1 =i(i + 1)

2

Since we started counting from 0 this number is also the value of the lattice point (i, 0) under thefunction f . This brings us to the starting point of the ray Di and after crossing b lattice points along

5Under the usual (x, y) coordinate system, these are all the lattice points on and inside the right triangle defined by the three points (i − 1, 0), (0, 0) and (0, i − 1). A lattice point in the (x, y)-plane is pointwhose x− and y− coordinates are both integers.




the ray Di we arrive at the point (a, b). Hence

f (a, b) =i(i + 1)

2+ b

=(a + b)(a + b + 1) + 2b

2

We leave it as an exercise to the reader to define the inverse of this function. (Hint: Use “triangularnumbers”!) QED

Theorem 0.25 The countable union of countable sets is countable, i.e. given a family A = Ai |Ai is countable, i ∈ N of countable sets, their union A∞ =

⋃i∈NAi is also countable.

Proof: For simplicity we assume that the sets are all pairwise disjoint i.e. Ai ∩ Aj = ∅ for eachi 6= j. Hence for each element a ∈ A∞, there exists a unique i ∈ N such that a ∈ Ai. This impliesthere exists a bijection h : A∞

1-1−−→onto

(i, a) | a ∈ Ai, i ∈ N. Since each Ai is countable, there

exists a bijection fi : Ai1-1−−→

ontoN for each i ∈ N. Define the bijection g : A∞

1-1−−→onto

N2 such thatg(a) = (i, fi(a)). By theorem 0.24 it follows that A∞ is countable. QED

Example 0.26 Let the languageM0 of minimal logic be “generated” by the following process from




a countably infinite set of “atoms” A, such that A does not contain any of the symbols “¬”, “→”,“(” and “)”.

1. A ⊆M0,

2. If µ and ν are any two elements ofM0 then (¬µ) and (µ→ ν) also belong toM0, and

3. No string other than those obtained by a finite number of applications of the above rules belongstoM0.

We prove that theM0 is countably infinite.

Solution There are at least two possible proofs. The first one simply encodes of formulas into uniquenatural numbers. The second uses induction on the structure of formulas and the fact that a count-able union of countable sets yields a countable set. We postpone the second proof to the chapter oninduction. So here goes!

Proof: Since A is countably infinite, there exists a 1 − 1 correspondence ord : A −→ Pwhich uniquely enumerates the atoms in some order. This function may be extended to a func-tion ord′ which includes the symbols “¬”,“(”,“)”,“→”, such that ord′(“¬′′) = 1, ord′(“(”) = 2,ord′(“)′′) = 3, ord′(“ →′′) = 4, and ord′(“A′′) = ord(“A′′) + 4, for every A ∈ A. Let Syms =




A ∪ “¬′′, “(“, “)′′, “ →′′. Clearly ord′ : Syms −→ P is also a 1− 1 correspondence. Hence therealso exist inverse functions ord−1 and ord′−1 which for any positive integer identify a unique symbolfrom the domains of the two functions respectively.

Now consider any string6 belonging to Syms∗. It is possible to assign a unique positive integer tothis string by using powers of primes. Let p1 = 2, p2 = 3, . . . , pi, . . . be the infinite list of primes inincreasing order. Let the function encode : Syms∗ −→ P be defined by induction on the lengths ofthe strings in Syms∗, as follows. Assume s ∈ Syms∗, a ∈ Syms and “′′ denotes the empty string.

encode(“′′) = 1

encode(sa) = encode(s)× pmord′(a)

where s is a string of length m− 1 for m ≥ 1.

It is now obvious from the unique prime-factorization of positive integers that every string in Syms∗

has a unique positive integer as its “encoding” and from any positive integer it is possible to getthe unique string that it represents. Hence Syms∗ is a countably infinite set. Since the languageof minimal logic is a subset of the Syms∗ it cannot be an uncountable. Hence there are only twopossibilities: either it is finite or it is countably infinite.

Claim. The language of minimal logic is not finite.6This includes even arbitrary strings which are not part of the language. For example, you may have strings such as “)¬(”.




Proof of claim. Suppose the language were finite. Then there exists a formula φ in the languagesuch that encode(φ) is the maximum possible positive integer. This φ ∈ Syms∗ and hence is a stringof the form a1 . . . am where each ai ∈ Syms. Clearly

encode(φ) =

m∏i=1

piord′(ai)

. Now consider the longer formula ψ = (¬φ). It is easy to show that

encode(ψ) = 2ord′(“(“) × 3ord

′(“¬′′) ×m∏i=1

pi+2ord′(ai) × pm+3

ord′(“)′′)

and encode(ψ) > encode(φ) contradicting the assumption of the claim.

Hence the language is countably infinite. QED

Not all infinite sets that can be constructed are countable. In other words even among infinite setsthere are some sets that are “more infinite than others”. The following theorem and the form of itsproof was first given by Georg Cantor and has been used to prove several results in logic, mathemat-ics and computer science.




Theorem 0.27 (Cantor’s diagonalization). The powerset of N (i.e. 2N, the set of all subsets of N)is an uncountable set.

Proof: Firstly, it should be clear that 2N is not a finite set, since it can be placed in bijection with2P which is a proper subset of 2N.

Consider any subset A ⊆ N. We may represent this set as an infinite sequence σA composed of 0’sand 1’s such that σA(i) = 1 if i ∈ A, otherwise σA(i) = 0. Let Σ = σ | for eachi ∈ N, σ(i) ∈ 0, 1be the set of all such sequences. It is easy to show that there exists a bijection g : 2N

1-1−−→onto

Σ suchthat g(A) = σA, for each A ⊆ N. Clearly, therefore 2N is countable if and only if Σ is countable.

Hence, if there exists a bijection f : Σ1-1−−→

ontoN, then g f is the required bijection from 2

N to N.On the other hand, if there is no bijection f then 2

N is uncountable if and only if Σ is uncountable.We make the following claim which we prove by Cantor’s diagonalization.

Claim. The set Σ is uncountable.

We prove the claim as follows. Suppose Σ is countable then there exists a bijection h : N1-1−−→

ontoΣ.

In fact let h(i) = σi ∈ Σ, for each i ∈ N. Now consider the sequence ρ constructed in such a manner




that for each i ∈ N, ρ(i) 6= σi(i). In other words,

ρ(i) =

0 if σi(i) = 1

1 if σi(i) = 0

Since ρ is an infinite sequence of 0’s and 1’s, ρ ∈ Σ. But from the above construction it follows thatsince ρ is different from every sequence in Σ it cannot be a member of Σ, leading to a contradiction.Hence the assumption that the bijection h exists is wrong. Hence the assumption that Σ is countablemust be wrong. QED

It is possible to generalize the above theorem and the proof to all powersets as follows.

Theorem 0.28 (The Powerset theorem). There is no 1-1 correspondence between a set and itspowerset.

Proof: Let A be any set and let 2A be its powerset. Assume that g : A −→ 2A is a 1-1 corre-

spondence between A and 2A. This implies for every a ∈ A, g(a) ⊆ A is uniquely determined andfurther for each B ⊆ A, g−1(B) exists and is uniquely determined.




For any a ∈ A, a is called an interior member if a ∈ g(a) and otherwise a is an exterior member.Consider the set

X = x ∈ A | x 6∈ g(x)which consists of exactly the exterior members of A. Since g is a 1-1 correspondence, there exists aunique x ∈ A such that X = g(x). Note that X could be the empty set.

x is either an interior member or an exterior member. If x is an interior member then x ∈ g(x) = X

which contradicts the assumption that X contains only exterior members. If x is an exterior memberthen x 6∈ g(x) = X . But then since x is an exterior member x ∈ X , which is a contradiction. Hencethe assumption that there exists a 1-1 correspondence g between A and 2A must be false. QED

Example 0.29 We show using the Schroeder-Bernstein theorem 0.7 that there exists a bijection be-tween the sets 2P and the real closed-open interval [0, 1). We construct two injective mappingsf : 2P

1-1−→ [0, 1) and g : [0, 1)1-1−→ 2

P as follows: For any A ⊆ P let f (A) = 0.d1d2d3 . . . such thatdi = 1 if i ∈ A and di = 2 otherwise. Clearly for every A there exists a unique image in [0, 1) andno two distinct subsets of P would have identical images. Hence f is injective.

To define g we consider only normal binary representations of real numbers. That is, we consideronly binary representations which do not have an infinite sequence of trailing 1s, since any num-




ber of the form 0.b1b2 . . . bi−101 equals the real number 0.b1b2 . . . bi−110 which is normal. Everyreal number in [1, 0) has a unique normal representation. Now consider the function defined byg(0.b1b2b3 . . .) = i ∈ P | bi = 1. g is clearly a well-defined function and it is injective as well.Hence they are both uncountable sets.

Exercise 0.1

1. Find the fallacy in the proof of the following purported theorem.Theorem: If x = y then 2 = 1.

Proof:1. x = y Given2. x2 = xy Multiply both sides by x3. x2 − y2 = xy − y2 Subtract Y 2 from both sides4. (x + y)(x− y) = y(x− y) Factorize5. x + y = y Cancel out (x− y)

6. 2y = y Substitute x for y, by equation 1.7. 2 = 1 Divide both sides by y

QED

2. Prove that if A ⊆ B then 2A ⊆ 2B.




3. Prove that for any binary relationsR and S on a set A,

(a) (R−1)−1 = R(b) (R∩ S)−1 = R−1 ∩ S−1

(c) (R∪ S)−1 = R−1 ∪ S−1

(d) (R− S)−1 = R−1 − S−1

4. Prove that the composition operation on relations is associative. Give an example of the compo-sition of relations to show that relational composition is not commutative.

5. Prove that the composition of bijections is a bijection. That is, prove that for any bijective totalfunctions f : A

1-1−−→onto

B and g : B1-1−−→

ontoC, their composition is the function g f : A

1-1−−→onto

C.

6. Is the composition of injective functions also injective? Is the composition of surjective functionsalso surjective? Prove or disprove the two statements.

7. Prove that the inverse of a bijective function is also a bijective function.

8. Prove that for any binary relations R, R′ from A to B and S, S ′ from B to C, if R ⊆ R′ andS ⊆ S ′ thenR;S ⊆ R′;S ′




9. Prove or disprove7that relational composition satisfies the following distributive laws for rela-tions, whereR ⊆ A×B and S, T ⊆ B × C.

(a) R; (S ∪ T ) = (R;S) ∪ (R; T )

(b) R; (S ∩ T ) = (R;S) ∩ (R; T )

(c) R; (S − T ) = (R;S)− (R; T )

10. Prove that forR ⊆ A×B and S ⊆ B × C, (R;S)−1 = (S−1); (R−1).

11. Show that a relationR on a set A is

(a) antisymmetric if and only ifR∩R−1 ⊆ IA(b) transitive if and only ifR;R ⊆ R(c) connected if and only if (A× A)− IA ⊆ R ∪R−1

12. Consider any reflexive relation R on a set A. Does it necessarily follow that A is not asymmet-ric? IfR is asymmetric does it necessarily follow that it is irreflexive?

13. Prove that

(a) Nn, for any n > 0 is a countably infinite set,7that is, find an example of appropriate relations which actually violate the equality




(b) If Ai|i ≥ 0 is a countable collection of pair-wise disjoint sets (i.e. Ai ∩ Aj = ∅ for alli 6= j) then A =

⋃i≥0Ai is also a countable set.

(c) N∗ the set of all finite sequences of natural numbers is countable.

14. Prove that

(a) Nω the set of all infinite sequences of natural numbers is uncountable,

(b) the set of all binary relations on a countably infinite set is an uncountable set,

(c) the set of all total functions from N to N is uncountable.

15. Prove that there exists a bijection between the set 2N and the open interval (0, 1) of real numbers.What can you conclude about the cardinality of the set 2N in relation to the set R?

16. Prove that the composition operation on relations is associative. Give an example of the compo-sition of relations to show that relational composition is not commutative in general.

17. Consider any reflexive relation R on a set A. Does it necessarily follow that R is not asymmet-ric? IfR is asymmetric does it necessarily follow that it is irreflexive?

18. Prove that for any relationR on a set A,

(a) S = R∗ ∪ (R∗)−1 and T = (R∪R−1)∗ are both equivalence relations.




(b) Prove or disprove: S = T .

19. Given any preorderR on a set A, prove that the kernel of the preorder defined asR∩R−1 is anequivalence relation.

20. Consider any preorderR on a set A. We give a construction of another relation as follows. Foreach a ∈ A, let [a]R be the set defined as [a]R = b ∈ A | aRb and bRa. Now consider the setB = [a]R | a ∈ A. Let S be a relation on B such that for every a, b ∈ A, [a]RS [b]R if and onlyif aRb. Prove that S is a partial order on the set B.

21. For any two sets A and B, A B if there exists an injective function f : A1-1−→ B.

(a) Prove that is a preorder on any collection of sets.

(b) Prove that any bijection between sets defines an equivalence relation on the collection of sets.




0.9. Induction Principles

Theorem: All natural numbers are equal.Proof: Given a pair of natural numbers a and b, we prove they are equal by performingcomplete induction on the maximum of a and b (denoted max(a, b)).Basis. For all natural numbers less than or equal to 0, the claim holds.Induction hypothesis. For any a and b such that max(a, b) ≤ k, for some natural k ≥ 0,a = b.Induction step. Let a and b be naturals such that max(a, b) = k + 1. It follows thatmax(a − 1, b − 1) = k. By the induction hypothesis a − 1 = b − 1. Adding 1 on both sideswe get a = b QED.

Fortune cookie on Linux

0.10. Mathematical Induction

Anyone who has had a good background in school mathematics must be familiar with two uses ofinduction.

1. definition of functions and relations by mathematical induction, and




2. proofs by the principle of mathematical induction.

Example 0.30 We present below some familiar examples of definitions by mathematical induction.

1. The factorial function on natural numbers is defined as follows.

Basis. 0! = 1

Induction step. (n + 1)! = n!× (n + 1)

2. The n-th power (where n is a natural number) of a real number x is often defined as

Basis. x1 = x

Induction step. xn+1 = xn × x

3. For binary relations R, S on A we define their composition (denoted R;S) as follows.

R;S = (a, c) | for some b ∈ A, (a, b) ∈ R and (b, c) ∈ S

We may extend this binary relational composition to an n-fold composition of a single relationR as follows.

Basis. R1 = R




Induction step. Rn+1 = R;Rn

Similarly the principle of mathematical induction is the means by which we have often proved (asopposed to defining) properties about numbers, or statements involving the natural numbers. Theprinciple may be stated as follows.

Principle of Mathematical Induction – Version 1A property P holds for all natural numbers provided

Basis. P holds for 0, and

Induction step. For arbitrarily chosen n > 0,P holds for n− 1 implies P holds for n.

The underlined portion, called the induction hypothesis, is an assumption that is necessary forthe conclusion to be proved. Intuitively, the principle captures the fact that in order to prove anystatement involving natural numbers, it is suffices to do it in two steps. The first step is the basis and




needs to be proved. The proof of the induction step essentially tells us that the reasoning involved inproving the statement for all other natural numbers is the same. Hence instead of an infinitary proof(one for each natural number) we have a compact finitary proof which exploits the similarity of theproofs for all the naturals except the basis.

Example 0.31 We prove that all natural numbers of the form n3 + 2n are divisible by 3.

Proof:

Basis. For n = 0, we have n3 + 2n = 0 which is divisible by 3.

Induction step. Assume for an arbitrarily chosen n ≥ 0, n3 + 2n is divisible by 3. Now consider(n + 1)3 + 2(n + 1). We have

(n + 1)3 + 2(n + 1) = (n3 + 3n2 + 3n + 1) + (2n + 2)

= (n3 + 2n) + 3(n2 + n + 1)

which clearly is divisible by 3.

QED




Several versions of this principle exist. We state some of the most important ones. In such cases,the underlined portion is the induction hypothesis. For example it is not necessary to consider 0 (oreven 1) as the basis step. Any integer k could be considered the basis, as long as the property is tobe proved for all n ≥ k.

Principle of Mathematical Induction – Version 2A property P holds for all natural numbers n ≥ k for some natural number k, provided

Basis. P holds for k, andInduction step. For arbitrarily chosen n > k,P holds for n− 1 implies P holds for n.

Such a version seems very useful when the property to be proved is either not true or is undefinedfor all naturals less than k. The following example illustrates this.

Example 0.32 Every positive integer n ≥ 8 is expressible as n = 3i + 5j where i, j ≥ 0.Proof: .




Basis. For n = 8, we have n = 3 + 5, i.e. i = j = 1.

Induction step. Assuming for an arbitrary n ≥ 8, n = 3i + 5j for some naturals i and j, considern + 1. If j = 0 then clearly i ≥ 3 and we may write n + 1 as 3(i − 3) + 5(j + 2). Otherwisen + 1 = 3(i + 2) + 5(j − 1).

QED

However it is not necessary to have this new version of the Principle of mathematical induction atall as the following reworking of the previous example shows.

Example 0.33 The property of the previous example could be equivalently reworded as follows.“For every natural number n, n + 8 is expressible as n + 8 = 3i + 5j where i, j ≥ 0”.Proof: .

Basis. For n = 0, we have n + 8 = 8 = 3 + 5, i.e. i = j = 1.

Induction step. Assuming for an arbitrary n ≥ 0, n+8 = 3i+5j for some naturals i and j, considern+ 1. If j = 0 then clearly i ≥ 3 and we may write (n+ 1) + 8 as 3(i− 3) + 5(j + 2). Otherwise(n + 1) + 8 = 3(i + 2) + 5(j − 1).




QED

In general any property P that holds for all naturals greater than or equal to some given k may betransformed equivalently into a property Q, which reads exactly like P except that all occurrencesof “n” in P are systematically replaced by “n + k”. We may then prove the property Q using thefirst version of the principle.

What we have stated above informally is, in fact a proof outline of the following theorem.

Theorem 0.34 The two principles of mathematical induction are equivalent. In other words, everyapplication of PMI - version 1 may be transformed into an application of PMI – version 2 andvice-versa.

In the sequel we will assume that the principle of mathematical induction always refers to the firstversion.

0.11. Complete Induction

Often in inductive definitions and proofs it seems necessary to work with an inductive hypothesisthat includes not just the predecessor of a natural number, but some or all of their predecessors as




well.

Example 0.35 The definition of the following sequence is a case of precisely such a definition wherethe function F (n) is defined for all naturals as follows.

Basis. F (0) = 0

Induction step

F (n + 1) =

1 if n = 0

F (n) + F (n− 1) otherwise

This is the famous Fibonacci8 sequence.

One of the properties of the Fibonacci sequence is that the sequence converges to the “golden ratio”9.For any inductive proof of properties of the Fibonacci numbers, we would clearly need to assumethat the property holds for the two preceding numbers in the sequence.

In the following, we present a principle that assumes a stronger induction hypothesis. And hencethe principle itself seems “weaker” than the previous versions.

8named after Leonardo of Fibonacci.9one of the solutions of the equation x2 = x+ 1. It was considered an aesthetically pleasing aspect ratio for buildings in ancient Greek architecture.




Principle of Complete Induction (PCI)A property P holds for all natural numbers provided

Basis. P holds for 0.

Induction step. For an arbirary n > 0

P holds for every m, 0 ≤ m < n implies P holds for n

Example 0.36 Let F (0) = 0, F (1) = 1, F (2) = 1, . . . , F (n + 1) = F (n) + F (n − 1), . . .

be the Fibonacci sequence. Let φ be the “golden ratio” (1 +√

5)/2. We now show that the propertyF (n + 1) ≤ φn holds for all n.Proof: By the principle of complete induction on n.

Basis. For n = 0, we have F (1) = φ0 = 1.

Induction step. Assuming the property holds for all m, 0 ≤ m ≤ n − 1, for an arbitrarily chosen




n > 0, we need to prove that F (n + 1) ≤ φn.

F (n + 1) = F (n) + F (n− 1)

≤ φn−1 + φn−2 by the induction hypothesis= φn−2(φ + 1)

= φn since φ2 = φ + 1

QED

Note that the feature distinguishing the principle of mathematical induction from that of completeinduction is the induction hypothesis. It appears to be much stronger in the latter case. However,in the following example we again prove the property in example 0.36 but this time we use theprinciple of mathematical induction instead.

Example 0.37 Let P(n) denote the property

“F (n + 1) ≤ φn.”

Rather than prove the original statement “For all n, P(n)” we instead consider the property Q(n)

which we define as




“For every m, 0 ≤ m ≤ n, P(m).”

and prove the statement “For all n, Q(n)”. This property can now be proved by mathematicalinduction as follows. The reader is encouraged to study the following proof carefully.Proof: By the principle of mathematical induction on n.

Basis. For n = 0, we have F (1) = φ0 = 1.

Induction step. Assuming the property Q(n− 1), holds for an arbitrarily chosen n > 0, we need toprove the property Q for n. But for this it suffices to prove the property P for n, since Q(n) isequivalent to the conjunction of Q(n− 1) and P(n). Hence we prove the property P(n).

F (n + 1) = F (n) + F (n− 1)

≤ φn−1 + φn−2 by the induction hypothesis= φn−2(φ + 1)

= φn since φ2 = φ + 1

QED

The above example shows quite clearly that the induction hypothesis used in any application ofcomplete induction though seemingly stronger, can also lead to the proof of seemingly stronger




properties. But in fact, in the end the proofs are almost identical. These proofs lead us then naturallyinto the next theorem.

Theorem 0.38 The two principles of mathematical induction are equivalent. In other words, everyapplication of PMI may be transformed into an application of PCI and vice-versa.

Proof: We need to prove the following two claims.

1. Any proof of a property using the principle of mathematical induction, is also a proof of thesame property using the principle of complete induction. This is so because the only possiblechange in the nature of two proofs could be because they use different induction hypotheses.Since the proof by mathematical induction uses a fairly weak assumption which is sufficient toprove the property, strengthening it in any way does not need to change the rest of the proof ofthe induction step.

2. For every proof of a property using the principle of complete induction, there exists a corre-sponding proof of the same property using the principle of mathematical induction. To provethis claim we resort to the same trick employed in example 0.36. We merely replace each oc-currence of the original property in the form P(n) by Q(n), where the property Q is definedas




“For every m, 0 ≤ m ≤ n, P(m).”

Since Q(0) is the same as P(0) there is no other change in the basis step of the proof. In theoriginal proof by complete induction the induction hypothesis would have read

For arbitrarily chosen n > 0, for all m, 0 ≤ m ≤ n− 1, P(m)

whereas in the new proof by mathematical induction the induction hypothesis would read

For arbitrarily chosen n > 0, Q(n− 1)

Clearly the two induction hypotheses are logically equivalent. Hence the rest of the proof ofthe induction step would suffer no other change. The basis step and the induction step wouldtogether constitute a proof by mathematical induction of the property Q for all naturals n. SinceQ(n) logically implies P(n) it follows that the proof of property P for all naturals has been doneby mathematical induction.

QED

The natural numbers are themselves defined as the smallest set N such that 0 ∈ N and whenevern ∈ N, n+ 1 also belongs to N. Therefore we may state yet another version of PMI from which theother versions previously stated may be derived. The intuition behind this version is that a property




P may also be considered as defining a set S = x | x satisfies property P. Therefore if a propertyP is true for all natural numbers the set defined by the property must be the set of natural numbers.This gives us the last version of the principle of mathematical induction.

Principle of Mathematical Induction – Version 0A set S = N provided

Basis. 0 ∈ S, and

Induction step. For arbitrarily chosen n > 0,n− 1 ∈ S implies n ∈ S.

We end this section with an example of the use of induction to prove that for any n ∈ N, the set ofall n-tuples of natural numbers is only countably infinite.

Example 0.39 Assume there exists a 1-1 correspondence f2 : N2 → N. Use this fact to prove byinduction on n, that there exists a 1-1 correspondence fn : Nn → N, for all n ≥ 2.




Solution. In general, to prove that a given function F : A → B is a 1-1 correspondence, we mayprove it by contradiction. Then there are 2 cases to consider.

1. F is non-injective. Then there exist elements a, a′ ∈ A, such that a 6= a′ and F (a) = F (a′).

2. F is non-surjective. Then there exists an element b ∈ B such that F (a) 6= b, for any a ∈ A.

It is also easy to show that if F : A→ B and G : B → C are both bijections then their compositionG F : A→ C is also a bijection.

We now proceed to prove by induction on n.

Basis. For n = 2 it is given that f2 is a bijection.

Induction step. Assume the induction hypothesis,

For some n ≥ 2 there exists a bijection fn : Nn → N.

We need to prove that there exists a 1-1 correspondence (bijection) between Nn+1 and N. We provethis by constructing a function fn+1 : Nn+1 → N.




Let g : Nn+1 → (Nn × N) be the function defined by

g(x1, . . . , xn, xn+1) = ((x1, . . . , xn), xn+1)

Claim: g is a 1-1 correspondence. The proof is trivial.

Let h : Nn+1 → (N× N) be defined by

h(x1, . . . , xn, xn+1) = (fn(x1, . . . , xn), xn+1)

Claim: h is a 1-1 correspondence. It can be proved from the fact that fn : Nn → N is a bijection.

Since f2 is also a bijection, it follows that the composition of h and f2, viz. f2 h : Nn+1 → N isalso a bijection. Hence fn+1 defined as fn+1

df= f2 h, i.e.

fn+1((x1, . . . , xn, xn+1)) = f2(h((x1, . . . , xn, xn+1)))

is a bijection.

Note.

1. Many people assume automatically that Nn+1 = Nn ×N or Nn+1 = N×Nn. But while it is true




that there exists a bijection between Nn+1 and Nn×N, they are not equal as sets. Hence we havedefined the function g though it is not really necessary for the proof.

2. Very often countability is assumed by people and they try to argue that since the sets are count-able there should be a bijection. But it should be clear that estabilishing a bijection is necessaryfirst to prove that the required sets are countable. In fact the aim of this problem is to constructa bijection to prove that the sets Nn are all countable.

0.12. Structural Induction

In many cases such as the syntactic definitions of programming languages, their semantics and theconstruction of “recursive” data types in languages such as ML and Java, it is helpful to considera form of induction called structural induction. This form of induction enables us to prove fairlygeneral properties about the datatypes so constructed and is a convenient tool for defining functionsand proving properties about data types and programs.

Definition 0.40 Let U be a set called the Universe, B a nonempty subset of U called the basis, andlet K called the constructor set be a nonempty set of functions, such that each function f ∈ K hasassociated with it a unique natural number n ≥ 0 called its arity and denoted by α(f ). If a function




f has an arity of n ≥ 0, then f : Un −→ U. Let X be the family of subsets of U such that eachX ∈X satisfies the following conditions.

Basis. B ⊆ X

Induction step. if f ∈ K is of arity n ≥ 0, a1, . . . , an ∈ X then f (a1, . . . , an) ∈ X

A set A is said to be inductively defined from B, K, U if it is the smallest set (under the subsetordering ⊆) satisfying the above conditions i.e.

A =⋂X∈X

X (1)

The set A is also said to have been generated from the basis B and the rules of generation f ∈ K.

As in the other induction principles the underlined portion is the induction hypothesis. It may notbe absolutely clear whether A defined as in (1) satisfies the two conditions of definition 0.40.

Lemma 0.41 A ∈X where A and X are as in definition 0.40.

Proof: We need to show that A satisfies the two conditions that each X ∈ X satisfies. It is easyto see that since since B ⊆ X for each X ∈ X , we have B ⊆ A and hence A does satisfy the




basis condition. As for the induction step, consider any f ∈ K and elements a1, . . . , aα(f) ∈ A.By equation (1) we have a1, . . . , aα(f) ∈ X for every X ∈ X . Therefore f (a1, . . . , aα(f)) ∈ X forevery X ∈X which implies f (a1, . . . , aα(f)) ∈ A. Hence A ∈X . QED

We may also think ofA as the smallest set (under the subset ordering) which satsfies the set equation

X = B ∪⋃f (Xn) | f ∈ K,α(f ) = n ≥ 0, X ⊆ U (2)

in the unknown X , where f (Xn) = a | a = f (a1, . . . , an), for some (a1, . . . , an) ∈ Xn. Weshow in lemma 0.44 that such equations may be solved for their unique smallest solution.

Definition 0.42 Let U, B, K be as in definition 0.40. Then a sequence [a1, . . . , am] of elements ofU is called a construction sequence for am if for all i = 1, . . . ,m either ai ∈ B or there exists aconstructor f ∈ K, of arity n > 0, and 0 < i1, . . . , in < i such that f (ai1, . . . , ain) = ai.ai is said to directly depend on each of the elements ai1, . . . , ain (denoted aij ≺1 ai for eachj ∈ 1, . . . , n). ai depends on aj, denoted aj ≺ ai if either aj ≺1 ai or there exists some i′ suchthat aj ≺ ai′ and ai′ ≺1 ai.

A contains exactly all those elements of U which have a construction sequence. The basis alongwith the constructor functions are said to define the terms generated by the rules of construction ofdefinition 0.40.




Example 0.43 Consider the following definition of a subclass of arithmetic expressions, called am-expressions generated only from natural numbers and the addition and multiplication operations.The rules may be expressed in English as follows.

Basis Every natural number is an am-expression.

Induction step

addition If e and e′ are am-expressions then add(e, e′) is an am-expression.

multiplication If e and e′ are am-expressions then mult(e, e′) is an am-expression.

Initiality Only strings that are obtained by a finite number of applications of the above rules aream-expressions (nothing else is an am- expression).

In the above definition of am-expressions N is the basis, K = add,mult is the set of constructorseach of arity 2 and the universe U consists of all possible finite sequences of symbols drawn fromthe natural numbers and applications of the constructors. The smallest set generated from finitesequences of applications of the basis and induction steps is the set of am-expressions involving onlythe naturals and the 2-ary constructors add and mult such that every application of a constructorhas exactly two operands each of which in turn is either a natural number or constructed in a




similar fashion. “0”, “add(0, 1)”, “mult(add(1, 0),mult(1, 1))” are all am-expressions. On theother hand,

1. “add(0)” is not an am-expression since the arity of add is 2,

2. “0, 1” is not an am-expression since it is not a natural number (it is actually a sequence of twonatural numbers) ,

3. “mult(∞, 0)” is not an am-expression since∞ is not a natural.

The am-expression “mult(add(1, 0),mult(1, 1))” has the following possible construction sequences.

1. [1, 0, add(1, 0), 1, 1,mult(1, 1),mult(add(1, 0),mult(1, 1))]

2. [1, 0, add(1, 0),mult(1, 1),mult(add(1, 0),mult(1, 1))] where replications of am-expressions havebeen omitted.

3. [1, 0,mult(1, 1), add(1, 0),mult(add(1, 0),mult(1, 1))] since it does not matter in which orderthe two operands of the last element in the sequence occur in the construction sequence as longas they precede the final am-expression.




A convenient shorthand notation called the Backus-Naur Form (BNF) is usually employed to expressthe rules of generation . For the set of am-expressions defined above the BNF is as follows.

e, e′ ::= n ∈ N | add(e, e′) |mult(e, e′)

It is possible to relate the notions of dependence in a construction sequence to the constructionprocess by partially ordering the process of construction of elements in an inductively generated setas follows. Let B, K and U be as in definition 0.40. Consider the infinite sequence of sets

[A0 ⊆ A1 ⊆ A2 ⊆ A3 ⊆ · · · ] (3)

defined by mathematical induction as A0 = B and for each i ≥ 0, Ai+1 = Ai ∪ f (a1, . . . , aα(f)) |f ∈ K, a1, . . . , aα(f) ∈ Ai. Now consider the set

A∞ =⋃i≥0

Ai (4)

The following lemma shows that A∞ = A and is indeed the smallest solution to the equation (2).Moreover (3) gives a construction of the smallest solution by mathematical induction.

Lemma 0.44 Let B, K and U be as in definition 0.40 and A∞ be as in equation (4). Then A = A∞.




Proof: By definition (4) Ai ⊆ A∞ for each i ∈ N

A0 ⊆ A1 ⊆ A2 ⊆ A3 ⊆ · · · ⊆ A∞

We structure the proof in the form of two claims. Firstly we show that A∞ ∈ X i.e. A∞ is a setthat satsfies the conditions in definition 0.40. Secondly, we prove that A∞ ⊆ A. It follows then thatA = A∞ since A ⊆ X for each X ∈X .

Claim. A∞ ∈X .

Proof of claim. Since B = A0 ⊆ A∞, A∞ does satisfy the basis condition. Consider any f ∈ Kand elements a1, . . . , aα(f) ∈ A∞. By the definition of A∞, there exist indices i1, . . . , iα(f) ∈ N suchthat a1 ∈ Ai1, . . . , aα(f) ∈ Aiα(f)

. Let i ≥ max(i1, . . . , iα(f)). It is clear from the definition of Ai

that a1, . . . , aα(f) ∈ Ai and hence f (a1, . . . , aα(f)) ∈ Ai+1 ⊆ A∞. Hence A∞ satisfies the secondcondition too and therefore A∞ ∈X . a

Claim. A∞ ⊆ A.

Proof of claim. We prove by mathematical induction on indices i, that Ai ⊆ A for every indexi. The basis is trivial since we know that A0 = B ⊆ A. For the induction step assume for somek ≥ 0, Ak ⊆ A. We have Ak+1 = Ak ∪ f (a1, . . . , aα(f)) | f ∈ K, a1, . . . , aα(f) ∈ Ak. For anya ∈ Ak we already know by the induction hypothesis that a ∈ A. Any a ∈ Ak+1 − Ak is then of




the form a = f (a1, . . . , aα(f)) where a1, . . . , aα(f) ∈ Ak. Again by the induction hypothesis we havea1, . . . , aα(f) ∈ A and since A ∈X , a = f (a1, . . . , aα(f)) ∈ A. Hence Ak+1 ⊆ A. a QED

Lemma 0.44 and its proof, in addition to showing us how to obtain least solutions to equations of theform (2) also show the relationship to the principle of mathematical induction. Further the lemmagives us a way to quantify dependence of elements in a construction sequence by assigning eachelement an order number.

Definition 0.45 The height of any element a (denoted4(a)) in an inductively generated set A is theleast index i in the monotonic sequence (3) such that a ∈ Ai.

In any construction sequence [a1, . . . , am], ai ≺ aj only if the height of ai is less than the height ofaj.

Example 0.46 Let S ⊆ N be the smallest set of numbers defined by

n ::= 0 | n + 1

Clearly this defines the smallest set containing 0 and closed under the successor operation on thenaturals. It follows that S = N. Notice that the above BNF is merely a rewording of the principle ofmathematical induction version 0.




Example 0.46 shows that mathematical induction is merely a particular case of structural induction.

Example 0.47 The language of minimal logic was defined in example 0.26. We may redefine thelanguage by the following BNF

µ, ν ::= a ∈ A | (¬µ) | (µ→ ν)

We prove that the language is countable.

Proof: We first begin by classifying the formulas of the language according to their depth. Let Mk

be the set of formulas of the language such that each formula has a depth at most k for k ≥ 0. We

assume that M0 = A and Mk+1 = Mk ∪ (¬µk), (µk → νk) | µk, νk ∈ Mk. Let¬Mk be the set of

all formulas of depth k + 1 of the form “(¬µk)” and let→Mk be the set of all formulas of the form

“(µk → νk)”, where µk, νk ∈Mk. We then have

Mk+1 = Mk∪¬Mk ∪

→Mk

= Mk ∪ (¬Mk −Mk) ∪ (

→Mk −Mk)

= Mk∪¬

Nk+1 ∪→

Nk+1




Here Nk+1 =¬

Nk+1 ∪→

Nk+1 represents the set of all formulas of depth exactly k + 1.¬

Nk+1 consists

of exactly those formulas of depth k + 1 whose root operator is ¬. Simlarly→

Nk+1 represents the setof all formulas of depth exactly k + 1, whose root operator is→. Hence the three sets are mutuallydisjoint.

Mk∩¬

Nk+1 = ∅, Mk∩→

Nk+1 = ∅,¬

Nk+1 ∩→

Nk+1 = ∅

The entire language may then be defined as the setM0 =⋃k≥0Mk = A ∪

⋃k>0

¬Nk ∪

⋃k>0

→Nk

Claim. Each of the sets Mk,¬

Nk+1 and→

Nk+1 is countably infinite for all k ≥ 0.Proof of claim. We prove this claim by induction on k. The basis is M0 = A and it is given that it iscountably infinite. The induction step proceeds as follows. We have by the induction hypothesis thatMk is countably infinite. Hence there is a bijection numk : Mk

1-1−−→onto

N. We use numk to construct

the 1 − 1 correspondence numk+1 as follows: We may use numk to define a bijection between¬Nk

and N. Similarly there exists a 1-1 correspondence between→

Nk+1 and N × N given by the ordered

pair of numbers (numk(µk), numk(νk)) for each (µk → νk) ∈→

Nk+1. But we know that there is a 1-1correspondence diag : N× N 1-1−−→

ontoN.




Hence each of the 3 sets Mk,¬

Nk+1 and→

Nk+1 is countably infinite. Their union is clearly countablyinfinite by the following 1-1 correspondence.

numk+1(µk+1) =

3× numk(µk+1) if µk+1 ∈Mk

3× numk(µk) + 1 if µk+1 ≡ ¬µk ∈¬

Nk+1

3× diag(numk(µk), numk(νk)) + 2 if µk+1 ≡ µk → νk ∈→Nk

Hence Mk+1 is countably infinite.

Having proved the claim it follows (from the fact that a countable union of countably infinite setsyields a countably infinite set) that M the language of minimal logic is a countably infinite set. QED

We present below a generalization of the principle of mathematical induction to arbitrary inductivelydefined sets. It provides us a way of reasoning about the properties of structures that are inductivelydefined.

Theorem. The Principle of Structural Induction (PSI). Let A ⊆ U be inductively defined by thebasis B and the constructor set K.




Principle of Structural Induction (PSI)A property P holds for all elements ofA provided

Basis. P is true for all basis elements.Induction step. For each f ∈ K, P holds for elements a1, . . . , aα(f) ∈ A implies P

holds for f (a1, . . . , aα(f)).

Proof: Let P be a property of the elements of U that satisfies the conditions above. Let C be theset of all elements of A that satisfy the property P, i.e. C = a ∈ A | P holds for a. It is clearthat B ⊆ C ⊆ A. To show that C = A it suffices to prove that A − C = ∅. Consider the sequenceof sets defined in (3) and the set A =

⋃i≥0

Ai as given in equation (4). We prove that for all i ≥ 0,

Ai ⊆ C by assuming that there is a smallest i ≥ 0 such that Ai 6⊆ C. Since A0 = B ⊆ C, Ai 6⊆ C

implies i > 0. Consider the smallest i > 0 such that Ai 6⊆ C. There exists a ∈ Ai − Ai−1 whichdoes not satisfy the property P. Since a 6∈ Ai−1, we have a = f (a1, . . . , aα(f)) for some f ∈ K suchthat a1, . . . , aα(f) ∈ Ai−1. Further by assumption Ai−1 ⊆ C and hence property P holds for eachof a1, . . . , aα(f). This implies by the induction step that P holds for a, contradicting the assumption




that a does not satisfy P. Hence there is no smallest i such that Ai 6⊆ C. Therefore for all i ≥ 0,Ai ⊆ C and hence A ⊆ C from which it follows that P holds for every element of A. QED

Example 0.48 Consider the following BNF of arithmetic expressions

e, e′ ::= n ∈ N | (e + e) | (e× e′) | (e− e′)

Given a string w of symbols, a string u is called a prefix of w if there is a string v such that w = u.v , where ’.’ denotes the (con)catenation) operation on strings. Clearly the empty string ε is a prefixof every string and every string is a prefix of itself. u is called a proper prefix of w if v is a nonemptystring.

Let e be any expression generated by the above BNF and let e′ be a prefix10 of e. Further, let L(e′)

and R(e′) denote respectively the numbers of left and right parentheses in e′. Let P be the propertyof strings e in the language of arithmetic expressions given by

For every prefix e′ of e, L(e′) ≥ R(e′).

We use the principle of structural induction (theorem 0.12) to prove that property P holds for all10e′ may or may not be an expression of the language.




expressions in the language.

Basis. It holds for all n ∈ N because no natural number has any parentheses.

Induction Hypothesis (IH).

For any e of the form “(f g)”, where ∈ +, x,− and f , g are themselves expressions inthe language

1. For every prefix f ′ of f , L(f ′) ≥ R(f ′) and

2. For every prefix g′ of g, L(g′) ≥ R(g′)

Induction Step. We do a case analysis of all the possible prefixes of e.

• Case e′ = ε. L(e′) = R(e′).

• Case e′ = “(“. L(e′) = 1 > 0 = R(e′).

• Case e′ = “(f ′“. By the induction hypothesis we have L(f ′) ≥ R(f ′) and hence L(e′) =

1 + L(f ′) > R(f ′) = R(e′).1




• Case e′ = “(f ′ “. By the induction hypothesis we have L(f ′) ≥ R(f ′) and hence L(e′) =

1 + L(f ′) > R(f ′) = R(e′).• Case e′ = “(f g′“. By the induction hypothesis we have L(f ) ≥ R(f ) and L(g′) ≥ R(g′).

Hence L(e′) = 1 + L(f ) + L(g′) > R(f ) + R(g′) = R(e′).• Case e′ = “(f g“. By the induction hypothesis we have L(f ) ≥ R(f ) and L(g) ≥ R(g).

Hence L(e′) = 1 + L(f ) + L(g) > R(f ) + R(g) = R(e′).• Case e′ = e = “(f g)′′. By the induction hypothesis we have L(f ) ≥ R(f ) and L(g) ≥R(g). Hence L(e′) = L(e) = 1 + L(f ) + L(g) ≥ R(f ) + R(g) + 1 = R(e′) = R(e).

We leave it as an exercise for the reader to prove that every proof by the principle of mathematicalinduction may also be translated into a proof by the principle of structural induction (see example0.46 and the equivalences between the various versions of the principle of mathematical inductionand complete induction). We are now ready to show that even though structural induction seems tobe more general than mathematical induction they are in fact equivalent in power. In other words,every proof by the principle of structural induction may also rewritten as a proof by (some version)of the principle of mathematical induction or complete induction. To do this we need the height (seedefinition 0.45) of each element in an inductively defined set.




Theorem 0.49 Every proof using the principle of structural induction (PSI) may be replaced by aproof using the principle of complete induction (PCI).

Proof: Let A be inductively defined by B, K, U and let P be a property of elements of A that hasbeen proved by the principle of structural induction. Let the property Q(n) for each natural numbern be defined as

The property P holds for all elements of height n

Basis. The basis step n = 0 of the proof of property Q proceeds exactly as the basis step of the proofby PSI with as many cases are required in the proof of by PSI.

The induction hypothesis. The induction hypothesis is the assumption that Q(m) holds for all 0 ≤m < n.

The induction step The induction step is simply that if the induction hypothesis holds for allm < n

then Q holds for n. The proof by PSI for each constructor in the induction step is a case in theinduction step of the proof Q(n).

QED




Definition 0.50 Let A be inductively defined by B, K, and U and let V be any arbitrary set. Thenh : A −→ V is said to be an inductively defined function if h(b) = h0(b) and h(f (a1, . . . , aα(f))) =

hf(h(a1), . . . , h(aα(f))) where h0 : B −→ V is a function and for every n-ary constructor f ∈ K,there is an n-ary function hf : V n −→ V . A relation R ⊆ A× V is said to be inductively definedif

R = (b, h0(b)) | b ∈ B ∪ (f (a1, . . . , aα(f)), hf(v1, . . . , vα(f))) | a1Rv1, . . . , aα(f)Rvα(f)

Example 0.51 Consider the language P0 of propositional logic, where the basis is a countable setA of atoms, and the language is defined by the BNF

φ, ψ ::= p ∈ A | (¬φ) | (φ ∨ ψ) | (φ ∧ ψ)

Further let B = 〈0, 1, ,+, .〉 be the algebraic system whose operations are defined as follows.

0 = 1 and 1 = 0

1 + 1 = 0 + 1 = 1 + 0 = 1 and 0 + 0 = 0

0× 0 = 0× 1 = 1× 0 = 0 and 1× 1 = 1

Let τ0 : A −→ 0, 1 be a truth value assignment for the propositional atoms. Then the truth valuesof propositional formulas in P0 under τ , is the inductively defined function T : P0 −→ 0, 1




where τ− = , τ∨ = + and τ∧ = ×. Therefore T extends τ0 to P0 as follows.

T [p] = τ0(p) p ∈ AT [(¬φ)] = T [φ]

T [(φ ∨ ψ)] = T [φ] + T [ψ]

T [(φ ∧ ψ)] = T [φ]× T [ψ]

0.13. Simultaneous Induction

One question that naturally arises is that if induction is merely equation solving, then are thereproblems or properties which require us to solve a system of simultaneous equations? We answerthis question with the following example.

Example 0.52 Consider the following BNF which consists of the generation of three different setsof bit strings (i.e. sequences of 0s and 1s) which are all mutually dependent.

z ::= 0 | z0 | i1i ::= 1 | t0 | z1

t ::= i0 | t1




In effect this BNF defines a system of three (simultaneous) equations.

Z = 0 ∪ z0 ∈ 2+ | z ∈ Z ∪ i1 ∈ 2+ | i ∈ II = 1 ∪ t0 ∈ 2+ | t ∈ T ∪ z1 ∈ 2+ | z ∈ ZT = i0 ∈ 2+ | i ∈ I ∪ t1 ∈ 2+ | t ∈ T

For any bit string s let [s] denote the non-negative integer j such that s is a binary representation ofj (there could be leading zeroes). Suppose we need to prove the following property PZ of the set Z.

PZ : Z = z ∈ 2+ | [z] is a non-negative multiple of 3

The intuition which guides the BNF above (or equivalently the above system of equations) is thefollowing.

• Each z ∈ Z is such that [z] = 3l for some l ∈ N i.e. [z] is a multiple of 3. In particular, thebit-string 0 ∈ Z. The other sets in the definition of Z are constructed so that if [z] = 3l, for somel ∈ N, then [z0] = 6l is also a multiple of 3 and for each i ∈ I , if [i] = 3m + 1 for some m ∈ Nthen i1 ∈ Z since [i1] = 2(3m + 1) + 1 = 3(2m + 1) is a multiple of 3.

• Likewise the definitions of I and T are such that for each i ∈ I , [i] = 3m + 1 for some naturalm and for each t ∈ T , [t] = 3n + 2 for some natural n.




The same intuition also guides the proof of property PZ . However, the proof is dependent uponproperties PI for the set I and PT for T where

PI : I = i ∈ 2+ | [i] = 3m + 1,m ∈ N

and

PT : T = t ∈ 2+ | [t] = 3n + 2, n ∈ N

Hence an inductive proof of property PZ requires a simultaneous proof of properties PI and PT .

Lemma 0.53 The elements of the sets Z, I and T satisfy properties PZ , PI and PT respectively.

Proof: We proceed by simultaneous induction on the lengths of strings of 2+.

Basis. There are exactly two bit strings of length 1 viz, “0” and “1” and it follows easily that 0 ∈ Zand 1 ∈ I .





For some k > 0,

0. each z′ ∈ Z such that |z′| = k satisfies property PZ ,

1. each i′ ∈ I such that |i′| = k satisfies property PI , and

2. each t′ ∈ T such that |t′| = k satisfies property PT ,

Induction Step. Consider strings of length k + 1. We need to show that

0. Property PZ holds for z = z′0 and z = i′1, where |z′| = |i′| = k,1. Property PI holds for i = t′0 and i = z′1, where |t′| = |z′| = k,2. Property PT holds for t = i′0 and t = t′1, where |i′| = |t′| = k,

We leave the rest of the proof to be completed by the interested reader.

QED




0.14. Well-ordered Induction

We have already seen in definition 0.10 that a linear or total order is a connected partial order. Wenow use this and definition 0.16 in defining a well-ordered set.

Definition 0.54 A well-ordering of a set is a well-founded total ordering of the set. A well-orderedset is sometimes called a woset.

Fact 0.55 Let 〈A,≤〉 be a (nonempty) well-ordered set.

1. Every nonempty subset of A has a unique least element.

2. A has a unique least element.

Well-ordered induction generalises the principle of mathematical induction to any set A which maybe enumerated as a sequence [ai | i ∈ N] indexed by the non-negative integers. There is an implicittotal ordering in the elements such that ai ≤ aj if and only if i ≤ j in the integers.Hence anyproperty P (ai) of elements of the set may be regarded as a property Q(i) of the index of the elementin the set. What makes induction applicable here is the fact that the set N is “well-ordered” (seedefnition 0.54) by the relation ≤. Hence were a property P to fail for some element aj in the set A,




there would be a least element ai ≤ aj for which it would fail and which in turn translates to theproperty Q(j) and Q(i) failing in such a fashion that for all (if any) i′ < i, Q(i′) would hold.

Theorem 0.56 The Principle of Well-ordered Induction. Let 〈A,≤〉 be a well-ordered set. LetX ⊆ A such that

Basis . The (unique) least element of A is in X and

Induction Step . For each a ∈ A, for every a′ ∈ A, a′ < a implies a′ ∈ X ,

Then X = A

Proof: Suppose X 6= A. Then B = A −X 6= ∅. Since B ⊆ A, and A is totally ordered, B has aunique smallest element b ∈ B. But b 6∈ X , even though for every element a < b, a ∈ X . But bythe induction step b ∈ X , which is a contradiction. QED

Note that in the statement of the above theorem, the basis is included (vacuously) in the inductionstep. Hence the basis statement in this principle is actually superfluous.

Exercise 0.2

1. Prove that version 1, 2 and 3 of PMI are mutually equivalent.




2. Prove that 2+ = Z ∪ I ∪ T in example 0.52.

3. Find the fallacy in the following proof by PMI. Rectify it and again prove using PMI.Theorem:

1

1 ∗ 2+

1

2 ∗ 3+ . . . +

1

(n− 1)n=

3

2∗ 1

nProof: For n=1 the LHS is 1/2 and so is RHS. assume the theorem is true for some n > 1. Wethen prove the induction step.

LHS =1

1 ∗ 2+

1

2 ∗ 3+

1

(n− 1)n+ . . . +

1

n(n + 1)

=3

2− 1

n+

1

n(n + 1)

=3

2− 1

n+

1

n− 1

(n + 1)

=3

2− 1

n(n + 1)which is required to be proved.

4. Let A be any set with a reflexive and transitive binary relation defined on it. That is to say,⊆ A x A satisfies the following conditions.




(a) For every a ∈ A, a a.

(b) For all a, b, c in A, a b and b c implies a c.

Then show by induction that ;R =n;R for all n ≥ 1.

5. Show using PSI that T is well defined.

6. The language of numerals in the arabic notation may be defined by the following simple gram-mard ::= 0|1|2|3|4|5|6|7|8|9n ::= d|ndDefine the value of a string in this language, so that it conforms to the normally accepted mean-ing of a numeral

7. For the language of propositions we may also call the function τ a state, and look upon thefunction T as defining the truth value of a proposition in a given state τ . Now redefine themeaning of a proposition as the set of states in which the proposition has a truth value of 1. if Σ

denotes the set of all possible states i.e. Σ = τ |τ : B → 0, 1 then

(a) What is the domain and the range of the function ϕ?

(b) Define the function ϕ by structural induction.




(c) Prove using the principle of structural induction that for any proposition p, and a state τ ,T (p) = 1 if and only if tau belongs to the ϕ-meaning of p.

8. Let A be a language ,inductively defined by B, K, U. Define the set of syntax tree, TA of theelements of A as follows:

(a) For each b ∈ B, there is a single node labelled b,(b) For each n-ary operator and a1, . . . , an, a ∈ A, if(a1, . . . , an) = a, the syntax tree t of a

is a tree with root labelled by and t1, . . . , tn as the immediate subtree of t, where t1, . . . , tnare the syntax trees corresponding to a1, . . . , an respectively.

(a) Prove that every element of A has a unique syntax tree if A is free.(b) Give an example to show that every syntax tree need not define a unique element of A.

9. Let L0 be the language of propositional logic as defined in the last example. Then intuitivelyspeaking, a propositional formula p is a subformula of a propositional formula q if the syntaxtree of p is a subtree of the syntax tree of q.

(a) Define the notion of subformula inductively.(b) Let of every formula q, SF(q) denote the set of all subformulas of q. Define SF(q) inductively.(c) Let p q if and only if p is a subformula of q. Prove that is a partial order on L0.







1. Lecture 1: Introduction

Lecture 1: IntroductionTuesday 26 July 2011




1. What is Logic?

2. Reasoning, Truth and Validity

3. Examples

4. Objectivity in Logic

5. Formal Logic

6. Formal Logic: Applications

7. Form and Content

8. Facets of Mathematical Logic

9. Logic and Computer Science




What is Logic?• Logic is about reasoning:

– validity of arguments– consistency among sets of statements– matters of truth and falsehood

• Logic is concerned only about the form of reasoning and notabout the content




Reasoning, Truth and Validity•Reasoning is sound only if it is impossible to draw false con-

clusions from true premises• Sound reasoning can however produce false conclusions

from false premises.• Sound reasoning can also produce true conclusions from

false premises.




ExamplesExample 1.1All humans live forever.Socrates is human.Hence Socrates lives forever.

Example 1.2All humans are born with opposable toesBy adulthood opposable toes become non-opposableJohn is an adult humanHence John has no opposable toes.




Objectivity in Logic• The traditional logic of Aristotle and Leibniz are essentially

philosophical in nature with its main purpose being to inves-tigate the objective laws of thought.•Objectivity implies essentially that arguments must be com-

municable to and verifiable by other people.•Objectivity has always implied formalizability.




The colours and shades of Logic.The name mathematical logic may be interpreted in two ways. We may interpret logic as a subjecttreated using mathematical methods. We may just as well, think of it as a subject which formalisesthe methods of reasoning used in mathematics. This leads us apparently to a paradox. How canlogic be treated mathematically without using logic itself?

We resolve this paradox as follows. We simply separate out the logic that we are studying byexpressing it formally through a language – the object language or the target language – from thelogic that is used in reasoning about it. The latter logic that is used for reasoning is called themeta-language.

We will define the object language formally via a grammar (and for good measure we also colour-code the sentences of the object language in green. The meta-language however, is the usual “lan-guage of mathematics” – a mixture of natural language with mathematical symbols that is normallyused in mathematical texts. The words of these sentences will usually appear in black and sometimes in other colours such as blue (e.g for emphasis) or red (e.g. when we want some concept oridea to stand out). However since the objects of our study are green, the objects when appearing inthese sentences will still be green.

When treating any object language formally, it also becomes necessary to specify the meanings




of phrases, expressions and sentences in the formal language. Since we will be expressing thesemeanings through objects in mathematical structures, these objects will be coloured brown.

The study of logic promises to be pretty colourful, what?




Formal Logic• Logic as a formal language with a syntax to which a semantics

had to be attached.• In turn using mathematical methods within logic, led to its

formalization as mathematical logic• The strict separation of syntax from semantics made logic

a clean and elegant mathematical discipline which could bethen applied to the foundational questions of mathematics.




Formal Logic: Applications•Of great use in analysing questions concerning the founda-

tions of mathematics• In clarifying reasoning mechanisms within mathematics.• Identifying various occurrences of circular reasoning which

were previously very hard to identify.




Form and Content• The separation of syntax and semantics also led to the sep-

aration of form and content and• allowed the formalization of correct methods of reasoning

purely in terms of the syntax.• allowed the possibility of plugging in a semantics as de-

manded by an application whenever it was necessary• allowed the possibility of mechanizing reasoning completely.




Facets of Mathematical Logic1. A formalization language for mathematics,2. A calculus clarifying the notion of a mathematical proof3. mechanization of proof4. A sub-discipline of mathematics itself5. A mathmatical tool applicable to various branches of mathe-

matics.




Logic and Computer Science1. Mechanization of reasoning within a formalized syntactic

setup.2. Applications to program and system specification and verifi-

cation.3. As a declarative programming language4. Proving theorems within areas of mathematics where the

number of cases is too high or the details of proof are toolong and tedious.










2. Lecture 2: Propositional Logic Syntax

Lecture 2: Propositional Logic SyntaxWednesday 27 July 2011




1. Truth and Falsehood: 1

2. Truth and Falsehood: 2

3. Extending the Boolean Algebra

4. Table of Truth & Falsehood

5. Sums & Products

6. Propositional Logic: Syntax

7. Propositional Logic: Syntax - 2

8. Natural Language equivalents

9. Some Remarks

10. Associativity and Precedence

11. Syntactic Identity

12. Abstract Syntax Trees

13. Subformulae

14. Atoms in a Formula

15. Degree of a Formula

16. Size of a Formula

17. Height of a Formula




Truth and Falsehood: 1Our notion of meaning is restricted to absolute “truth” and ab-solute “falsehood” (with nothing else in between) of statements.

Definition 2.1 Let B = 〈2, , .,+, ≤,=〉 be the 2-elementboolean algebra where 2 = 0, 1.•We use 1 to denote absolute “truth” and 0 to denote absolute

“falsehood” and• the boolean operations have their usual meaning in the

booleans.




Truth and Falsehood: 2In other words, for any a, b ∈ 2,• a = 1− a is the unary boolean inverse operation,• a + b = max(a, b) is the binary summation operation which

yields the maximum of two boolean values regarded as nat-ural numbers,• a.b = min(a, b) is the binary product operation which yields

the minimum of two boolean values regarded as natural num-bers,• a ≤ b, and a = b denote the usual binary relations “less-than-

or-equal-to” and “equals” on natural numbers restricted to theset 0, 1.




Extending the Boolean AlgebraWhile ≤ and = are binary relations, it is possible to define cor-responding binary operations as shown below.

Definition 2.2 For a, b ∈ 0, 1 define

a ≤. b =

1 if a ≤ b0 otherwise a

.= b =

1 if a = b0 otherwise




Table of Truth & Falsehood

a a0 11 0

a b a.b a + b a ≤. b a .= b

0 0 0 0 1 10 1 0 1 1 01 0 0 1 0 01 1 1 1 1 1




Boolean identities

Negation a = a

Comparison a ≤. b = a + b

Equality a.

= b = (a ≤. b).(b ≤. a)

a + 0 = a Identity a.1 = a

a + 1 = 1 Zero a.0 = 0

a + a = a Idempotence a.a = a

a + b = b + a Commutativity a.b = b.a

(a + b) + c = a + (b + c) Associativity (a.b).c = a.(b.c)

a + (b.c) = (a + b).(a + c) Distributivity a.(b + c) = (a.b) + (a.c)

a + b = a.b De Morgan a.b = a + b

a + a = 1 Simplification a.a = 0

0 = 1 Inversion 1 = 0

a + (a.b) = a Absorption a.(a + b) = a




Sums & ProductsThe Associativity and Commutativity identities allow us to de-fine summations and products over sets of boolean values.∑

0<i≤n aidf= a1 + · · · + an∏

0<i≤n aidf= a1. · · · .an




2.1. Propositions in Natural Languages

We now begin our study of formal logic by studying statements in natural lanaguage. We willfrequently appeal to reasoning methods employed in sentences (actually statements) in natural lan-guages. We initially restrict our study to statements expressed in natural languages. This logic isoften called propositional or sentential logic.

In general sentences in any natural language may be classified into the following forms (usuallyindicated by the mood of the sentence. A standard text book on English grammar defines a moodas the mode or manner in which the action denoted by the verb is represented. It further illustratesthree moods in the English language. English Grammar 101 classifies the verbs in the Englishlanguage into four moods (including the infnitive as a mood. But to get really confounded andconfused, the reader needs to refer to Wikipedia where several moods are defined (e.g. optative,jussive, potential, inferential).

The following example sentences are taken from the references above.

Indicative Mood: expresses an assertion, denial, or question.


http://www.dailywritingtips.com/english-grammar-101-verb-mood/

http://en.wikipedia.org/wiki/Grammatical_mood



Little Rock is the capital of Arakansas.Ostriches cannot fly.Have you finished your homework?

Imperative Mood: expresses command, prohibition, entreaty, or advice.

Dont smoke in this building.Be careful!Have mercy upon us.Give us this day our daily bread.

Subjunctive Mood: expresses a doubt, a wish or an improbability

God bless you!I wish I knew his name.I would rather be paid by cheque.He walks as though he were drunk.

Loosely speaking natural language sentences which are assertions or denials in the indicative moodmay be considered propositions. Questions however, cannot be considered propositions. Moreaccurately only those sentences to which one might (at least theoretically) ascribe a truth value




(such as assertions and denials) may be considered to be propositions in our setting. Hence of theexamples given above the only ones which are of interest to us would be

Little Rock is the capital of Arakansas.

(which is an assertion) and

Ostriches cannot fly.

which is a denial (it denies the statement Ostriches can fly). The last indicative statement

Have you finished your homework?

is a question for which no truth value can be assigned. It is therefore not a proposition in the sensethat we understand propositions. Notice that a denial is an assertion too – it is simply the negation ofan assertion and hence is a statement to which a truth value may be assigned. We will treat denialsalso as assertions and declare that assertions in natural language are all propositions in our sense ofthe term.




The examples that we have considered above are rather simple. We could consider more examplesof assertions, denials and complex assertions which are made up of assertions and denials. Here area few.

• God is in his Heaven and all is right with the world.

• Time and tide wait for no man.

• You can fool some people some of the time, some people all of the time and all the peoplesome of the time, but you cannot fool all the people all the time.

• Anybody who becomes the Prime Minister has a clear national and international agenda.




Propositional Logic: SyntaxDefinition 2.3•A: a countably infinite collection of propositional atoms.• Ω0 = ⊥,>,¬,∧,∨,→,↔: the set of operators (also called

connectives) disjoint from A

• Each operator has a fixed arity defined by α with– α(⊥) = α(>) = 0

– α(¬) = 1 and– α() = 2, for ∈ ∧,∨,→,↔.•A and Ω are disjoint and parentheses ( and ) of various kinds

are used for grouping.• P0 is the smallest set generated from A and Ω0.




Propositional Logic: Syntax - 2Alternatively, we may define the language by the followinggrammar.Definition 2.4

φ, ψ ::= ⊥ | > | p ∈ A | (¬φ) | (φ ∧ ψ) | (φ ∨ ψ) | (φ→ ψ) | (φ↔ ψ)

Each expression of this language is also called a well-formedformula (wff) (or sentence) of P0. Each atom is a simple formulaor sentence. Each sentence having one or more occurrencesof unary or binary operators is called a compound formula. P0is the set of all sentences.




Natural Language equivalentsThe (unary and binary) operators of the language P0 are cho-sen to denote constructs that allow the construction of com-pound statements from simple ones. The following table showsthe intended meaning of each of the operators in the Englishlanguage.

Operator Name English rendering⊥ bottom false> top true¬ negation not∧ conjunction and∨ disjunction or→ conditional if...then↔ biconditional if and only if




2.2. Translation of Natural Language Statements

We may translate natural language sentences into propositions by identifying the simplest sentencesas atoms and connecting up the simple sentences using the connectives at our disposal. Naturally,many of the subtleties of the use of the connectives in natural lanaguages would be lost in translation.For the purposes of logical reasoning, the table given in Natural Language equivalents is sufficientfor propositional reasoning, if we ignore the subtleties in natural language such as tonal variations,implied tense and judgmental implications that often come loaded with the sentences. We will havemore to say on this with respect to particular connectives in the following descriptions.

Negation (¬). This connective is used to mean not, it is not the case that and abbreviated negationprefixes such as non- and un-. If the atom A denotes the simple statement I am at work, then(¬A) denotes I am not at work. In certain cases opposites could be translated using negation.For example, if the sentence

This dish is good.

is denoted by the atom B, the sentence

This dish is bad.




and the statement

This dish is not good.

are both translated as (¬B). Going further, the sentence

This dish is not bad.

would be translated as (¬¬B). As we shall see both the statements B and (¬¬B) would beconsidered logically (though not syntactically) equivalent and the subtle difference between theexpressions good and not bad would be lost in translation.

Conjunction (∧). The usual meaning of conjunction as denoting and holds. In addition, the wordsbut, moreover, furthermore and phrases such as in addition would all be considered synonymouswith and. Notice that the sentence

Rama and Seeta went to the forest

would be considered synonymous with the sentence

Rama went to the forest and Seeta went to the forest

even though the first sentence has an implied meaning of “togetherness” or “simultaneity”.




The operator ∧ is commutative as we shall see later. Therefore for any formulae φ and ψ, φ ∧ ψwould be logically equivalent to ψ ∧ φ. Hence the implied difference between the two sentences(from [7])

Jane got married and had a baby.

and

Jane had a baby and got married.

is usually lost in translation.

Disjunction (∨). The usual meaning is an or in the inclusive sense although or is often used inEnglish in the exclusive sense. Hence φ ∨ ψ would have to be translated to mean either φ or ψor both φ and ψ.In our natural language arguments we will use or in the inclusive sense. either φ or ψ wouldrefer to an or that is used in the exclusive sense. That is either φ or ψ would mean that exactlyone of the two propositions holds and both cannot hold at the same time. Therefore the sentence

Ram or Shyam topped the class.

which could be equivalently rendered in English as a compound sentence




Ram topped the class or Shyam topped the class.

Neither of the above sentences rules out the possibility that both Ram and Shyam topped theclass, whereas

Either Ram or Shyam topped the class.

would mean that exactly one of them could have topped the class.Analogously one might ask what is the correct rendering of the the sentence

Neither Ram nor Shyam topped the class.

If r denotes the atomic statement Ram topped the class and s denotes the atomic statementShyam topped the class, then (¬r) ∧ (¬s) would be an accurate propositional rendering of thesentence.

Conditional (→). Also called the material conditional, it comes pretty close to the English condi-tional statement of the form “If φ then ψ”. Alternative translations are “ψ only if φ”, “ψ providedφ” and “ψ whenever φ”.There are other conditionals that we frequently use in English such as “ψ unless φ” which maybe rendered as “ψ if not φ” and would be represented by the formula ((¬φ)→ ψ).




Biconditional (↔) The translation “If φ then ψ, not otherwise” is reserved for the biconditional.Alternative translations for φ↔ ψ are “φ if and only if ψ”, “φ iff ψ”, “φ exactly when ψ” and“φ just in case ψ”.

The following tables adapted from [7] summarise the various English renderings of each operatorgiven arbitrary operands φ and ψ.

Operation English rendering¬φ not φ

φ does not holdIt is not the case that φ holds




Operation English renderingφ ∧ ψ φ and ψ

Both φ and ψφ but ψNot only φ but ψφ although ψφ despite ψφ yet ψφ while ψ

Operation English renderingφ ∨ ψ φ or ψ

φ or ψ or bothφ and/or ψφ unless ψφ except when ψ




Operation English renderingφ→ ψ If φ then ψ

ψ if φφ only if ψWhen φ then ψψ when φφ only when ψIn case φ then ψψ in case φψ provided φφ is a sufficient condition for ψψ is a necessary condition for φ




Operation English renderingφ↔ ψ φ if and only if ψ

φ iff ψφ if ψ and ψ if φIf φ then ψ, and converselyφ exactly if ψφ exactly when ψφ just in case ψφ is a necessary and sufficient condition for ψ




Some Remarks• It is convenient to have a syntactic symbol for “absolute truth”

and “absolute falsehood” though it is strictly not necessary.• ⊥ and > are constants and hence have no precedence asso-

ciated with them.• In a formula of the form φ→ ψ, φ is called the antecedent andψ the consequent.




2.3. Associativity and Precedence of Operators

Notice that we have defined the language to be fully parenthesized i.e. every compound formulais enclosed in a pair of parentheses ((· · · )). However it may actually be very distracting and con-fusing to read formulae with too many parentheses. We will define precedence and associativityconventions to reduce the number of parentheses while reading and writing formulae.

Associativity conventions. This convention refers to the consecutive occurrences of the same binaryoperator. Simple examples from school arithmetic are used to illustrate the conventions used indisambiguating expressions which are not fully parenthesized.

Left Associativity is the convention that an expression on numbers like 6− 3− 2 should beread as ((6− 3)− 2) (which would yield the value 1). It should not be read as (6− (3− 2))

(which would yield the value 5). Here we say that the subtraction operation associates tothe left or that subtraction is a left associative operation. Other binary operations such asaddition, multiplication and division on numbers are also left associative.

Right Associativity is the convention used to group consecutive occurrences of powers of num-bers. For example 432

is to be read as (4(32)) which would yield the result 49 = 262144. Itshould not be read as ((43)2) which would yield the result 642 = 4096. We say that exponen-




tiation is right associative.

Precedence of operators If two different binary operators occur consecutively in an expression, weneed some way to associate and group them unambiguously. This is usually specified for bi-nary operators by a precedence relation. In school arithmetic this usually takes the form of the“bodmas” rule specifying that brackets have the highest precedence followed by division andmultiplication which in turn are followed by addition and subtraction. Hence for example anexpression such as 3× 4 + 5 is to be read as ((3× 4) + 5) representing the value 17. It would bewrong to read 3× 4 + 5 as (3× (4 + 5)) (representing the value 3× 9 = 27) since multiplica-tion has a higher precedence than addition or multiplication precedes addition. This is usuallydenoted × ≺ +.Similarly the expression 3 + 4× 5 is to be read as (3 + (4× 5)) yielding the value 3 + 20 = 23

and it would be wrong to read it as ((3 + 4)× 5) (which represents the value 7× 5 = 35).It is the usual convention in mathematical texts that unless otherwise specified, unary operatorshave a higher precedence than binary operators.

Precedence of operators with equal precedence When two distinct binary operators have equalprecedence (e.g. addition and subtraction on numbers) then our convention dictates that in theabsence of parentheses, the left operator has a higher precedence in the expression than the one




on the right i.e. they should be grouped from left to right. Thus 5 + 4− 3 is to be read as((5 + 4)− 3) yielding 6 (even though reading it as (5 + (4− 3)) yields the same result). Simi-larly 5− 4 + 3 should be read as ((5− 4) + 3) (yielding the result 4) rather than as (5− (4 + 3))

(which yields −2). For example 24/4× 3 would yield 18 by our convention. While this conven-tion is well-established for left associative operators, it is not clear what the convention is whenthere are consecutive occurrences of distinct right associative operators having equal precedence.However in case of any confusion we may always put in enough parentheses to disambiguate theexpression.

Our interest in precedence, however is restricted to being able to translate an expression writtenin linear form unambiguously into its abstract syntax tree. The use of parentheses aids in un-ambiguously defining an unique abstract syntax tree. Even though we write formulae in linearform we will always implicitly assume that they represent the corresponding abstract syntax tree.




Associativity and PrecedenceOur language has been defined to be fully parenthesized.• The binary operators ∧ and ∨ are left associative i.e. a for-

mula written as φ ∧ ψ ∧ χ should be read as ((φ ∧ ψ) ∧ χ).• The binary operators → and ↔, on the other hand are right

associative i.e. a formula written as φ→ ψ → χ should beread as (φ→ (ψ → χ)).• The operator precedence convention we follow is:

↔ ≺ → ≺ ∨ ≺ ∧ ≺ ¬

i.e. ¬ has the highest precedence and↔ has the lowest.




Syntactic Identity• The precedence rules make a lot of parentheses redundant.• Instead of propositions written linearly we will rather think of

propositions as abstract syntax trees (AST).• Any two propositions φ and ψ which have the same AST will

be considered syntactically identical and denoted φ ≡ ψ.• If φ ≡ ψ then φ and ψ differ only in the presence of redundant

parentheses.




Abstract Syntax TreesAbstract syntax trees are rooted directed trees (see definition2.11)Example 2.5 The AST of the formula (¬(p ∧ ¬q) ∨ r).

∨

¬ r

∧

p ¬

q




SubformulaeDefinition 2.6 The set of subformulae of any formula φ is theset SF (φ) defined by induction on the structure of formulae asfollows:SF (⊥) = ⊥SF (>) = >SF (p) = p, for each atom pSF (¬ψ) = SF (ψ) ∪ ¬ψSF (ψ χ) = SF (ψ) ∪ SF (χ) ∪ ψ χ, ∈ ∧,∨,→,↔




Atoms in a FormulaDefinition 2.7 The atoms of a formula is defined by inductionon the structure of formulae.

atoms(⊥) = ⊥atoms(>) = >atoms(p) = p, for each atom patoms(¬φ) = atoms(φ)atoms(ψ χ) = atoms(ψ) ∪ atoms(χ), ∈ ∧,∨,→,↔




Degree of a FormulaDefinition 2.8 The degree of a formula is defined by inductionon the structure of formulae.degree(⊥) = 0degree(>) = 0degree(p) = 0, for each atom pdegree(¬φ) = degree(φ)degree(ψ χ) = 1 + degree(ψ) + degree(χ), ∈ ∧,∨,→,↔




Size of a FormulaDefinition 2.9 The size of a formula is defined by induction onthe structure of formulae.

size(⊥) = 1size(>) = 1size(p) = 1, for each atom psize(¬φ) = 1 + size(φ)size(ψ χ) = 1 + size(ψ) + size(χ), ∈ ∧,∨,→,↔




Height of a FormulaDefinition 2.10 The height of a formula is defined by inductionon the structure of formulae.height(⊥) = 0height(>) = 0height(p) = 0, for each atom pheight(¬φ) = 1 + height(φ)height(ψ χ) = 1 + max(height(ψ), height(χ)), ∈ ∧,∨,→,↔where max is the maximum of two numbers.




Exercise 2.1

1. Prove that the set P0 is countably infinite. Hint: Use the solution given in example 0.47.

2. Let T (P0) be the set of all abstract syntax trees of the language P0. Define a function AST :

P0 −→ T (P0) which for any well-formed formula yields the corresponding unique abstractsyntax tree of the formula.




2.4. Rooted Trees

Definition 2.11

• A directed graph is a pair 〈N,−→〉 consisting of a finite or a countably infinite set N of nodesand a set −→⊆ N ×N of (directed) edges such that for any edge (s, t) ∈−→ (usually denoteds −→ t in infix notation), s is called the source and t the target of the edge. s is called apredecessor of t and t is called a successor of s.

• A (directed) path of length k ≥ 1 in a directed graph is a finite sequence of nodes n0, n1, . . . ,nk−1, nk such that ni −→ ni+1 for each i ∈ 0, . . . , k − 1. In addition if nk = n0 the path iscalled a cycle of length k, A cycle of length 1 is called a self-loop

• A directed graph is called a directed acyclic graph (DAG) if it has no cycles.

• An unordered (directed) tree is a directed acyclic graph such that there is at most one pathbetween any pair of nodes.

• A (rooted directed) tree is a triple 〈N,−→, r〉 such that 〈N,−→〉 is an unordered (directed)tree and r ∈ N is a distinguished node called the root of the tree and satisfying the followingthe properties.




– There exists a function ` : N → N called the level such that `(r) = 0 and for any s −→ t,`(t) = `(s) + 1.

– Every node in N − r has a unique predecessor.

We will be primarily interested in rooted trees and we will simply refer to them as trees.

Notation. We will often represent a tree T specified only by a name and a root node r either asr

T or “upside-down” as T

raccording to convenience.

Facts 2.12

• Every node in N is source or a target of one or more edges,

• The −→ relation is irreflexive (i.e. there is no edge whose source and target are the same node).

• The root node has no predecessor.

• A leaf (node) is a node with no successor.

Definition 2.13 Let T = 〈N,−→, r〉 be a rooted tree.




• T is also called a tree rooted at r.

• T is infinite if N is an infinite set.

• A path in T is a finite or (countably) infinite sequence r = n0 −→ n1 −→ n2 −→ · · · such that(ni, ni+1) ∈−→ for each i ≥ 0.

• A finite path r = n0 −→ n1 −→ n2 −→ · · ·nm is said to have a length m ≥ 0. A path which isnot finite is said to be infinite.

• A branch is a maximal path – it is either infinite or is finite r = n0 −→ n1 −→ n2 −→ · · ·nmsuch that nm is a leaf.

• The successors of a node s ∈ N is the set Succ(s) = t ∈ N | s −→ t and predecessors of anode t ∈ N is the set Pred(t) = s ∈ N | s −→ t.• −→+ is the transitive closure of the −→ relation and −→∗ is the reflexive-transitive closure of−→.

• The descendants of a node n ∈ N is the setDesc(n) = n∪⋃s∈Succ(n)Desc(s) andDesc(n)−

n is the set of proper descendants of n.

• The ancestors of a node n is the set Ances(n) = n ∪⋃s∈Pred(n)Ances(s) and the proper

ancestors of n is the set Ances(n)− n.




• A subtree of a rooted tree T = 〈N,−→, r〉 is a tree T ′ = 〈Desc(n),−→′, n〉 rooted at a noden ∈ Desc(r) such that s −→′ t for s, t ∈ Desc(n) if and only if s −→ t.

• A rooted tree T = 〈N,−→, r〉 may be extended at a leaf n to another tree T ′ = 〈N ′,−→′, r〉by an edge n −→′ n′ provided n′ 6∈ N , N ′ = N ∪ n′ and n is a leaf node of T .

• A rooted tree T = 〈N,−→, r〉may be extended at the root r to another tree T ′ = 〈N ′,−→′, r′〉by an edge r −→′ r′ provided r′ 6∈ N , N ′ = N ∪ r′. T ′ may be denoted

r′

↓r

T

or

T r

↑r′

.

• Two trees T0 = 〈N0,−→0, r0〉 and T1 = 〈N1,−→1, r1〉 may be joined together at a new node rto form a new tree T = 〈N,−→−→, r〉 where N = N0∪N1∪r,−→=−→0 ∪ −→1 ∪r −→r0, r −→ r1. T may be denoted

r

r0

T0 r1

T1

or

T0 r0

T1 r1

r

.




• T = 〈N,−→, r〉 is said to be finitely branching if for each n ∈ N , Succ(n) is a finite set.

Facts 2.14 In any rooted tree T = 〈N,−→, r〉,

1. N = Desc(r)

2. Ances(n) = s ∈ N | s −→∗ n for any n ∈ N .

3. Desc(n) = t ∈ N | n −→∗ t for any n ∈ N .

4. A rooted tree is acyclic i.e. for all nodes n, n 6−→+ n.

Definition 2.15 Let T = 〈N,−→, r〉 be a tree rooted at node r ∈ N . A node n ∈ N is calledinfinitary if Desc(n) is an infinite set otherwise it is called finitary.

Lemma 2.16 In a finitely branching tree, every infinitary node has an infinitary successor.

Proof: If not, then for some infinitary node n, Succ(n) is finite and for each s ∈ Succ(n), Desc(s)is finite which implies Desc(n) = n ∪

⋃s∈Succ(n)Desc(s) which is a finite union of finite sets.

Then Desc(n) would be finite. QED

Lemma 2.17 (Konig’s Lemma) Every finitely branching infinite tree has an infinite path.




Proof: Assume T = 〈N,−→, r〉 is a finitely branching infinite rooted tree which has no infinitepath. Clearly since N = Desc(r) is infinite, r is infinitary. r has an infinitary successor by lemma2.16. Hence there exists a maximal path in T all of whose nodes are infinitary. This path has to beinfinite, otherwise there would be a last node in the path which is infinitary but has no successors,which is impossible. QED

Corollary 2.18 Every infinite tree is infinitely branching or finitely branching with at least one infi-nite path.

Corollary 2.19 (Contrapositive of Konig’s Lemma) A finitely branching tree is finite if and onlyif it has no infinite path.







3. Lecture 3: Semantics of Propositional Logic

Lecture 3: Semantics of PropositionalLogic

Friday 29 July 2011




1. Semantics of Propositional Logic: 1

2. Semantics of Propositional Logic: 2

3. Models and Satisfiability

4. Example: Abstract Syntax trees

5. Tautology, Contradiction, Contingent




Semantics of Propositional Logic: 1Definition 3.1 A truth assignment is a function τ which assignsto each atom a truth value.

τ : A→ 0, 1

The truth value of a proposition φ is defined by induction on thestructure of propositions as a function

T J.Kτ : P0→ 0, 1

We usedf= to denote equality by definition.




Semantics of Propositional Logic: 2T J⊥Kτ

df= 0

T J>Kτdf= 1

T JpKτdf= τ (p) for each atom p

T J¬φKτdf= T JφKτ

T Jφ ∧ ψKτdf= T JφKτ . T JψKτ

T Jφ ∨ ψKτdf= T JφKτ + T JψKτ

T Jφ→ ψKτdf= T JφKτ ≤. T JψKτ

T Jφ↔ ψKτdf= T JφKτ

.= T JψKτ )




Models and SatisfiabilityDefinition 3.2 A truth assignment τ is called a model of a for-mula φ (denoted τ φ), if and only if T JφKτ = 1. τ is said tosatisfy the formula φ.

Definition 3.3 A formula is satisfiable if it has a model. Other-wise it is said to be unsatisfiable. A set S of formulae is satis-fiable if there is a model τ which satisfies every formula in S(denoted τ S).

Fact 3.4 For any finite set S = φi | 1 ≤ i ≤ n of formulae,τ S if and only if τ φ1 ∧ · · · ∧ φn.




Example: Abstract Syntax treesIn the following example we illustrate the evaluation of the truthvalue of the proposition

(((p ∨ r)↔ (p→ q)) ∧ (q ∨ r))

using the semantics to

(((τ (p) + τ (r)).= (τ (p) ≤. τ (q))).(τ (q) + τ (r)))




∧

q

∨↔

∨ →r

p r qp




∧

q

∨↔

∨ →r

p r qp




∧

q

∨↔

∨ →r

p r qp




.

.= +

+ ≤. τ (r)τ (q)

τ (p) τ (r) τ (p) τ (q)




Tautology, Contradiction, ContingentDefinition 3.5 A proposition is said to be atautology or logically valid if it is true under all truth assign-

ments.contradiction or unsatisfiable if it is false under all truth assign-

ments.contingent if it is neither a tautology nor a contradiction

• A formula is satisfiable if it is not a contradiction.• A formula is falsifiable if is not a tautology.




Exercise 3.1

1. Prove that for each truth assignment τ , the function T J.Kτ is a homomorphism between thealgebras P0 = 〈P0,Ω〉 and B = 〈0, 1, , .,+,≤. , .=〉.

2. Prove that if two truth assignments τ and τ ′ are exactly the same for elements in atoms(φ) forany formula φ, then τ φ if and only if τ ′ φ.

3. Any homomorphism h : A → B from an algebra A to another algebra B (thus establishing a1-1 correspondence between the signatures of the two algebras) induces an equivalence relationon A . What is the nature of the equivalence relation =τ induced by T J.Kτ for a truth assignmentτ?







4. Lecture 4: Logical and Algebraic Concepts

Lecture 4: Logical and AlgebraicConcepts

Tuesday 02 August 2011




1. Logical Consequence: 1

2. Logical Consequence: 2

3. Other Theorems

4. Logical Implication

5. Implication & Equivalence

6. Logical Equivalence as a Congruence




Logical Consequence: 1Definition 4.1 A proposition φ ∈ P0 is called a logical conse-quence of a set Γ ⊆ P0 of formulas (denoted Γ |= φ) if any truthassignment that satisfies all formulas of Γ also satisfies φ.

•When Γ = ∅ then logical consequence reduces to logical va-lidity.• |= φ denotes that φ is logically valid.• Γ 6|= φ denotes that φ is not a logical consequence of Γ.• 6|= φ denotes that φ is logically invalid.




Logical Consequence: 2Theorem 4.2 Let Γ = φi | 1 ≤ i ≤ n be a finite set of propo-sitions, and let ψ be any proposition. Then Γ |= ψ if and only if((. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)→ ψ) is a tautology.




Proof: (⇒)Assume ((. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)→ ψ) is not a tautology. Then there exists atruth assignment τ such that

T J((. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)→ ψ)Kτ = 0

iff (T J(. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)Kτ≤. T JψKτ ) = 0

iff (T J(. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)Kτ ) = 1 and T JψKτ = 0

iff T Jφ1Kτ = . . . = T JφnKτ = 1 and T JψKτ ) = 0

which contradicts the notion of logical consequence.

(⇐) Assume ((. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)→ ψ) is a tautology, and suppose Γ 6|= ψ. Then thereexists a truth assignment τ such that

T Jφ1Kτ = . . . = T JφnKτ = 1 and T JψKτ ) = 0

From the previous proof, we obtain

T J((. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)→ ψ)Kτ = 0

from which it follows that ((. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn)→ ψ) is not a tautology, contradictingour assumption. QED

The following results may also be proved using the semantics of propositional logic.




Other TheoremsTheorem 4.3 Let Γ = φi | 1 ≤ i ≤ n be a finite set of proposi-tions, and let ψ be any proposition. Then1. Γ |= ψ if and only if |= φ1→ (φ2→ · · · (φn→ ψ) · · · )2. Γ |= ψ if and only if ((. . . ((φ1 ∧ φ2) ∧ φ3) ∧ . . . ∧ φn) ∧ ¬ψ) is a

contradiction.

Corollary 4.4 A formula φ is a tautology iff ¬φ is a contradiction(unsatisfiable).




Logical ImplicationDefinition 4.5 A formula φ logically implies another formula ψ(denoted φ⇒ ψ) iff |= φ→ ψ. ⇒ is called (logical) implication.

Definition 4.6 A formula φ is logically equivalent to another for-mula ψ (denoted φ ⇔ ψ) iff |= φ↔ ψ. ⇔ is called (logical)equivalence.




Implication & EquivalenceFact 4.71. φ⇒ ψ iff φ |= ψ.2. φ⇔ ψ iff φ⇒ ψ and ψ ⇒ φ.3.⇒ is a preordering (reflexive and transitive) relation on P0.4.⇔ is the kernel of⇒ i.e. ⇔ =⇒ ∩ ⇒−1 and is hence indeed

an equivalence relation on P0.




Logical Equivalence as a CongruenceTheorem 4.8 Logical equivalence is a congruence relation onP0 i.e. if φ⇔ ψ then• ¬φ⇔ ¬ψ and• for each ∗ ∈ ∧,∨,→,↔ and every formula χ we have

φ ∗ χ ⇔ ψ ∗ χχ ∗ φ ⇔ χ ∗ ψ




Exercise 4.1

1. Use the semantics of propositional logic to prove theorem 4.3 and corollary 4.4.

2. Prove the facts 4.7.

3. Prove that logical equivalence is indeed a congruence relation on P0.

4. For each truth assignment τ let =τ denote the equivalence defined in exercise 3.1. What is therelationship between⇔ and =τ?

5. We may define the notion of a precongruence for preorders analogously to the notion of congru-ence for equivalences, i.e. a preorder is a precongruence if it is preserved under each operator.Is⇒ a precongruence on P0? If so prove it, otherwise identify the operators which preserve therelation⇒ and for operators which do not preserve the relation⇒ give examples to show thatthey are not preserved.

6. Prove that φ1, · · · , φn |= ψ if and only if for each i, 1 ≤ i ≤ n, φ1, · · · , φi−1, φi+1, · · · , φn,¬ψ |=¬φi.







5. Lecture 5: Identities and Normal Forms

Lecture 5: Identities and NormalForms

Friday 05 August 2011




1. Adequacy

2. Adequacy: Examples

3. Functional Completeness

4. Duality

5. Principle of Duality

6. Negation Normal Forms: 1

7. Negation Normal Forms: 2

8. Conjunctive Normal Forms

9. CNF




Some identities

Negation ¬¬φ⇔ φ

φ ∨ ⊥ ⇔ φ Identity φ ∧ > ⇔ φ

φ ∨ > ⇔ > Zero φ ∧ ⊥ ⇔ ⊥φ ∨ φ ⇔ φ Idempotence φ ∧ φ ⇔ φ

φ ∨ ψ ⇔ ψ ∨ φ Commutativity φ ∧ ψ ⇔ ψ ∧ φ(φ ∨ ψ) ∨ χ ⇔ φ ∨ (ψ ∨ χ) Associativity (φ ∧ ψ) ∧ χ ⇔ φ ∧ (ψ ∧ χ)

φ ∨ (ψ ∧ χ) ⇔ (φ ∨ ψ) ∧ (φ ∨ χ) Distributivity φ ∧ (ψ ∨ χ) ⇔ (φ ∧ ψ) ∨ (φ ∧ χ)

¬(φ ∨ ψ) ⇔ ¬φ ∧ ¬ψ De Morgan ¬(φ ∧ ψ) ⇔ ¬φ ∨ ¬ψφ ∨ ¬φ ⇔ > Simplification φ ∧ ¬φ ⇔ ⊥¬⊥ ⇔ > Inversion ¬> ⇔ ⊥φ ∨ (φ ∧ ψ) ⇔ φ Absorption φ ∧ (φ ∨ ψ) ⇔ φ




AdequacySome Other Important identities are:

φ↔ ψ ⇔ (φ→ ψ) ∧ (ψ → φ) (5)

φ→ ψ ⇔ ¬φ ∨ ψ (6)

Definition 5.1 A set of operators O ⊆ Ω is said to be adequatefor propositional logic, if for every formula in P0 there is a logi-cally equivalent formula using only the operators in O.




Adequacy: ExamplesExample 5.21. From the identities (5) and (6) and the two Simplification iden-

tities it is clear that O = ¬,∧,∨ is an adequate set of oper-ators for P0.

2. Further given that O = ¬,∧,∨ is adequate and using theDe Morgan identity and Negation, we have that

φ ∧ ψ ⇔ ¬¬(φ ∧ ψ)⇔ ¬(¬φ ∨ ¬ψ)

and hence ¬,∨ is an adequate set.3. We may use the other De Morgan identity

φ ∨ ψ ⇔ ¬¬(φ ∨ ψ)⇔ ¬(¬φ ∧ ¬ψ)

to conclude that ¬,∧ is adequate.




Functional CompletenessIt is quite possible that one could extend P0 with new operators(perhaps 3-ary or 4-ary or indeed of any arity) and thus makethe language more expressive.Definition 5.3 A set O of operators for propositional logic isfunctionally complete (also called expressively adequate) if anyformula built up using the operators of O is logically equivalentto a formula using operators only from Ω.

Lemma 5.4 ¬,∧,∨ is a functionally complete set.

Proof




Proof of lemma 5.4.

Proof: By the semantics of propositional logic, every operator of propositional logic correspondsto an operator of the same arity on the boolean set 0, 1. The proof follows from the constructionof truth tables in the boolean algebra B, since every truth table may be expressed using the booleanoperators , .,+.

Let on : 2n −→ 2 be any n-ary operator on boolean values. Typicallya1 · · · an b

0 0 · · · 0 b0... ... ... ... ...... ... ... ... ...... ... ... ... ...

2n − 1 1 · · · 1 b2n−1

there exists a truth table for the function on(a1, · · · , an), which definesfor each possible set of boolean values of the arguments the booleanvalue of on(a1, · · · , an) = b. This truth table consists of 2n rows andn + 1 columns as shown below where b0, . . . , b2n−1 ∈ 2 (we havenumbered the rows by the decimal equivalent of the binary numberthat the bit-vector (a1, . . . , an) denotes in each case). For 0 ≤ i ≤ 2n,1 ≤ j ≤ n, let aij ∈ 2 be the value assigned to variable aj such that

bi = on(ai1, · · · , ain). We may express the function as

on(a1, · · · , an) =∑

0≤r≤2n−1

(∏

1≤j≤narj∗)∗

where for each row r, and each j, 1 ≤ j ≤ n, arj∗ = 1 if arj = 1 and arj

∗ = 0 otherwise. Similarly




the product (∏

1≤j≤naj∗) is inverted if br = 0 otherwise (

∏1≤j≤naj

∗)∗ = 1. Alternatively we have

on(a1, · · · , an) =∑

br=1,0≤r<2n

(∏

arj=1,1≤j≤narj)

QED




DualityDefinition 5.5• Two formulas φ and ψ are called duals of each other if each

can be obtained from the other by simultaneously replacingall occurrences of– ∧ by ∨,– ∨ by ∧,–⊥ by > and–> by ⊥• ∧ and ∨ are duals of each other and• > and ⊥ are duals of each other




Principle of DualityTheorem 5.6 If atoms(φ) = p1, . . . , pn and φ ≡ o(p1, . . . , pn)then

¬φ⇔ o∗(¬p1, . . . ,¬pn)

where o∗ is the dual of o.

Proof: By structural induction and the use of the De Morganand Simplication laws. QED




Negation Normal Forms: 1The adequacy and functional completeness of the set ¬,∧,∨may be used to build normal forms (or standard forms) by usingthe logical equivalences as rewrite rules.Definition 5.7• A literal is an atom (p ∈ A) or its negation (¬p for p ∈ A).

Atoms are called positive literals and their negations arecalled negative literals.• A formula is in negation normal form if it is built up from literals

using only the operators ∨ and ∧.




Negation Normal Forms: 2Lemma 5.8 Every formula in P0 is logically equivalent to one innegation normal form.

Proof: It suffices to consider only formulas containing the op-erators ¬, ∧ and ∨, and for every occurrence of negation topush it inward using the De Morgan identities. QED




Conjunctive Normal FormsDefinition 5.9• A disjunction of literals is a formula δ of the form

δ ≡ λ1 ∨ λ2 ∨ · · · ∨ λm ≡∨

1≤i≤mλi

where m ≥ 0.• A conjunctive normal form is a formula γ of the formδ1 ∧ δ2 ∧ · · · δn where δj for each 1 ≤ j ≤ n is a disjunction ofliterals.

We may analogously define a conjunction of literals and a dis-junctive normal form (DNF).




CNFTheorem 5.10 Every formula in P0 is logically equivalent to aconjunctive normal form.

Proof: It suffices to consider only negation normal forms.In each case use the distributive laws to distribute ∨ over ∧and use the negation law to remove multiple contiguous occur-rences of negations. QED




Exercise 5.1

1. Prove that the following sets are adequate for P0.

(a) →,¬(b) →,⊥

2. Prove that ∧,∨ is not an adequate set.

3. Prove that if O ⊆ Ω then ¬ ∈ O or ⊥ ∈ O.

4. Define BNFs to generate exactly

(a) the set N0 of negation normal forms(b) the set C0 of conjunctive normal forms(c) the set D0 of disjunctive normal forms.

5. Using principles of induction prove that

(a) C0 ⊂ N0 ⊂ P0,(b) D0 ⊂ N0 ⊂ P0,




6. Let φ ≡ o1(p1, . . . , pn) and ψ ≡ o2(p1, . . . , pn) be formulas such that φ⇔ ψ. Then prove that

o1(¬p1, . . . ,¬pn) ⇔ o2(¬p1, . . . ,¬pn)

¬o∗1(p1, . . . , pn) ⇔ ¬o∗2(p1, . . . , pn)







6. Lecture 6: Tautology Checking

Lecture 6: Tautology CheckingTuesday 09 August 2011




1. Arguments

2. Arguments: 2

3. Validity & Falsification

4. Translation into propositional Logic

5. Atoms in Argument

6. The Representation

7. Propositional Rendering

8. The Strategy

9. Checking Tautology

10. Computing the CNF

11. Falsifying CNF




ArgumentsA typical informally stated argument might go as follows:If prices rise, then the poor and the salaried class will be unhappy.

If taxes are increased then the businessmen will be unhappy.

If the poor and the salaried class or the businessmen are unhappy, the Government will not be

re-elected.

Inflation will rise if Government expenditure exceeds its revenue.

Government expenditure will exceed its revenue unless taxes are increased or the Government

resorts to deficit financing or takes a loan from the IMF to cover the deficit.

If the Government resorts to deficit financing then inflation will rise.

If inflation rises, the prices will also rise.

The Government will get reelected.

Therefore the Government will take a loan from the IMF.




Arguments: 2A typical informally stated argument might go as follows:If prices rise, then the poor and the salaried class will be unhappy.

If taxes are increased then the businessmen will be unhappy.

If the poor and the salaried class are unhappy or the businessmen are unhappy,

the Government will not be re-elected.

Inflation will rise if Government expenditure exceeds its revenue.

Government expenditure will exceed its revenue unless taxes are increased or

the Government resorts to deficit financing or takes a loan from the IMF to cover the deficit.

If the Government resorts to deficit financing then inflation will rise.

If inflation rises, the prices will also rise.

The Government will get reelected.

Therefore the Government will take a loan from the IMF.




Arguments in natural language

It turns out that the correctness or otherwise of most arguments depends entirely on the “shapes” ofthe formulae concerned rather than their intrinsic meaning. Take the previous argument. Supposewe uniformly replace the various atoms by say sentences from nursery rhymes as the following tableshows:

Prices rise Mary has a little lambThe poor and . . . Little Bo-Peep loses her sheepTaxes are increased Jack and Jill go up the hillThe businessmen will be unhappy Humpty-Dumpty sits on the wallThe Government will get re-elected I am little teapotInflation will rise Little Jack Horner sits in a cornerGovernment expenditure . . . revenue The boy stands on the burning deckThe Government resorts . . . Wee Willie Winkie runs through the townThe Government takes a loan . . . deficit Eensy Weensy spider climbs up the water spout

Then we get the following ridiculous sounding argument




If Mary has a little lamb, then Little Bo-Peep loses her sheep.If Jack and Jill go up the hill, then Humpty-Dumpty sits on a wall.If Little Bo-Peep loses her sheep or Humpty-Dumpty sits on a wall, then Little Miss Muffet sits ona tuffet.Little Jack Horner sits in a corner if the boy stands on the burning deck.The boy stands on the burning deck unless Jack and Jill go up the hill or Wee Willie Winkie runsthrough the town or Eensy Weensy spider climbs up the water spoutIf Wee Willie Winkie runs through the town then Little Jack Horner sits in a cornerIf Little Jack Horner sits in a corner then Mary has a little lamb.Little Miss Muffet sits on a tuffet.Therefore Eensy Weensy spider climbs up the water spout.

But if the original argument is logically valid then so is the new one, since logical validity dependsentirely on the so called connectives that make up the propositions and their effect on truth values.




Validity & FalsificationThe validity of such arguments involves showing that the con-clusion is a logical consequence of the hypotheses that pre-cede it. The following alternatives exist:

1. Using a truth table.2. Using theorem 4.23. Using one of the parts of theorem 4.3.

If the argument is not valid, then a falsifying assignment needsto be also given.




Translation into propositional LogicBut even after translating the argument into a suitable proposi-tional logic form, it would be quite impossible to verify the valid-ity of the argument by truth table, since the number of “atomic”propositions could be very large.




Atoms in ArgumentThe Argument

va l r i s e = ATOM "Prices rise" ;va l pandsun = ATOM "The poor and ... will be unhappy" ;va l taxes = ATOM "Taxes are increased" ;va l busun = ATOM "The businessmen will be unhappy" ;va l r e e l e c t = ATOM "The Government will be re-elected" ;va l i n f l a t i o n = ATOM "Inflation will rise" ;va l exceeds = ATOM "Government expenditure ... revenue" ;va l d e f f i n = ATOM "The Government resorts ..." ;va l imf = ATOM "The Govt. takes a loan ... deficit" ;

In this case the the truth table would have 29 = 512 rows. Fur-ther, the truth table will use all the atoms, even the irrelevantones.




The Representationdatatype Prop = ATOM of s t r i n g

| NOT of Prop| AND of Prop ∗ Prop| OR of Prop ∗ Prop| IMP of Prop ∗ Prop| EQL of Prop ∗ Prop




Propositional RenderingThe Argument Atoms in Argument

va l hyp1 = IMP ( r i se , pandsun ) ;va l hyp2 = IMP ( taxes , busun ) ;va l hyp3 = IMP (OR ( pandsun , busun ) , NOT( r e e l e c t ) ) ;va l hyp4 = IMP ( exceeds , i n f l a t i o n ) ;va l hyp5 = IMP ( exceeds , NOT(OR( taxes , OR( d e f f i n , imf ) ) ) ) ;va l hyp6 = IMP ( d e f f i n , i n f l a t i o n ) ;va l hyp7 = IMP ( i n f l a t i o n , r i s e ) ;va l hyp8 = r e e l e c t ;va l conc1 = imf ;va l H = [ hyp1 , hyp2 , . . . , hyp8 ] ;va l Arg1 = (H, conc1 ) ;




The StrategyWe need to either show that

Arg1 = IMP (bigAND H, conc1)

is a tautology or can be falsified. Using theorem 4.2 for validityva l bigAND = lef tReduce (AND) ;fun Va l id ( (H, P ) : Argument ) =

i f n u l l (H) then tau to logy (P)e lse tau to logy ( IMP ( bigAND (H) , P ) )

and for falsification we usefun f a l s i f y A r g ( (H, P ) : Argument ) =

i f n u l l (H) then f a l s i f y ( cn f (P ) )e lse f a l s i f y ( cn f ( IMP ( bigAND (H) , P ) ) )




Checking TautologyChecking for tautology crucially involves finding falsifying truthassignments for at last one of the conjuncts in the CNF of theargument.

fun tau to logy2 (P) =l e t va l Q = cnf (P ) ;

va l LL = f a l s i f y (Q)i n i f n u l l ( LL ) then ( t rue , [ ] )

e lse ( fa l se , LL )end




Computing the CNF1. Rewrite implications and equivalences2. Convert to NNF by pushing negation inward3. Distribute OR over AND




cnflistlist.sml

(∗======================== THE SIGNATURE PropLogic ======================s i g n a t u r e PropLogic =

s i ge x c e p t i o n Atom excep t i ond a t a t y p e Prop =

ATOM of s t r i n g |NOT of Prop |AND of Prop ∗ Prop |OR of Prop ∗ Prop |IMP of Prop ∗ Prop |EQL of Prop ∗ Prop

t y p e Argument = Prop l i s t ∗ Propv a l show : Prop −> u n i tv a l showArg : Argument −> u n i tv a l f a l s i f y A r g : Argument −> Prop l i s t l i s tv a l V a l i d : Argument −> boo l ∗ Prop l i s t l i s t

end ;

(∗ P r o p o s i t i o n a l f o r m u l a s ∗ )

======================== THE STRUCTURE PL ====================== ∗ )(∗ s t r u c t u r e PL : PropLogic = ∗ )s t r u c t u r e PL = (∗ Thi s i s f o r debugg ing p u r p o s e s on ly ∗ )s t r u c t

d a t a t y p e Prop =





;

(∗ −−−−−−−−−−−−−−− P r o p o s i t i o n s t o CNFs −−−−−−−−−−−−−−−−−−−−− ∗ )

e x c e p t i o n Atom excep t i on ;fun newatom ( s ) = i f s = "" t h e n r a i s e Atom excep t i on

e l s e (ATOM s ) ;fun drawChar ( c , n ) =

i f n>0 t h e n ( p r i n t ( s t r ( c ) ) ; drawChar ( c , ( n−1) ) )e l s e ( ) ;

fun show ( P ) =l e t fun drawTabs ( n ) = drawChar (#"\t" , n ) ;

fun showTreeTabs (ATOM a , n ) = ( drawTabs ( n ) ;p r i n t ( a ) ;p r i n t ("\n" ))

| showTreeTabs (NOT ( P ) , n ) = ( drawTabs ( n ) ; p r i n t ("NOT" ) ;showTreeTabs ( P , n +1))

| showTreeTabs (AND ( P , Q) , n ) =( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("AND\n" ) ;




showTreeTabs (Q, n +1))

| showTreeTabs (OR ( P , Q) , n ) =( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("OR\n" ) ;showTreeTabs (Q, n +1))

| showTreeTabs ( IMP ( P , Q) , n ) =( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("IMPLIES\n" ) ;showTreeTabs (Q, n +1))

| showTreeTabs (EQL ( P , Q) , n ) =( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("IFF\n" ) ;showTreeTabs (Q, n +1))

;i n ( p r i n t ("\n" ) ; showTreeTabs ( P , 0 ) ; p r i n t ("\n" ) )

end;

(∗ The f u n c t i o n below e v a l u a t e s a f o r m u l a g i v e n a t r u t h a s s i g n m e n t .The t r u t h a s s i g n m e n t i s g i v e n as a l i s t o f atoms t h a t a r e t r u e( a l l o t h e r atoms a r e f a l s e ) .




∗ )

fun lookup ( x : Prop , [ ] ) = f a l s e| l ookup ( x , h : : L ) = ( x = h ) o r e l s e lookup ( x , L );

fun e v a l (ATOM a , L ) = lookup (ATOM a , L )| e v a l (NOT ( P ) , L ) = i f e v a l ( P , L ) t h e n f a l s e e l s e t r u e| e v a l (AND ( P , Q) , L ) = e v a l ( P , L ) a n d a l s o e v a l (Q, L )| e v a l (OR ( P , Q) , L ) = e v a l ( P , L ) o r e l s e e v a l (Q, L )| e v a l ( IMP ( P , Q) , L ) = e v a l (OR (NOT ( P ) , Q) , L )| e v a l (EQL ( P , Q) , L ) = ( e v a l ( P , L ) = e v a l (Q, L ) );

(∗ We c o u l d a l s o w r i t e a t a u t o l o g y c h e c k e r wi th o u t u s i n g t r u t ha s s i g n m e n t s by f i r s t c o n v e r t i n g e v e r y t h i n g i n t o a normal form .

∗ )

(∗ F i r s t r e w r i t e i m p l i c a t i o n s and e q u i v a l e n c e s ∗ )

fun r e w r i t e (ATOM a ) = ATOM a| r e w r i t e (NOT ( P ) ) = NOT ( r e w r i t e ( P ) )| r e w r i t e (AND ( P , Q) ) = AND ( r e w r i t e ( P ) , r e w r i t e (Q) )| r e w r i t e (OR ( P , Q) ) = OR ( r e w r i t e ( P ) , r e w r i t e (Q) )| r e w r i t e ( IMP ( P , Q) ) = OR (NOT ( r e w r i t e ( P ) ) , r e w r i t e (Q) )| r e w r i t e (EQL ( P , Q) ) = r e w r i t e (AND ( IMP ( P , Q) , IMP (Q, P ) ) );




(∗ Conver t a l l f o r m u l a s n o t c o n t a i n i n g IMP or EQL i n t o N e g a t i o n NormalForm .

∗ )

fun nnf (ATOM a ) = ATOM a| nnf (NOT (ATOM a ) ) = NOT (ATOM a )| nnf (NOT (NOT ( P ) ) ) = nnf ( P )| nnf (AND ( P , Q) ) = AND ( nnf ( P ) , nnf (Q) )| nnf (NOT (AND ( P , Q ) ) ) = nnf (OR (NOT ( P ) , NOT (Q ) ) )| nnf (OR ( P , Q) ) = OR ( nnf ( P ) , nnf (Q) )| nnf (NOT (OR ( P , Q ) ) ) = nnf (AND (NOT ( P ) , NOT (Q ) ) );

(∗ D i s t r i b u t e OR ove r AND t o g e t a NNF i n t o CNF ∗ )

fun dis tOR ( P , AND (Q, R ) ) = AND ( dis tOR ( P , Q) , d is tOR ( P , R ) )| dis tOR (AND (Q, R) , P ) = AND ( dis tOR (Q, P ) , d is tOR (R , P ) )| dis tOR ( P , Q) = OR ( P , Q)

(∗ Now t h e CNF can be e a s i l y computed ∗ )

fun c o n j o f d i s j (AND ( P , Q) ) = AND ( c o n j o f d i s j ( P ) , c o n j o f d i s j (Q) )| c o n j o f d i s j (OR ( P , Q) ) = dis tOR ( c o n j o f d i s j ( P ) , c o n j o f d i s j (Q) )| c o n j o f d i s j ( P ) = P;

fun c n f ( P ) = c o n j o f d i s j ( nnf ( r e w r i t e ( P ) ) ) ;




(∗ −−−−−−−−−−−−−−− P r o p o s i t i o n s t o CNFs ends −−−−−−−−−−−−−−−−−−−−− ∗ )

(∗ −−−−−−−−−−−−−−− CNFs t o l i s t s o f l i s t s o f l i t e r a l s −−−−−−−−−−−− ∗ )

(∗ Conver t a c l a u s e i n t o a l i s t o f l i t e r a l s ∗ )fun f l a t t e n O R (OR (A, B ) ) = ( f l a t t e n O R A) @ ( f l a t t e n O R B)| f l a t t e n O R C = [C] (∗ assuming C i s a l i t e r a l ∗ )

(∗ Conver t a CNF i n t o a l i s t o f l i s t s o f c l a u s e s ∗ )fun f la t t enAND (AND (Q, R ) ) = ( f l a t t enAND Q) @ ( f la t t enAND R)| f l a t t enAND ( P ) = [ f l a t t e n O R P ] (∗ assuming P i s a c l a u s e ∗ )

(∗ S o r t t h e l i t L i s t L i s t u s i n g some o r d e r i n g and remove d u p l i c a t e s w h i l e s o r t i n g ∗ )

(∗ D ef in e an o r d e r i n g l i t L e s s on l i t e r a l s : ∗ )

fun l i t L e s s (ATOM ( a ) , ATOM ( b ) ) = a < b (∗ l e x i c o g r a p h i c ∗ )| l i t L e s s (NOT(ATOM a ) , NOT(ATOM b ) ) = a < b

(∗ e v e r y n e g a t i v e l i t e r a l i s s m a l l e r t h a n e v e r y p o s i t i v e l i t e r a l ∗ )| l i t L e s s (NOT(ATOM a ) , ATOM ( b ) ) = t r u e| l i t L e s s (ATOM ( a ) , NOT(ATOM b ) ) = f a l s e

(∗ Extend t h e o r d e r i n g t o l i s t s o f l i t e r a l s ∗ )

fun c l a u s e L e s s ( [ ] , [ ] ) = f a l s e| c l a u s e L e s s ( [ ] , ) = t r u e| c l a u s e L e s s ( , [ ] ) = f a l s e




| c l a u s e L e s s ( h1 : : T1 , h2 : : T2 ) =( l i t L e s s ( h1 , h2 ) ) o r e l s e( ( h1=h2 ) a n d a l s o c l a u s e L e s s ( T1 , T2 ) )

(∗ D ef in e mergeSortRD t o remove d u p l i c a t e s a s s o r t i n g p r o c e e d s ∗ )

fun mergeSortRD R [ ] = [ ]| mergeSortRD R [ h ] = [ h ]| mergeSortRD R L = (∗ can ’ t s p l i t a l i s t u n l e s s i t has > 1 e l e m e n t ∗ )l e t fun s p l i t [ ] = ( [ ] , [ ] )

| s p l i t [ h ] = ( [ h ] , [ ] )| s p l i t ( h1 : : h2 : : t ) =

l e t v a l ( l e f t , r i g h t ) = s p l i t t ;i n ( h1 : : l e f t , h2 : : r i g h t )

end ;v a l ( l e f t , r i g h t ) = s p l i t L ;fun mergeRD (R , [ ] , [ ] ) = [ ]| mergeRD (R , [ ] , L2 ) = L2| mergeRD (R , L1 , [ ] ) = L1| mergeRD (R , ( L1 as h1 : : t 1 ) , ( L2 as h2 : : t 2 ) ) =

i f h1=h2 t h e n mergeRD (R , t1 , L2 ) (∗ remove a copy ∗ )e l s e i f R( h1 , h2 ) t h e n h1 : : ( mergeRD (R , t1 , L2 ) )e l s e h2 : : ( mergeRD (R , L1 , t 2 ) ) ;

v a l s o r t e d L e f t = mergeSortRD R l e f t ;v a l s o r t e d R i g h t = mergeSortRD R r i g h t ;

i n mergeRD (R , s o r t e d L e f t , s o r t e d R i g h t )end ;




(∗ Now s o r t t h e l i s t o f l i s t s o f l i t e r a l s removing d u p l i c a t e s ∗ )

fun sor tRD LL = (∗ F i r s t s o r t each c l a u s e and t h e n t h e l i s t o f c l a u s e s ∗ )l e t v a l s o r t e d C l a u s e s = map ( mergeSortRD l i t L e s s ) LLi n mergeSortRD c l a u s e L e s s s o r t e d C l a u s e send ;

(∗ P u t t i n g e v e r y t h i n g t o g e t h e r ∗ )

fun p r o p 2 l i s t l i s t P = sortRD ( f la t t enAND ( c n f P ) )end (∗ s t r u c t ∗ ) ;

open PL ;

(∗ T e s t i n g p r o p 2 l i s t l i s t ====================================− v a l god = ATOM ” There i s a God ” ;v a l god = ATOM ” There i s a God” : Prop− v a l o s c i e n t = ATOM ”God i s o m n i s c i e n t ” ;v a l o p o t e n t = ATOM ”God i s o m n i p o t e n t ” ;v a l e v i l = ATOM ” There i s E v i l ” ;v a l know = ATOM ”God knows t h e r e i s E v i l ” ;v a l p r e v e n t = ATOM ”God p r e v e n t s E v i l ” ;

v a l hy1 = IMP ( god , AND ( o s c i e n t , o p o t e n t ) ) ;v a l hy2 = IMP ( o s c i e n t , know ) ;v a l hy3 = IMP ( o p o t e n t , p r e v e n t ) ;v a l hy4 = e v i l ;v a l conc = NOT ( god ) ;




v a l o s c i e n t = ATOM ”God i s o m n i s c i e n t ” : Prop− GC # 0 . 0 . 0 . 1 . 7 . 1 8 7 : (1 ms )v a l o p o t e n t = ATOM ”God i s o m n i p o t e n t ” : Prop− v a l e v i l = ATOM ” There i s E v i l ” : Prop− v a l know = ATOM ”God knows t h e r e i s E v i l ” : Prop− v a l p r e v e n t = ATOM ”God p r e v e n t s E v i l ” : Prop− − v a l hy1 = IMP (ATOM ” There i s a God ” ,AND (ATOM # ,ATOM # ) ) : Prop− v a l hy2 = IMP (ATOM ”God i s o m n i s c i e n t ” ,ATOM ”God knows t h e r e i s E v i l ” ) : Prop− v a l hy3 = IMP (ATOM ”God i s o m n i p o t e n t ” ,ATOM ”God p r e v e n t s E v i l ” ) : Prop− v a l hy4 = ATOM ” There i s E v i l ” : Prop− v a l conc = NOT (ATOM ” There i s a God ” ) : Prop− − p r o p 2 l i s t l i s t hy1 ;v a l i t =

[ [NOT (ATOM ” There i s a God ” ) ,ATOM ”God i s o m n i p o t e n t ” ] ,[NOT (ATOM ” There i s a God ” ) ,ATOM ”God i s o m n i s c i e n t ” ] ] : Prop l i s t l i s t

− p r o p 2 l i s t l i s t hy2 ;v a l i t = [ [NOT (ATOM ”God i s o m n i s c i e n t ” ) ,ATOM ”God knows t h e r e i s E v i l ” ] ]

: Prop l i s t l i s t− v a l andhyp = AND ( hy1 , AND ( hy2 , AND( hy3 , hy4 ) ) ) ;v a l andhyp = AND ( IMP (ATOM # ,AND # ) ,AND ( IMP # ,AND # ) ) : Prop− p r o p 2 l i s t l i s t andhyp ;v a l i t =

[ [NOT (ATOM ”God i s o m n i p o t e n t ” ) ,ATOM ”God p r e v e n t s E v i l ” ] ,[NOT (ATOM ”God i s o m n i s c i e n t ” ) ,ATOM ”God knows t h e r e i s E v i l ” ] ,[NOT (ATOM ” There i s a God ” ) ,ATOM ”God i s o m n i p o t e n t ” ] ,[NOT (ATOM ” There i s a God ” ) ,ATOM ”God i s o m n i s c i e n t ” ] ,[ATOM ” There i s E v i l ” ] ] : Prop l i s t l i s t




− v a l a = ATOM ” a ” ;v a l a = ATOM ” a ” : Prop− v a l b = ATOM ” b ” ;v a l b = ATOM ” b ” : Prop− v a l c = ATOM ” c ” ;v a l c = ATOM ” c ” : Prop− v a l one = IMP ( a , OR ( a , a ) ) ;v a l one = IMP (ATOM ” a ” ,OR (ATOM # ,ATOM # ) ) : Prop− v a l two = EQL( b , OR( b , NOT( b ) ) ) ;v a l two = EQL (ATOM ” b ” ,OR (ATOM # ,NOT # ) ) : Prop− v a l t h r e e = EQL(AND( a , a ) , OR(NOT( a ) , OR( a , NOT( a ) ) ) ) ;GC # 0 . 0 . 0 . 1 . 8 . 2 2 1 : (1 ms )v a l t h r e e = EQL (AND (ATOM # ,ATOM # ) ,OR (NOT # ,OR # ) ) : Prop− v a l p 2 l l = p r o p 2 l i s t l i s t ;v a l p 2 l l = fn : Prop −> Prop l i s t l i s t− p 2 l l one ;v a l i t = [ [NOT (ATOM ” a ” ) ,ATOM ” a ” ] ] : Prop l i s t l i s t− p 2 l l two ;v a l i t = [ [NOT (ATOM ” b ” ) ,ATOM ” b ” ] , [ATOM ” b ” ] ] : Prop l i s t l i s t− p 2 l l t h r e e ;v a l i t = [ [NOT (ATOM ” a ” ) ,ATOM ” a ” ] , [ATOM ” a ” ] ] : Prop l i s t l i s t− p 2 l l (OR( one , OR( two , t h r e e ) ) ) ;v a l i t =

[ [NOT (ATOM ” a ” ) ,NOT (ATOM ” b ” ) ,ATOM ” a ” ,ATOM ” b ” ] ,[NOT (ATOM ” a ” ) ,ATOM ” a ” ,ATOM ” b ” ] ] : Prop l i s t l i s t

−=============================================================== ∗ )




Falsifying CNF1. Suffices to find a falsification of at least one conjunct2. A conjunct in the CNF can be false iff all the disjuncts in it are

false.3. A disjunct is false iff it does not contain a “complementary

pair”.

Assume the CNF is Q ≡∧mi=1Di where each Di ≡

∨nij=1Lij

where the literals of Di = Pi ∪ Ni where Pi is the set of posi-tive literals (atoms) and Ni consists of the atoms appearing asnegative literals.

Then Di is false iff Pi ∩Ni = ∅.




tautology1.sml

s i g n a t u r e PropLogic =

s i ge x c e p t i o n Atom excep t i ond a t a t y p e Prop =


t y p e Argument = Prop l i s t ∗ Propv a l show : Prop −> u n i tv a l showArg : Argument −> u n i tv a l f a l s i f y A r g : Argument −> Prop l i s t l i s tv a l V a l i d : Argument −> boo l ∗ Prop l i s t l i s t

end ;

(∗ P r o p o s i t i o n a l f o r m u l a s ∗ )

s t r u c t u r e PL : PropLogic =(∗ s t r u c t u r e PL = ∗ ) (∗ Thi s i s f o r debugg ing p u r p o s e s on ly ∗ )s t r u c t

d a t a t y p e Prop =ATOM of s t r i n g |




NOT of Prop |AND of Prop ∗ Prop |OR of Prop ∗ Prop |IMP of Prop ∗ Prop |EQL of Prop ∗ Prop

;

e x c e p t i o n Atom excep t i on ;fun newatom ( s ) = i f s = "" t h e n r a i s e Atom excep t i on

e l s e (ATOM s ) ;fun drawChar ( c , n ) =

i f n>0 t h e n ( p r i n t ( s t r ( c ) ) ; drawChar ( c , ( n−1) ) )e l s e ( ) ;

fun show ( P ) =l e t fun drawTabs ( n ) = drawChar (#"\t" , n ) ;

fun showTreeTabs (ATOM a , n ) = ( drawTabs ( n ) ;p r i n t ( a ) ;p r i n t ("\n" ))

| showTreeTabs (NOT ( P ) , n ) = ( drawTabs ( n ) ; p r i n t ("NOT" ) ;showTreeTabs ( P , n +1))

| showTreeTabs (AND ( P , Q) , n ) =( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("AND\n" ) ;showTreeTabs (Q, n +1))

| showTreeTabs (OR ( P , Q) , n ) =




( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("OR\n" ) ;showTreeTabs (Q, n +1))

| showTreeTabs ( IMP ( P , Q) , n ) =( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("IMPLIES\n" ) ;showTreeTabs (Q, n +1))

| showTreeTabs (EQL ( P , Q) , n ) =( showTreeTabs ( P , n + 1 ) ;drawTabs ( n ) ; p r i n t ("IFF\n" ) ;showTreeTabs (Q, n +1))

;i n ( p r i n t ("\n" ) ; showTreeTabs ( P , 0 ) ; p r i n t ("\n" ) )

end;

(∗ The f u n c t i o n below e v a l u a t e s a f o r m u l a g i v e n a t r u t h a s s i g n m e n t .The t r u t h a s s i g n m e n t i s g i v e n as a l i s t o f atoms t h a t a r e a s s i g n e d” t r u e ” ( i m p l i c i t l y a l l o t h e r atoms a r e assume d t o have beena s s i g n e d ” f a l s e ” ) .

∗ )




fun lookup ( x : Prop , [ ] ) = f a l s e| l ookup ( x , h : : L ) =

i f ( x = h ) t h e n t r u ee l s e lookup ( x , L )

;

fun e v a l (ATOM a , L ) = lookup (ATOM a , L )| e v a l (NOT ( P ) , L ) = i f e v a l ( P , L ) t h e n f a l s e e l s e t r u e| e v a l (AND ( P , Q) , L ) = e v a l ( P , L ) a n d a l s o e v a l (Q, L )| e v a l (OR ( P , Q) , L ) = e v a l ( P , L ) o r e l s e e v a l (Q, L )| e v a l ( IMP ( P , Q) , L ) = e v a l (OR (NOT ( P ) , Q) , L )| e v a l (EQL ( P , Q) , L ) = ( e v a l ( P , L ) = e v a l (Q, L ) );

(∗ We f i r s t c o n v e r t e v e r y p r o p o s i t i o n i n t o a normal form .∗ )

(∗ F i r s t r e w r i t e i m p l i c a t i o n s and e q u i v a l e n c e s ∗ )

fun r e w r i t e (ATOM a ) = ATOM a| r e w r i t e (NOT ( P ) ) = NOT ( r e w r i t e ( P ) )| r e w r i t e (AND ( P , Q) ) = AND ( r e w r i t e ( P ) , r e w r i t e (Q) )| r e w r i t e (OR ( P , Q) ) = OR ( r e w r i t e ( P ) , r e w r i t e (Q) )| r e w r i t e ( IMP ( P , Q) ) = OR (NOT ( r e w r i t e ( P ) ) , r e w r i t e (Q) )| r e w r i t e (EQL ( P , Q) ) = r e w r i t e (AND ( IMP ( P , Q) , IMP (Q, P ) ) );




(∗ Conver t a l l f o r m u l a s n o t c o n t a i n i n g IMP or EQL i n t o N e g a t i o n NormalForm .

∗ )

fun nnf (ATOM a ) = ATOM a| nnf (NOT (ATOM a ) ) = NOT (ATOM a )| nnf (NOT (NOT ( P ) ) ) = nnf ( P )| nnf (AND ( P , Q) ) = AND ( nnf ( P ) , nnf (Q) )| nnf (NOT (AND ( P , Q ) ) ) = nnf (OR (NOT ( P ) , NOT (Q ) ) )| nnf (OR ( P , Q) ) = OR ( nnf ( P ) , nnf (Q) )| nnf (NOT (OR ( P , Q ) ) ) = nnf (AND (NOT ( P ) , NOT (Q ) ) );

(∗ D i s t r i b u t e OR ove r AND t o g e t a NNF i n t o CNF ∗ )

fun dis tOR ( P , AND (Q, R ) ) = AND ( dis tOR ( P , Q) , d is tOR ( P , R ) )| dis tOR (AND (Q, R) , P ) = AND ( dis tOR (Q, P ) , d is tOR (R , P ) )| dis tOR ( P , Q) = OR ( P , Q)

(∗ Now t h e CNF can be e a s i l y computed ∗ )

fun c o n j o f d i s j (AND ( P , Q) ) = AND ( c o n j o f d i s j ( P ) , c o n j o f d i s j (Q) )| c o n j o f d i s j (OR ( P , Q) ) = dis tOR ( c o n j o f d i s j ( P ) , c o n j o f d i s j (Q) )| c o n j o f d i s j ( P ) = P;

fun c n f ( P ) = c o n j o f d i s j ( nnf ( r e w r i t e ( P ) ) ) ;




(∗ A p r o p o s i t i o n i n CNF i s a t a u t o l o g yi f f

Every c o n j u n c t i s a t a u t o l o g yi f f

Every d i s j u n c t i n e v e r y c o n j u n c t c o n t a i n s bo th p o s i t i v e and n e g a t i v el i t e r a l s o f a t l e a s t one atom

So we c o n s t r u c t t h e l i s t o f a l l t h e p o s i t i v e and n e g a t i v e atoms i n e v e r yd i s j u n c t t o check whe the r t h e l i s t s a r e a l l e q u a l . We need a b i n a r yf u n c t i o n on l i s t s t o d e t e r m i n e whe the r two l i s t s a r e d i s j o i n t

∗ )

fun i s P r e s e n t ( a , [ ] ) = f a l s e| i s P r e s e n t ( a , b : : L ) = ( a = b ) o r e l s e i s P r e s e n t ( a , L );

fun d i s j o i n t ( [ ] , M) = t r u e| d i s j o i n t ( L , [ ] ) = t r u e| d i s j o i n t ( L as a : : LL , M as b : :MM)=

n o t ( i s P r e s e n t ( a , M) ) a n d a l s on o t ( i s P r e s e n t ( b , L ) ) a n d a l s od i s j o i n t ( LL , MM)

;

(∗ ABHISHEK : D e f i n i n g a t o t a l o r d e r i n g on atoms ( l e x i c o g r a p h i co r d e r i n g on u n d e r l y i n g s t r i n g s ) , and e x t e n d i n g i t t o a l i s t o f atoms .

∗ )




e x c e p t i o n notAtom ;

fun atomLess ( a , b ) = c a s e ( a , b ) o f(ATOM( x ) , ATOM( y ) ) => x<y

| ( , ) => r a i s e notAtom ;

fun l i s t L e s s ( a , b ) = c a s e ( a , b ) o f( , [ ] ) => f a l s e

| ( [ ] , ) => t r u e| ( x : : lx , y : : l y ) => i f a tomLess ( x , y ) t h e n t r u e

e l s e i f a tomLess ( y , x ) t h e n f a l s ee l s e l i s t L e s s ( lx , l y ) ;

(∗ ABHISHEK : Once we have a l i s t o f f a l s i f i e r s , we would want t o removeany d u p l i c a t i o n , f i r s t l y o f atoms w i t h i n a f a l s i f i e r , and s e c o n d l y o ff a l s i f i e r s t h e m s e l v e s .

In o r d e r t o do t h i s , we m a i n t a i n a l l l i s t s i n some s o r t e d o r d e r .I n s t e a d o f s o r t i n g a l i s t w i th a p o s s i b l y l a r g e number o f d u p l i c a t e s ,we check f o r d u p l i c a t e s w h i l e i n s e r t i n g , and omi t i n s e r t i o n i f ap r e v i o u s i n s t a n c e i s d e t e c t e d .

∗ )

fun merge l e s s ( [ ] , l 2 ) = l 2| merge l e s s ( l1 , [ ] ) = l 1| merge l e s s ( x : : l1 , y : : l 2 ) =

i f l e s s ( x , y ) t h e n x : : merge l e s s ( l1 , y : : l 2 )




e l s e i f l e s s ( y , x ) t h e n y : : merge l e s s ( x : : l1 , l 2 )e l s e merge l e s s ( x : : l1 , l 2 ) ;

(∗ ABHISHEK : Claim i s t h a t i f a l l l i s t s a r e b u i l t t h r o u g h t h e abovef u n c t i o n , t h e n t h e r e i s no need t o s o r t o r remove d u p l i c a t e s .

Hence a l l ’@’ o p e r a t i o n s have been r e p l a c e d by merge .∗ )

e x c e p t i o n not CNF ;

fun p o s i t i v e s (ATOM a ) = [ATOM a ]| p o s i t i v e s (NOT (ATOM ) ) = [ ]| p o s i t i v e s (OR ( P , Q) ) = merge atomLess ( p o s i t i v e s ( P ) , p o s i t i v e s (Q) )| p o s i t i v e s ( P ) = r a i s e not CNF;

fun n e g a t i v e s (ATOM ) = [ ]| n e g a t i v e s (NOT (ATOM a ) ) = [ATOM a ]| n e g a t i v e s (OR ( P , Q) ) = merge atomLess ( n e g a t i v e s ( P ) , n e g a t i v e s (Q) )| n e g a t i v e s ( P ) = r a i s e not CNF;

(∗ Check whe the r a f o r m u l a i n CNF i s a t a u t o l o g y ∗ )

fun t a u t (AND ( P , Q) ) = t a u t ( P ) a n d a l s o t a u t (Q)| t a u t ( P ) = (∗ i f i t i s n o t a c o n j u n c t i o n t h e n i t must be a d i s j u n c t ∗ )

n o t ( d i s j o i n t ( p o s i t i v e s ( P ) , n e g a t i v e s ( P ) ) )




;

fun t a u t o l o g y 1 ( P ) =l e t v a l Q = c n f ( P )i n t a u t (Q)end

;

(∗ The main problem wi th t h e above i s t h a t i t ch ec ks whe the r a g i v e np r o p o s i t i o n i s a t a u t o l o g y , b u t whenever i t i s not , i t does n o t y i e l da f a l s i f y i n g t r u t h a s s i g n m e n t . We r e c t i f y t h i s problem below .

∗ )

(∗F i r s t l y , a s i n t h e c a s e o f t h e f u n c t i o n lookup , we w i l l assume a t r u t ha s s i g n m e n t i s a l i s t o f atoms which a r e a s s i g n e d t h e t r u t h v a l u e ” t r u e ”and t h a t any atom t h a t i s n o t p r e s e n t i n t h e l i s t has been a s s i g n e d” f a l s e ” .

Assume Q i s a p r o p o s i t i o n i n CNF . Then i t i s on ly n e c e s s a r y t o l i s t o u ta l l t h e l i s t s o f t r u t h a s s i g n m e n t s t h a t can f a l s i f y Q.

Suppose Q i s i n CNF, b u t n o t n e c e s s a r i l y a t a u t o l o g y . F u r t h e r l e t

Q = AND ( D1 , . . . , Dn )

where each Di i s a d i s j u n c t i o n o f l i t e r a l s . Each Di = P i + Ni where




Pi and Ni a r e t h e l i s t s o f atoms d e n o t i n g t h e p o s i t i v e and n e g a t i v el i t e r a l s r e s p e c t i v e l y .

Q would be ” f a l s i f i e d ” i f a t l e a s t one o f t h e Di can be made f a l s e . Dican be made f a l s e on ly i f i t does n o t c o n t a i n a ” complementa ry p a i r ” ,i . e . t h e r e e x i s t s no atom a such t h a t bo th a and ˜ a o c c u r i n Di . Hencef o r Di t o be f a l s i f i e d i t i s n e c e s s a r y t h a t t h e l i s t s P i and Ni a r ed i s j o i n t ( i f t h e r e i s no atom common t o P i and Ni , t h e r e i s no” complementary p a i r ” i n Di .

S i n c e Di i s a d i s j u n c t i o n o f l i t e r a l s , i t can be f a l s i f i e d on ly bya s s i g n i n g e v e r y l i t e r a l i n Di t h e v a l u e ” f a l s e ” . Th i s can be done on lyby a s s i g n i n g a l l t h e atoms i n P i t h e v a l u e ” f a l s e ” and a l l t h e atomsi n Ni t h e v a l u e ” t r u e ” .

In o t h e r words , i f P i and Ni a r e d i s j o i n t , t h e n Ni i s a t r u t ha s s i g n m e n t which f a l s i f i e s t h e p r o p o s i t i o n Q. We r e f e r t o Ni as aFALSIFIER of Q.

T h e r e f o r e t h e FALSIFIERS of Q a r e e x a c t l y t h e l i s t o f n e g a t i v e atomsof each d i s j u n c t which does n o t c o n t a i n a complementary p a i r . Byc h e c k i n g each d i s j u n c t i n Q we may l i s t o u t ALL t h e p o s s i b l eFALSIFIERS of Q.

I f Q has no FALSIFIER t h e n no d i s j u n c t Di can be made f a l s e i . e . e v e r yd i s j u n c t does i n d e e d have a compementary p a i r . We may t h e n c o n c l u d et h a t Q i s a t a u t o l o g y .

∗ )




(∗ The f o l l o w i n g f u n c t i o n assumes Q i s i n CNF and o u t p u t s a l i s t o f l i s to f atoms t h a t can f a l s i f y Q. I f t h i s l i s t o f l i s t o f atoms i s empty t h e nc l e a r l y Q i s a t a u t o l o g y .

∗ )

fun f a l s i f y (Q) =l e t fun l i s t F a l s i f i e r s (AND (A, B ) ) =

merge l i s t L e s s ( l i s t F a l s i f i e r s (A) , l i s t F a l s i f i e r s (B ) )| l i s t F a l s i f i e r s (A) = (∗ Assume A i s a d i s j u n c t o f l i t e r a l s ∗ )

l e t v a l PLA = p o s i t i v e s (A) (∗ no un iq r e q u i r e d ∗ )v a l NLA = n e g a t i v e s (A)

i n i f d i s j o i n t (PLA , NLA) t h e n [NLA]e l s e [ ]

endi n l i s t F a l s i f i e r s (Q)end

;

fun t a u t o l o g y 2 ( P ) =l e t v a l Q = c n f ( P ) ;

v a l LL = f a l s i f y (Q)i n i f n u l l ( LL ) t h e n ( t r u e , [ ] )

e l s e ( f a l s e , LL )end

;

v a l t a u t o l o g y = t a u t o l o g y 2 ;




(∗We may use t h e t a u t o l o g y c h e c k e r t o p rove v a r i o u s a rgumen t sl o g i c a l l y v a l i d o r l o g i c a l l y i n v a l i d . An argument c o n s i s t so f a s e t o f p r o p o s i t i o n s c a l l e d t h e ” h y p o t h e s e s ” and a ( s i n g l e )p r o p o s i t i o n c a l l e d t h e ” c o n c l u s i o n ” . Loose ly speak ing , an argumenti s s i m i l a r t o a theorem of m a t h e m a t i c s . The argumenti s l o g i c a l l y v a l i d i f t h e c o n c l u s i o n i s a l o g i c a l consequence o fo f t h e h y p o t h e s e s . More a c c u r a t e l y , i f i n e v e r y t r u t h a s s i g n m e n twhich makes a l l t h e h y p o t h e s e s t r u e , t h e c o n c l u s i o n i s a l s o i n v a r i a b l yt r u e t h e n t h e argument i s l o g i c a l l y v a l i d .

S y m b o l i c a l l y i f H1 , . . . , Hm a r e p r o p o s i t i o n s and C i s a n o t h e rp r o p o s i t i o n t h e n t h e argument (H1 , . . . , Hm , C) i s l o g i c a l l yv a l i d ( e q u i v a l e n t l y , C i s a l o g i c a l consequence o f H1 , . . . , Hm )i f and on ly i f t h e ( compound ) p r o p o s i t i o n

( H1 /\ . . . /\ Hm) => C

i s a t a u t o l o g y .

An argument which i s n o t l o g i c a l l y v a l i d i s l o g i c a l l yi n v a l i d . In p a r t i c u l a r i f t h e r e e x i s t s a t r u t h a s s i g n m e n t unde r whicha l l t h e h y p o t h e s e s a r e t r u e b u t t h e c o n c l u s i o n i s f a l s e , t h e n t h eargument i s i n v a l i d .

Any argument i s t r i v i a l l y l o g i c a l l y v a l i d i f t h e r e i s no t r u t ha s s i g n m e n t unde r which e v e r y h y p o t h e s i s i s t r u e . In o t h e r words ,




i f t h e s e t o f h y p o t h e s e s i s an i n c o n s i s t e n t s e t t h e n r e g a r d l e s so f what t h e c o n c l u s i o n i s , t h e argument i s a lways l o g i c a l l y v a l i d .The s e t o f h y p o t h e s e s H1 , . . . , Hm i s ” i n c o n s i s t e n t ” i f and on ly i f( H1 /\ . . . /\ Hm) i s a ” c o n t r a d i c t i o n ” ( i t i s f a l s e f o r e v e r y t r u t ha s s i g n m e n t ) .

∗ )t y p e Argument = Prop l i s t ∗ Prop ;

fun showArg (A: Argument ) =l e t fun p r i n t A r g (A: Argument a s ( [ ] , c ) ) =

( drawChar (#"-" , 8 0 ) ; p r i n t ("\n" ) ;show ( c ) ; p r i n t ("\n\n" )

)| p r i n t A r g (A: Argument a s ( p : : p l i s t , c ) ) =

( show ( p ) ; p r i n t ("\n" ) ;p r i n t A r g ( p l i s t , c )

)i n ( p r i n t ("\n\n" ) ; p r i n t A r g (A) )end

;

fun l e f t R e d u c e ( F ) =l e t e x c e p t i o n e m p t y l i s t ;

fun l r ( [ ] ) = r a i s e e m p t y l i s t| l r ( [ a ] ) = a| l r ( a : : L ) = F ( a , l r ( L ) )




i n l rend

;

v a l bigAND = l e f t R e d u c e (AND) ;

fun V a l i d ( ( L , P ) : Argument ) =i f n u l l ( L ) t h e n t a u t o l o g y ( P )e l s e t a u t o l o g y ( IMP ( bigAND ( L ) , P ) )

;

fun f a l s i f y A r g ( ( L , P ) : Argument ) =i f n u l l ( L ) t h e n f a l s i f y ( c n f ( P ) )e l s e f a l s i f y ( c n f ( IMP ( bigAND ( L ) , P ) ) )

;

end (∗ s t r u c t ∗ ) ;

(∗ open PL ; ∗ )







7. Lecture 7: Propositional Unsatisfiability

Lecture 7: PropositionalUnsatisfiability

Wednesday 10 August 2011




1. Tautology Checking

2. CNFs: Set of Sets of Literals

3. Propositional Resolution

4. Clean-up

5. The Resolution Method

6. The Algorithm

7. Resolution Examples: Biconditional

8. Resolution Examples: Exclusive-Or

9. Resolution Refutation: 1

10. Resolution Refutation: 2

11. Resolvent as Logical Consequence

12. Logical Consequence by Refutation




Tautology Checking1. Involves conversion of IMP (bigAND H, conc1) into CNF

which increases the size of the formula.2. Involves checking falsifiability of the argument.3. CNF can be obtained more easily for the formula (. . . ((φ1 ∧φ2) ∧ φ3) ∧ . . . ∧ φn) ∧ ¬ψ•Convert each individual φi and ¬ψ to CNF• and then append all the lists to obtain the required list.

4. More efficient to use theorem 4.3.2 if the technique involvesCNF.




CNFs: Set of Sets of LiteralsGiven a formula φ whose CNF is

γ ≡∧

1≤i≤mδi

where for each i, 1 ≤ i ≤ m,

δi ≡∨

1≤j≤ni

λi,j

is a disjunction of literals, we will often write γ as a set of setsof literals as follows:

γ = λ1,1, . . . , λ1,n1, · · · , λm,1, . . . , λm,nm




Note: For the sake of brevity we will abuse notation by identifying a clause (set of literals) with theformula denoting their disjunction and a set of clauses with the formula denoting their conjunction.




Propositional ResolutionTo show Γ |= ψ we show that

∧Γ∧¬ψ is false by first transform-

ing∧

Γ ∧ ¬ψ to a formula in CNF.This CNF is represented as a set of sets of literals. Let ∆ be theset of sets of literals.1. Each set C ∈ ∆ is called a clause.2. Each clause in ∆ represents a disjunction of literals.3. The empty clause represents a contradiction.4. The unsatisfiability of the set ∆ is shown by deriving the

empty clause.




Clean-upLet ∆ be a finite set of clauses.1. For all clauses C, C ′, if C ⊆ C ′, then C ′ may be deleted from

∆ without affecting logical equivalence.2. Any clause containing complementary pairs of literals, may

be deleted from ∆ without affecting logical equivalence.3. From any clause, duplicate occurrences of a literal may be

deleted without affecting logical equivalence.The resulting clause set ∆′ is said to be clean.

∧C∈∆

∨L∈CL⇔

∧C ′∈∆′

∨L′∈C ′L

′




The Resolution MethodFor any clean set ∆ and an atom p let Λ = C ∈ ∆ | p ∈ C andΛ = C ∈ ∆ | ¬p ∈ CSince ∆ is a clean set1. Λ ∩ Λ = ∅.2. However C and C may not be disjoint.3. For each C ∈ Λ and C ∈ Λ, p 6∈ C and ¬p 6∈ C.The new set of clauses obtained after resolution

resolve(∆, p)df= (∆− (Λ ∪ Λ))∪

D | D = (C − p) ∪ (C − ¬p), C ∈ Λ, C ∈ Λ

is called the resolvent.




Lemma 7.1 If C1 = λ1,i | 1 ≤ i ≤ m and C2 = λ2,j | 1 ≤ j ≤ n are clauses such thatp ∈ C1 − C2 and ¬p ∈ C2 − C1 and D = (C1 − p) ∪ (C2 − ¬p). Then

1. C1 ∧ C2 ⇒ D and

2. resolve preserves satisfiability, i.e. every truth assignment that satisfies both C1 and C2 alsosatisfies D.

Proof: From the semantics of propositional logic it follows that for any truth assignment τ , τ C1 ∧ C2 if and only if τ C1 and τ C2. Hence we may prove both parts of the lemma byconsidering an arbitrary truth assignment τ such that τ C1 ∧ C2. It suffices to show (for bothparts) that τ D. τ C1 implies that for some i0, 1 ≤ i0 ≤ m, T Jλ1,i0Kτ = 1 and for some j0,1 ≤ j0 ≤ n, T Jλ2,j0Kτ = 1. Further since p and ¬p are complementary, it is impossible that bothp ≡ λ1,i0 and ¬p ≡ λ2,j0 hold simultaneously. Hence at least one of the two literals λ1,i0, λ2,j0 is inD. It follows therefore thatT JDKτ = (

∑1≤i≤m

T Jλ1,iKτ )+(∑

1≤j≤n,j 6=j0

T Jλ2,jKτ ) = 1 if p 6≡ λ1,i0 and

T JDKτ = (∑

1≤i≤m,i 6=i0

T Jλ1,iKτ )+(∑

1≤j≤nT Jλ2,jKτ ) = 1 if ¬p 6≡ λ2,0 and hence τ D. QED




The Algorithm

Require: ∆ a clean set of clauses1: while ( 6∈ ∆) ∧ ∃(p,¬p) ∈ ∆ do2: ∆′ := resolve(∆, p)3: ∆ := Clean-up(∆′)4: end while

Note:1. is the empty clause which represents the proposition ⊥ (it

represents the disjunction of an empty set of literals).2. The presence of the empty clause in a set of clauses also

indicates the unsatisfiability of the set of clauses.




Lemma 7.2 Let ∆1 = Clean-up(resolve(∆0)). Then

1. ∆0 ⇒ ∆1 and

2. if ∆0 is satisfiable then ∆1 is satisfiable.

Proof: The proof follows directly from the proof of lemma 7.1 and from problem 5 of exercise 4.1,where it should have been shown that both ∧ and ∨ preserve logical implication. QED

Corollary 7.3 If ∆1 and ∆0 are as in lemma 7.2 then if ∆1 is unsatisfiable then so is ∆0.

2

Theorem 7.4 Given a clean non-empty set ∆0 of non-empty clauses, the propositional resolutionalgorithm terminates in at most |atoms(∆0)| iterations, deriving either an empty clause or a set ofnon-empty clauses which are satisfied by every model of the original set ∆0.

Proof: Since in each iteration one atom and its negation are completely eliminated, |atoms(∆0)| isthe number of iterations possible. Further by applying lemma 7.2 to the result of each iteration weget that satisfiability and logical implication are preserved. QED




7.1. Space Complexity of Propositional Resolution.

Assume n is the number of atoms of which ∆ is made up. After some iterations of resolution andcleanup, assume there are k distinct atoms in the set of clauses on which resolution is applied. Afterperforming the cleanup procedure there could be at most 2k clauses with each clause containing atmost k literals. Assuming each literal occupies a unit of memory, the space requirement is given bysize(∆) = 2kk literals.

For any complementary pair (p,¬p), it is possible that at most half of the 2k (i.e. 2k−1) clausescontain p and the other half contain ¬p. This would be the worst-case scenario, as it yields themaximum number of new clauses. Therefore in performing a single step of resolution over allpossible pairs of clauses to yield a new set ∆′ of clauses, a maximum of 2k−1×2k−1 unions of distinctpairs of clauses needs to be performed. Before applying the clean-up procedure the space requiredcould be as high as 2k−1×2k−1 = 22(k−1) > 2kk for k > 4. But since ∆′ is made up of at most (k−1)

atoms, size(∆′) ≤ 2k−1(k − 1), after clean-up the space requirement reduces to 2k−1(k − 1). Sincek ≤ n the maximum space required after the first application of resolution and before cleaning upexceeds the space required for all other iterations and is bounded by 22(n−1) = O(22n).




7.2. Time Complexity of Propositional Resolution

Given a space of 2kk to represent the clauses containing at most k atoms, we require a time pro-portional to this amount of space in order to identify which clauses have to be resolved against aparticular complementary pair. After resolution we create a space of 22(k−1) which has to be scannedfor the cleanup operations. Hence the amount of time required to perform a step of resolution andthe amount of time required to perform the cleanup are both proportional to 22(k−1). Hence the totaltime required for performing resolution followed by cleanup in n iterations (which is the maximumpossible) is given by

T (n) ≥∑n

k=1 22k−2

which is clearly exponential.

Hence both the worst case time and space complexities are exponential in the number of atoms.




Resolution Examples: BiconditionalExample 7.5 Sometimes there may be more than one comple-mentary pair of literals in the same pair of clauses. Considerthe biconditional operator (↔) on atomic propositions p and q.We have

p↔ q ⇔ (p ∨ ¬q) ∧ (¬p ∨ q) ≡ p,¬q, ¬p, q

Applying resolution on the pair of literals p and ¬p we obtain theclause set which after clean-up yields the empty set of clauses.

q,¬q ≡ ≡ >

There is really nothing more to this resolvent than that > is alogical consequence of any proposition (including ⊥).




Resolution Examples: Exclusive-OrExample 7.6 The exclusive-or operation ⊕ is simply the nega-tion of the biconditional operator↔. Hence we have

p⊕ q ⇔ (p ∨ q) ∧ (¬p ∨ ¬q) ≡ p, q, ¬p,¬q

which on resolution on the pair (p,¬p) and subsequent clean-up again yields the empty set of clauses.

q,¬q ≡ ≡ >




Resolution Refutation: 1Example 7.7 Consider the simple logical consequence

p ∧ q |= p

which we prove by resolution refutation. The set of clauses rep-resenting the hypothesis and the negation of the conclusion isp, q, ¬p. Resolving on the pair (p,¬p) yields

, q

Notice that ≡ ⊥ ≡ , q.Since for any clause C 6= ∅, C ⊇ , the clean-up always re-duces every set of clauses ∆ to whenever ∈ ∆.




Resolution Refutation: 2Example 7.8 Consider the simple logical consequence

p ∧ q |= p↔ q

which we prove by resolution refutation. Negating the conclu-sion yields p⊕ q ≡ p, q, ¬p,¬q. The set of clauses rep-resenting the hypothesis and the negation of the conclusion isp, q, p, q, ¬p,¬q which after clean-up yields

p, q, ¬p,¬q

Resolving on the pair (p,¬p) produces

q, ¬q

and then on the pair (q,¬q) produces the empty clause .




Resolvent as Logical ConsequenceThe set of clauses resulting from any application of resolutionis a set of clauses that represents a logical consequence of theoriginal set of clausesExample 7.9 We use resolution to prove

(p ∨ q) ∧ (p ∨ r) ∧ ¬p⇔ q ∧ r

The set of clauses p, q, p, r, ¬p may be resolved on thepair (p,¬p) to yield the set

q, r ≡ q ∧ r

Note that we have resolved all occurrences of the complemen-tary pair (p,¬p).




Logical Consequence by RefutationExample 7.10 In example 7.9 since we resolve all occur-rences of the complementary pair (p,¬p), we could nothave proved that ¬p ∧ q ∧ r is also a logical consequence of(p ∨ q) ∧ (p ∨ r) ∧ ¬p which it indeed is. We may use refutationfor this purpose by noting that ¬(¬p ∧ q ∧ r) ≡ p,¬q,¬r. Wethen resolve the set p, q, p, r, ¬p, p,¬q,¬r on the pair(p,¬p) to obtain the set q, r, ¬q,¬r which may be re-solved on the pairs (q,¬q) and (r,¬r) subsequently to yield theempty clause.










8. Lecture 8: Analytic Tableaux

Lecture 8: Analytic TableauxFriday 12 August 2011




1. Against Resolution

2. The Analytic Tableau Method

3. Basic Tableaux Facts

4. Tableaux Rules

5. Structure of the Rules

6. Tableaux

7. Slim Tableaux




Against Resolution1. For any argument, it was still necessary to transform the ar-

gument into a mammoth formula in CNF in order to be ableto perform resolution.

2. The numbers of clauses and sizes of clauses could tem-porarily increase as a result of resolution.

3. Termination relied on the reduction of the number of atomsat each step of resolution.

4. Resolution also requires a clean-up of the initial set ofclauses to work correctly.




The Analytic Tableau Method1. Like resolution, the tableau method also checks for the un-

satisfiability of a set of formulae.2. Like resolution, each step of the method preserves satisfiabil-

ity.3. Unlike resolution(a) There are no transformations of mammoth formulae into

a normal form (esp. the use of distributivity which can in-crease sizes of formulae).

(b) It works with the list of formulae [φ1, . . . , φn,¬ψ] directly bybreaking up the formulae and building a tree called thetableau

(c) It relies on the symmetry between truth and falsehood at asemantic level to check for satisfiability or unsatisfiability.




Basic Tableaux FactsTheorem 8.1 For any truth assignment τ , and formulae φ, ψ,

T J¬φKτ = 1 ⇒ T JφKτ = 0T J¬φKτ = 0 ⇒ T JφKτ = 1T Jφ ∧ ψKτ = 1 ⇒ T JφKτ = 1 & T JψKτ = 1

T Jφ ∧ ψKτ = 0 ⇒ T JφKτ = 0 T JψKτ = 0

T Jφ ∨ ψKτ = 1 ⇒ T JφKτ = 1 T JψKτ = 1T Jφ ∨ ψKτ = 0 ⇒ T JφKτ = 0 & T JψKτ = 0

T Jφ→ ψKτ = 1 ⇒ T JφKτ = 0 T JψKτ = 1T Jφ→ ψKτ = 0 ⇒ T JφKτ = 1 & T JψKτ = 0

T Jφ↔ ψKτ = 1 ⇒ T JφKτ = 1 = T JψKτ T JφKτ = 0 = T JψKτT Jφ↔ ψKτ = 0 ⇒ T JφKτ = 0 = T JψKτ T JφKτ = 1 = T JψKτ




Tableaux Rules¬¬. ¬¬φ

φ

∧. φ ∧ ψφψ

¬ ∧ . ¬(φ ∧ ψ)

¬φ ¬ψ

∨. φ ∨ ψφ ψ

¬ ∨ . ¬(φ ∨ ψ)

¬φ¬ψ

→ .φ→ ψ

¬φ ψ¬ → .

¬(φ→ ψ)

φ¬ψ

↔ .φ↔ ψ

φ ∧ ψ ¬φ ∧ ¬ψ¬ ↔ .

¬(φ↔ ψ)

φ ∧ ¬ψ ¬φ ∧ ψ




Structure of the RulesThe tableaux rules are of two kinds:Elongation rules Each of the rules ¬¬, ∧, ¬∨ and ¬ → elon-

gates the path without increasing the size of the tableau(since the original formula to which a rule was applied maybe discarded)

Branching rules These are the rules ¬∧, ∨, ¬ →, ↔ and ¬ ↔which lead to branching of the tableau.




Tableaux1. A tableau is a tree where each path of the tableau represents

a conjunction of “unbroken” formulae.2. Each application of the tableaux rules preserves satisfiability

of the conjunction of “unbroken” formulae in each path.3. A path of the tableau is closed if it contains a complementary

pair (the conjunction of the formulae in the path is clearlyunsatisfiable).

4. The result of applying a tableau rule to an ancestor node hasto be distributed in all branches of its descendants.

5. A tableau is closed if every path in the tableau is closed sig-nifying that the original set of formulae is unsatisfiable.




Slim Tableaux1. Any formula which has been broken up by a tableau rule may

be discarded.2. Any branch which has been closed may be discarded.3. Any formula which dominates several branches of the

tableau creates multiple copies (one in each branch of itsdescendants) when it is broken up.

By applying the elongation rules first the number of branchesover which elongation rules have to be replicated can be reduced










9. Lecture 9: Consistency & Completeness

Lecture 9: Consistency &Completeness

Wednesday 17 August 2011




1. Tableaux Rules: Restructuring

2. Tableaux Rules: 2

3. Tableau Proofs

4. Consistency

5. Unsatisfiability

6. Hintikka Sets

7. Hintikka’s Lemma

8. Tableaux and Hintikka sets

9. Completeness




Tableaux Rules: RestructuringIn general the elongation and branching rules of the tableaulook like this

Elongation.φ

ψχ

Branching.φ

ψ χ

where ψ and χ are subformulae of φ.Let Γ = ∆ ∪ φ where φ 6∈ ∆ be a set of formulae. It willbe convenient to use sets of formulae in the tableau rules. Theelongation and branching rules are rendered as follows respec-tively

Elongation.∆ ∪ φ

∆ ∪ ψ, χBranching.

∆ ∪ φ∆ ∪ ψ ∆ ∪ χ




Tableaux Rules: 2⊥. ∆ ∪ φ,¬φ

⊥¬¬. ∆ ∪ ¬¬φ

∆ ∪ φ∧. ∆ ∪ φ ∧ ψ

∆ ∪ φ, ψ¬∧ . ∆ ∪ ¬(φ ∧ ψ)

∆ ∪ ¬φ ∆ ∪ ¬ψ∨. ∆ ∪ φ ∨ ψ

∆ ∪ φ ∆ ∪ ψ¬∨. ∆ ∪ ¬(φ ∨ ψ)

∆ ∪ ¬φ,¬ψ

→ .∆ ∪ φ→ ψ

∆ ∪ ¬φ ∆ ∪ ψ¬→ .

∆ ∪ ¬(φ→ ψ)∆ ∪ φ,¬ψ

↔ .∆ ∪ φ↔ ψ

∆ ∪ φ ∧ ψ ∆ ∪ ¬φ ∧ ¬ψ¬↔ .

∆ ∪ ¬(φ↔ ψ)∆ ∪ φ ∧ ¬ψ ∆ ∪ ¬φ ∧ ψ




Tableau Proofs1. A tableau is a tree rooted at a node containing a set Γ of

formulas2. Each application of

an elongation ruleΓ

Γ′to a leaf Γ of the tableau extends the

path to Γ′,

a branching ruleΓ

Γ′ Γ′′to a leaf Γ of the tableau extends the

tableau to two leaves Γ′ and Γ′′.3. A path of the tableau is closed if its leaf is ⊥.4. The tableau is closed if every path is closed, otherwise the

the tableau is open.




ConsistencyDefinition 9.1 A set Γ of formulas is consistent if it is satisfiablei.e. there is a truth assignment under which every formula of Γis true. Otherwise, it is inconsistent.Fact 9.2 Every non-empty subset of a consistent set is alsoconsistent

Lemma 9.3 Each tableau rule preserves satisfiability in the fol-lowing sense.

Elongation RulesΓ

Γ′If the numerator Γ is satisfiable then so is

the denominator Γ′.

Branching Rules (Γ

Γ′ Γ′′) If the numerator Γ is satisfiable then

at least one of the denominators Γ′ or Γ′′ is satisfiable.




Proof outline of lemma 9.3

Proof: It may be shown that for any truth assignment τ ,

Elongation RulesΓ

Γ′. if every formula in Γ is true then every formula in Γ′ is also true under τ .

Branching RulesΓ

Γ′|Γ′′. if every formula in Γ is true under τ then every formula in Γ′ or every formula in Γ′′ is true

under τ .

QED




UnsatisfiabilityDefinition 9.4 A tableau is completed if no leaf in any path maybe extended.

Corollary 9.5 If Γ is satisfiable then there exists a completedtableau rooted at Γ which has a satisfiable leaf.

Corollary 9.6 A set Γ is unsatisfiable if there exists a closedtableau rooted at Γ.

Question. If a completed tableau rooted at Γ is closed could there beother completed tableaux rooted at Γ which might be open?




Hintikka SetsDefinition 9.7 A finite or infinite set Γ is a Hintikka set if1.⊥ 6∈ Γ and for any p ∈ A, p,¬p 6⊆ Γ,2. If φ ≡ ψ χ ∈ Γ for ∈ ∧,¬∨,¬ → then ψ′, χ′ ⊆ Γ,3. If φ ≡ ψ ⊕ χ ∈ Γ for ⊕ ∈ ∨,¬∧,→,↔,¬ ↔ then ψ′, χ′ ∩

Γ 6= ∅where ψ′ and χ′ are defined by the following table

φ ≡ ψ χ ψ′ χ′ φ ≡ ψ ⊕ χ ψ′ χ′

ψ ∧ χ ψ χ ¬(ψ ∧ χ) ¬ψ ¬χ¬(ψ ∨ χ) ¬ψ ¬χ ψ ∨ χ ψ χ¬(ψ → χ) ψ ¬χ ψ → χ ¬ψ χ

ψ ↔ χ ψ ∧ χ ¬ψ ∧ ¬χ¬(ψ ↔ χ) ¬ψ ∧ χ ψ ∧ ¬χ




Hintikka’s LemmaLemma 9.8 Every Hintikka set is satisfiable.

Proof: Let Γ be a Hintikka set. For any atom p, since p,¬p 6⊆Γ, consider the following truth assignment τ .1. τ (p) = 1 if p ∈ Γ,2. τ (p) = 0 if ¬p ∈ Γ and3. if p,¬p ∩ Γ = ∅ then choose any value (say 1 for definite-

ness).We may then show by induction on the degree of formulae in Γthat each formula in Γ is satisfiable. QED




Tableaux and Hintikka setsTheorem 9.9 Let Γ0,Γ1 · · · ,Γn be an open path of acompleted tableau. Then

⋃0≤m≤n

Γm is a Hintikka set.

Proof: We may prove that each rule in Tableaux Rules: 2creates a path for the construction of Hintikka sets. QED




CompletenessTheorem 9.10 Completeness of the Tableau Method1. If φ is a tautology then every completed tableau rooted at¬φ is closed.

2. Every tautology is provable by the tableau method.

Proof:

1. Suppose T is a completed tableau rooted at ¬φ which isopen. Then by corollary 9.5 ¬φmust be satisfiable and henceφ cannot be a tautology. Hence T must be closed.

2. If φ is a tautology that cannot be proved by the tableaumethod, there must exist a completed tableau rooted at ¬φwhich has an open path. But that implies ¬φ is satisfiablewhich implies that φ is not a tautology.

QED










10. Lecture 10: The Compactness Theorem

Lecture 10: The CompactnessTheorem

Friday 19 August 2011




1. Satisfiability of Infinite Sets

2. The Compactness Theorem

3. Inconsistency

4. Consequences of Compactness




Satisfiability of Infinite SetsFrom corollaries 9.5 and 9.6 we haveCorollary 10.1 A finite set Γ is unsatisfiable iff there is a closedtableau rooted at Γ.

Corollary 10.2 If a finite set Γ is satisfiable then everynonempty subset of Γ is satisfiable too.

Question 1. Suppose Γ were a denumerable (countably infinite) set.Under what conditions is Γ satisfiable?Question 2. Suppose every subset of a denumerable set Γ is satisfi-able. Then is Γ necessarily satisfiable?Question 3. Suppose that only all finite subsets of a denumerable setΓ are satisfiable. Then is Γ satisfiable?




The Compactness TheoremTheorem 10.3 (The Compactness Theorem) A (countably) infi-nite set is satisfiable if all its nonempty finite subsets are satis-fiable.

2

Corollary 10.4 Any (finite or infinite) set of formulae is satisfi-able iff all its non-empty finite subsets are satisfiable.Note:• If Γ is a countably infinite set then it can be placed in 1-1

correspondence with the set N of naturals and hence thereis some enumeration of its formulae and each formula carriesan unique index from N.




Proof of the Compactness Theorem

Proof: Let Γ be a countably infinite set of propositions. Then clearly Γ may be enumerated in someorder, say

φ0, φ1, φ2, . . . (7)

where each φj has the unique index j ≥ 0. For each m ≥ 0, let Γm = φ0, φ1, φ2, . . . , φm.

Claim. Every nonempty finite subset of Γ is satisfiable iff for each m ≥ 0, Γm is satisfiable.` (⇒) clearly holds since each Γm is a finite subset.(⇐) Let ∅ 6= ∆ ⊆f Γ. Let k ≥ 0 be the index of the formula with the highest index in ∆.Clearly ∆ ⊆f Γk. Since the set Γk is satisfiable, by corollary 10.2, ∆ is also satisfiable. a

Hence it suffices to prove that if each of the Γi, i ≥ 0, is satisfiable then Γ is satisfiable.

Consider a tableau T0 rooted at Γ0 constructed using the tableau rules. Since Γ0 is satisfiable, T0

has one or more open paths. Extend each of the open paths with the formula φ1 and continue thetableau. The resulting tableau T1 is for the set Γ1 and it does not close either. Hence tableaux Tk foreach Γk may be extended to yield open tableaux Tk+1 for Γk+1.Consider the final tableau T obtained by this process of extension. T is a finitely branching infinitetree with at least one path that does not close. By Konig’s Lemma 2.17 there is an infinite path. Let




Φ be the set of all formulae in this path. Since this path contains each of the formulae φi ∈ Γ, wehave Γ ⊆ Φ and further Φ is a Hintikka set. By Hintikka’s lemma 9.8 this set must be satisfiable.QED




InconsistencyCorollary 10.5 A set Γ is inconsistent if some nonempty finitesubset of Γ is unsatisfiable.

Proof: Follows from the compactness theorem 10.3 and itscorollary 10.4. QED

Facts 10.61. Any superset of an inconsistent set is also inconsistent.2. Any set containing a complementary pair is inconsistent.3. (see table) If ∆ ∪ ψ′, χ′ is inconsistent then so is ∆ ∪ φ

where φ ≡ ψ χ4. (see table) If both ∆∪ψ′ and ∆∪χ′ are inconsistent then

so is ∆ ∪ φ where φ ≡ ψ ⊕ χ.




Consequences of CompactnessCorollary 10.7 Given a finite or infinite set Γ, and a formula ψ1. Γ ∪ ¬ψ is inconsistent iff there exists ∆ = φi | 1 ≤ i ≤n ⊆f Γ, n ≥ 0, such that ∆ ∪ ¬ψ is inconsistent.

2. Γ |= ψ iff ∆ |= ψ iff (∧

∆)→ ψ is a tautology, for some ∆ =φi | 1 ≤ i ≤ n ⊆f Γ.

Hence1. to show that an argument is valid it suffices to prove that the

conclusion follows from a finite subset of the hypotheses.2. to show invalidity of an argument it suffices to find a finite

subset of the hypotheses which are inconsistent with the con-clusion.










11. Lecture 11: Maximally Consistent Sets

Lecture 11: Maximally Consistent SetsTuesday 23 August 2011




1. Consistent Sets

2. Properties of Finite Character: 1



5. Maximally Consistent Sets

6. Lindenbaum’s Theorem




Consistent SetsLemma 11.1 If Γ is a consistent set then for any formula φ atleast one of the two sets Γ1 = Γ ∪ φ or Γ0 = Γ ∪ ¬φ isconsistent.

2




Proof of lemma 11.1

Proof: Suppose Γ is consistent but both Γ0 and Γ1 are inconsistent. Then by compactness and bydefinition 10.5 there must be consistent finite subsets ∆0,∆1 ⊆f Γ such that Γ′0 = ∆0 ∪ ¬φ andΓ′1 = ∆1 ∪ φ are both inconsistent. Let ∆01 = ∆0 ∪ ∆1. By facts 10.6.1 both ∆01 ∪ ¬φ and∆01 ∪ φ are inconsistent and hence unsatisfiable whereas ∆01 ⊆f Γ is consistent. Hence there isa truth assignment τ which satisfies ∆01, and such that

T JφKτ = 0 = T J¬φKτ

which is impossible. QED




Properties of Finite Character: 1Definition 11.2 A property p of sets is called a property of finitecharacter if for any set S, S has the property p iff every finitesubset of S has the property p.

Notation: S p denotes the statement “S has property p”.




Properties of Finite Character: 2Example 11.31. The property of a partially ordered set being totally ordered

is a property of finite character. That is, if 〈P,≤〉 is a partiallyordered set, then P is totally ordered (i.e. for every a, b ∈ P ,a ≤ b or b ≤ a) iff every finite subset of P is totally ordered.

2. However the property of a totally ordered set 〈T,≤〉 beingwell-ordered is not a property of finite character since everyfinite subset of T is well-ordered, but T itself may not be well-ordered (e.g. take the set of integers Z under the usual ≤relation).

3. By the corollary 10.4 to the compactness theorem, consis-tency/satisfiability is a property of finite character.




Properties of Finite Character: 3Theorem 11.4 (Tukey’s Lemma) For any denumerable universeU and any property p of finite character of subsets of U , any setS ⊆ U such that S p can be extended to a maximal set S∞such that S ⊆ S∞ ⊆ U with S∞ p.




Proof of Tukey’s Lemma

Proof: Let S ⊆ U be a set with S p. Since U is denumerable its elements can be enumerated insome order

a1, a2, a3, . . . (8)Starting with S = S0 consider the sets Si+1, i ≥ 0

Si+1 =

Si ∪ ai+1 if Si ∪ ai+1 p

Si otherwiseClearly we have the infinite chain

S = S0 ⊆ S1 ⊆ S2 ⊆ · · ·such that for each Si, Si p. Let S∞ =

⋃i≥0 Si.

Claim. S∞ p.` Let T ⊆f S∞, then T ⊆f Si for some i ≥ 0. Since Si p, p is a property of finite characterand T ⊆f Si, T p. Hence every finite subset of S∞ has property p. Therefore since p is aproperty of finite character, S∞ p. a

Claim. S∞ is maximal.` Suppose there exists an element a ∈ U such that S∞ ∪ a p. We know from the




claim above that S∞ p and from the construction of each Si, that Si p for each i ≥ 0.Clearly a = aj+1 for some j ≥ 0 in the enumeration (8). Hence Sj ∪ aj+1 p. ButSj ∪ aj+1 = Sj+1 ⊆ S∞. Hence S∞ = S∞ ∪ a and S∞ is maximal. a

QED




Maximally Consistent SetsDefinition 11.5 A set ∆ is a maximally consistent set if it is sat-isfiable and no proper superset of ∆ is consistent.

Corollary 11.6 For any maximally consistent set ∆, and anyformula φ, either φ ∈ ∆ or ¬φ ∈ ∆ but not both.




Lindenbaum’s TheoremTheorem 11.7 (Lindenbaum’s Theorem) Every consistent setcan be extended to a maximally consistent set. More preciselyfor every consistent Γ there exists a maximally consistent setΓ∞ ⊇ Γ.

Proof: By definition 11.2 and corollary 10.4 consistency ofsets of formulae is a property of finite character in the universeP0. From theorem 11.4 it follows that any set Γ ⊆ P0 may beextended to a maximally consistent set Γ∞. QED




Alternative Proof of Lindenbaum’s Theorem ab initio

Proof: Let Γ be a consistent set. Since P0 is generated from a countably infinite set of atoms anda finite set of operators, P0 is a countably infinite set (see problem 1 of exercise 2.1). Hence theformulae of P0 can be enumerated in some order

φ1, φ2, φ3, . . . (9)

Starting with Γ = Γ0 consider the sets

Γi+1 =

Γi ∪ φi+1 if Γi ∪ φi+1 is consistentΓi otherwise

Clearly we have the infinite chain

Γ = Γ0 ⊆ Γ1 ⊆ Γ2 ⊆ · · ·

such that each Γi is consistent. Let Γ∞ =⋃i≥0 Γi.

Claim. Γ∞ is consistent.` Let ∆ ⊆f Γ∞, then since ∆ is finite, it must be the subset of some Γi. Since Γi is consistent,so is ∆. Hence every finite subset of Γ∞ is consistent. Therefore by the compactness theoremΓ∞ is consistent. a




Claim. Γ∞ is maximal.` Suppose there exists a formula φ such that Γ∞ ∪ φ is consistent. Clearly φ ≡ φi+1 forsome i ≥ 0 in the enumeration (9). Since Γ∞∪φi+1 is consistent, by fact 9.2 Γi∪φi+1 ⊆Γ∞∪φi+1 is also consistent. But then Γi+1 = Γi∪φi+1 ⊆ Γ∞ and hence Γ∞ = Γ∞∪φ.Hence Γ∞ is maximal. a

QED




Exercise 11.1

1. Let 〈P,≤〉 be a finite partial order. Prove using Konig’s lemma that every element of P liesbetween a maximal element and a minimal element i.e. for each a ∈ P there exist a minimalelement l ∈ P and a maximal element u ∈ P such that l ≤ a ≤ u.

2. Prove that every maximally consistent set is a Hintikka set.

3. For any given consistent set Γ of formulae, there may exist more than one maximally consistentextension. Give examples of Γ and ψ such that there are two maximally consistent extensions,Γ∞ and Γ′∞ with ψ ∈ Γ∞ and ¬ψ ∈ Γ′∞.

4. (Tarski’s theorem) For any set Γ, of formulae, the set Γ|= called the closure under logical con-sequence is defined as

Γ|= = ψ ∈ P0 | ∆ |= ψ, for some ∆ ⊆f ΓLet MC (Γ) = Γ∞ | Γ∞ is a maximally consistent extension of Γ be the set of all maximallyconsistent extensions of Γ. Prove that

Γ|= =⋂

Γ∞∈MC (Γ)

Γ∞

for every consistent set Γ.




5. (Interpolation) For any finite set V ⊆f A, define TV = τV | τV : V → ⊥,> andfor any formula χ such that V ⊆ atoms(χ) and any τV ∈ TV , let τV (χ) denote the formulaobtained from χ by replacing all occurrences of each atom p ∈ V by the atom τV (p). Further letTV (χ) = τV (χ) | τV ∈ TV .Let X, Y, Z ⊆f A be pairwise disjoint (finite) sets of atoms and let φ and ψ be formulae suchthat

• atoms(φ) ⊆ X ∪ Y ,

• atoms(ψ) ⊆ Z ∪ Y and

• |= φ→ ψ

(a) Let λdf=∨TX(φ) and ρ

df=∧TZ(ψ). Then prove that

i. |= φ→ λ

ii. |= λ→ ρ

iii. |= ρ→ ψ

(b) Prove that for any formula θ with atoms(θ) ⊆ Y , if |= φ→ θ and |= θ → ψ then |= λ→ θ

and |= θ → ρ. θ is called an interpolant of φ and ψ.







12. Lecture 12: Formal Theories

Lecture 12: Formal TheoriesWednesday 24 August 2011




1. Introduction to Reasoning

2. Proof Systems: 1

3. Requirements of Proof Systems

4. Proof Systems: Desiderata

5. Formal Theories

6. Formal Language

7. Axioms and Inference Rules

8. Axiomatic Theories

9. Syntax and Decidability

10. A Hilbert-style Proof System

11. Rule Patterns




Introduction to Reasoning1. The methods discussed – truth-table, tautology checking,

resolution and tableau – are useful for automated deduction,but

2. they do not reflect the process of reasoning employed by hu-mans and used most often in mathematical proofs called de-duction.

3. Deduction enables the proof of validity of arguments but sel-dom their invalidity.




Proof Systems: 1A proof system for deduction1. prohibits the use of meaning in drawing conclusions.2. has a number of axioms (or axiom schemas) and a small

number of (finitary) inference rules.3. Each proof is a finite tree where each node of the tree is ei-

ther an assumption or an axiom or is obtained by pattern-matching and substitution from the axioms and inferencerules.

4. Each proof can be “checked” manually or verified by machineimplementable algorithms.




Requirements of Proof SystemsSyntactic. Proof systems are purely syntactic and no use is

made of semantics in any proof.Finitary. All axioms, axiom-schemas and rules of inference

must be expressible in a finitary manner.Decidability. The correctness of any application of a rule of in-

ference must be machine-verifiable.Soundness. The system must allow the deduction of only valid

conclusions from the assumptions.Completeness. The system must allow all valid truths to be de-

duced.

The semantics may be used to prove only the soundness andcompleteness of the proof system.




Proof Systems: DesiderataThere are two conflicting desirable properties of proof systems.Minimality. Inspired by Euclid and the controversy over the par-

allel postulate. Is there a minimal set of axioms and inferencerules from which all truths and only truths may be deduced?

Naturalness. Is there a natural intuitive set of axioms and rules fromwhich all truths and only truths may be deduced?




Formal TheoriesDefinition 12.1 A formal theory T = 〈L,A ,R〉 consists ofFormal Language a formal language L.Axioms a subset A of the language L.Inference Rules a set R of inference rules.




Formal Language1. An alphabet Σ = X ∪ Ω ∪ (, ) consisting of a set X of vari-

ables a set Ω of connectives each with a pre-defined arity andgrouping symbols.

2.L is defined inductively on Σ.3. The well-formed formulas or wffs of L are defined inductively

on the alphabet.4. Membership of strings (from the alphabet) in L is decidable

i.e. there exists an algorithm to decide whether a given stringis a well-formed formula

Formal Theories




Axioms and Inference RulesAxioms A decidable subset of L.Inference Rules A finite set of rules.

1. Each rule R of arity m ≥ 0 is a decidable relationR ⊆ Lm × L i.e. there exists an algorithm which for anyφ1, . . . , φm, ψ can determine whether ((φ1, . . . , φm), ψ) ∈ R

2. For each ((φ1, . . . , φm), ψ) ∈ R, φ1, . . . , φm are called thepremises and ψ a direct consequence by virtue of R.

3. Each such rule is presented in the form R.X1 · · ·Xm

Ywhere the variables X1, · · ·, Xm, Y are the “shapes” of theformulae allowed by the rule.

4. If m = 0, R is called an axiom schemaFormal Theories




Axiomatic TheoriesDefinition 12.2• The axioms and rules of inference of a formal theory together

constitute a proof system for the set of wffs in the theory.• A formal theory is said to be axiomatic if there exists an algo-

rithm to decide whether a given wff is an axiom.




Syntax and Decidability1. The purely syntactic nature of a formal theory and all the

decidability constraints usually means that each axiom andrule of inference is expressed in terms of syntactic patternsobeying certain “shape” constraints.

2. Each application of an axiom (schema) or inference rule re-quires pattern-matching and substitution.

3. The notion of a deduction is not only syntactic but is verifiableby an algorithm given the nature of the formal theory.

4. Further the deductions of a formal theory can be generatedby an algorithm (the set of “theorems” is recursively enumer-able).




A Hilbert-style Proof SystemDefinition 12.3 H0, the Hilbert-style proof system for Proposi-tional logic consists of• The set L0 generated from A and ¬,→• The following three axiom schemas

S.(X → (Y → Z))→ ((X → Y )→ (X → Z))

K.X → (Y → X)

N.(¬Y → ¬X)→ ((¬Y → X)→ Y )

• A single rule of inference modus ponens

MP.X → Y ,X

Y




Rule Patterns1. The axiom schema K states that for all (simultaneous) substi-

tutions φ/X,ψ/Y the formulae φ→ (ψ → φ) are all axiomsof the system.

2. The rules specify patterns and shapes of formulae. Thusmodus ponens specifies the relation

MP = ((φ→ ψ, φ), ψ) | φ, ψ ∈ L0

and thus asserts that ψ is a direct consequence of φ→ ψand φ for all formulae φ and ψ.

3. An application of the rule consists of identifying appropriatesubstitutions of the variables X and Y by formulae in L0 toyield a direct consequence by the same substitution.










13. Lecture 13: Proof Theory: Hilbert-style

Lecture 13: Proof Theory: Hilbert-styleWednesday 26 August 2011




1. More About Formal Theories

2. An Example Proof

3. Formal Proofs

4. Provability and Formal Proofs

5. The Deduction Theorem

6. About Formal Proofs




More About Formal TheoriesThe following properties follow easily from the definition of aformal theory.Theorem 13.1 Let Γ and ∆ be finite sets of wffs in a theory T.Monotonicity If Γ ⊆ ∆ and Γ ` ψ, then ∆ ` ψ.Compactness ∆ ` ψ if and only if there is a finite subset Γ ⊆ ∆

such that Γ ` ψ.Substitutivity If ∆ ` ψ and for each φ ∈ ∆, Γ ` φ, then Γ ` ψ.




An Example ProofWe formally deduce φ→ φ for any formula φ using the proofsystem H0. As is normal practice in mathematics the proof ispresented as a sequence of steps.1. φ→ ((φ→ φ)→ φ) φ/X, φ→ φ/Y K




An Example ProofWe formally deduce φ→ φ for any formula φ using the proofsystem H0. As is normal practice in mathematics the proof ispresented as a sequence of steps.1. φ→ ((φ→ φ)→ φ) φ/X, φ→ φ/Y K2. (φ→ ((φ→ φ)→ φ))→ ((φ→ (φ→ φ))→ (φ→ φ))φ/X, φ→ φ/Y , φ/ZS





3. ((φ→ (φ→ φ))→ (φ→ φ)) 2, 1MP





3. ((φ→ (φ→ φ))→ (φ→ φ)) 2, 1MP

4. (φ→ (φ→ φ)) φ/X, φ/Y K





3. ((φ→ (φ→ φ))→ (φ→ φ)) 2, 1MP

4. (φ→ (φ→ φ)) φ/X, φ/Y K5. φ→ φ 3, 4MP

where each step is justified as an instance of an axiom schemaor a rule.




However the proof is better written out as an “upside-down” prooftree where

1. each node is a formula. The leaves are axioms (or empty in thecase of axiom schemas).

2. each internal node is an application of a rule of appropriate arityapplied to the appropriate target (definition 2.11) nodes in the tree.

3. The line-segments between the various levels on the tree showhow a node depends on the nodes in the immediately “succeeding”level.

4. the root node is the final formula to be proven.5. The labels on each line-segment separating a direct consequence

from its premise(s) also provide the justification.




The same proof may then be rendered as follows:

1φ→ ((φ→ φ)→ φ)

2(φ→ ((φ→ φ)→ φ))→ ((φ→ (φ→ φ))→ (φ→ φ))

3((φ→ (φ→ φ))→ (φ→ φ))

4(φ→ (φ→ φ))

5 φ→ φ

where the justifications of each step are

1. φ/X, φ→ φ/Y K

2. φ/X, φ→ φ/Y , φ/ZS

3. 2, 1MP

4. φ/X, φ/Y K

5. 3, 4MP

Each node in this proof tree is said to be an instance of a rule or an axiom-schema.




Formal ProofsDefinition 13.2• A formal proof of a formula φ from a finite set Γ of formulae

is a finite tree of formulae– rooted at the formula φ,– the leaves are axioms or instances of axiom schemas or

members from Γ.– each non-leaf node is a direct consequence of one or more

nodes at the “succeeding” level by virtue of application ofa rule of inference of the appropriate arity.

• φ is said to be (formally) provable from Γ in the proof systemH0 and denoted Γ `H0

φ if there exists a formal proof of φ inthe system H0.• φ is a (formal) theorem if Γ = ∅ and is denoted `H0

φ




Provability and Formal ProofsFacts 13.3 Given any theory T = 〈L,A ,R〉 and a wff ψ ∈ L,1. If ψ is an axiom or instance of an axiom-schema then ` ψ

and hence ψ is a formal theorem.2. If ` ψ then all the leaf nodes in any proof tree of ψ are either

axioms or instances of axiom-schemas.3. If ψ is an axiom or an instance of an axiom-schema then

Γ ` ψ for any Γ ⊆ L.4. For any φ ∈ Γ, Γ ` φ.




The Deduction TheoremNotation Given a set Γ and a formula φ, “Γ, φ ` ψ” denotes“Γ ∪ φ ` ψ”Theorem 13.4 (The Deduction Theorem) For all Γ ⊆f L0 andformulae φ and ψ, Γ, φ ` ψ if and only if Γ ` φ→ ψ.

2The Deduction theorem justifies our usual notion of a directproof from the hypotheses of a conditional conclusion – the an-tecedent of the conditional is added to the assumptions and theconsequent is proven.




Proof of The Deduction Theorem (theorem 13.4)

Proof: (⇒). Assume Γ, φ ` ψ. Then there exists a proof tree T rooted at ψ with nodesψ1, . . . , ψm ≡ ψ. Then the following stronger claim proves the required result.

Claim. Γ ` φ→ ψi for all i, 1 ≤ i ≤ m.` By induction on k = `(ψ1)− `(ψi) in T (see definition 2.11 for `).

Basis. k = `(ψ1)−`(ψi) = 0. Then ψi is either a premise or an axiom. We have the followingcases to consider.Case ψi ≡ φ. Then the claim follows from reflexivity and monotonicity (theorem 13.1).Case ψi ∈ Γ or ψi is an axiom. In either case there exists a subtree Ti (of T ) rooted at ψiwhich may be used to construct the tree T ′

i as follows. Assume there are i′ steps in theproof of ψi.

Ti i′ ψi

i′ + 1ψi → (φ→ ψi)

i′ + 2 φ→ ψiwhich proves the claim.





Γ ` φ→ ψi for all i such that `(ψi) > l for some l ≥ `(ψm).

Induction Step. Since ψl is a non-leaf node it is neither an axiom nor a premise and musthave been obtained by virtue of the rule MP applied to its immediate successors say ψiand ψj with i 6= j such that `(ψi), `(ψj) > l. Without loss of generality we may assumeψj ≡ ψi → ψl.By the induction hypothesis, we know Γ ` φ→ ψi and Γ ` φ→ ψj. Hence there existproof trees T ′

i of i′ nodes rooted at φ→ ψi and T ′j of j′ nodes rooted at φ→ ψj ≡

φ→ (ψi → ψl) respectively. We construct the tree T ′l rooted at φ→ ψl from T ′

i and T ′j

as follows.

T ′j

j′φ→ (ψi → ψl)

j′ + 1(φ→ (ψi → ψl))→ ((φ→ ψi)→ (φ→ ψl))

j′ + 2(φ→ ψi)→ (φ→ ψl)

T ′i j′ + i′ + 2

φ→ ψij′ + i′ + 3

φ→ ψl

where j′ + 1 is an instance of S, and j′ + 2 and j′ + i′ + 3 are both applications of MP totheir respective immediate successors in the tree.




a

(⇐). Assume Γ ` φ→ ψ. Let T be a formal proof tree rooted at φ→ ψ with m nodes for somem > 0. By monotonicity (theorem 13.1) Γ, φ ` φ→ ψ is proven by the same tree. We may extendT to the tree T ′ by adding a new (m + 1)-st leaf node φ and creating the (m + 2)-nd root node ψ.

T m

φ→ ψm + 1

φm + 2

ψ

T ′ is a proof of Γ, φ ` ψ. QED




About Formal Proofs• Since a formal proof is a tree, it is acyclic (i.e. there is no

circularity in the proof).• Every formal proof is a finite tree i.e. a proof is a finitary

object.

Questions.1. What if the set of assumptions is infinite?2. Are there statements which have only infinite proofs and no

finite proof?










14. Lecture 14: Derived Rules

Lecture 14: Derived RulesTuesday 30 Aug 2011




1. Simplifying Proofs

2. Derived Rules

3. The Sequent Form

4. Proof trees in sequent form

5. Transitivity of Conditional

6. Derived Double Negation Rules

7. Derived Operators

8. Rules for Derived Operators




Simplifying Proofs• The deduction theorem allows “movement” of sub-formulae

between the set (sequence) of assumptions and the formulato be proven.•Hence the set (sequence) of formulae which form the as-

sumptions is an important part of the proof.•We use the notion of a sequent to formalize this movement

which may take place a any stage.

Definition 14.1 A sequent is a meta-formula of the form Γ ` φ.




Derived Rules• By substitutivity (theorem 13.1) we may simplify our proofs by

incorporating theorems and meta-theorems as derived rulesof our proof system.• These rules may be presented in sequent form.• The proof of the reflexivity may be rendered in sequent form

by simply pre-pending each node in the tree with “`”.• The Deduction Theorem and its converse may be rendered

in sequent form as a derived rule.•Reflexivity may be expressed in sequent form as a derived

rule.• These derived rules may be directly invoked in later proofs.




The Sequent FormLet Γ be a sequence of formulae.

K.Γ ` X → (Y → X)

N.Γ ` (¬Y → ¬X)→ ((¬Y → X)→ Y )

S.Γ ` (X → (Y → Z))→ ((X → Y )→ (X → Z))

MP.

Γ ` X → YΓ ` X

Γ ` Y

R→ .Γ ` X→X

DT⇐ .Γ ` X → Y

Γ, X ` YDT⇒ .

Γ, X ` YΓ ` X → Y




Proof trees in sequent formTheorem 14.2 For all φ, ψ and χ,

(φ→ ψ)→ ((ψ → χ)→ (φ→ χ))

Proof: Let Γ1 = φ→ ψ, Γ2 = Γ1, ψ → χ and Γ3 = Γ2, φ

Γ3 ` φ→ ψ Γ3 ` φMP Γ3 ` ψ Γ3 ` ψ → χ

MP Γ2, φ ` χDT⇒ Γ2 ` φ→ χ

DT⇒ Γ1 ` (ψ → χ)→ (φ→ χ)DT⇒ ` (φ→ ψ)→ ((ψ → χ)→ (φ→ χ))

QED




Transitivity of ConditionalFrom theorem 14.2 we get a derived axiom schema

T→ .Γ ` (X → Y )→ ((Y → Z)→ (X → Z))

But equivalently by applying the derived rule DT⇐ to T→ abovewe also get a derived rule of inference which is often more con-venient to use.

T⇒ .

Γ ` X → YΓ ` Y → Z

Γ ` X → Z




Exercise 14.1

1. Prove that each of the axiom schemas in H0 represents a collection of tautologies.

2. Prove that Modus Ponens in sequent form preserves logical consequence i.e. if Γ |= φ→ ψ andΓ |= φ then Γ |= ψ.

3. Using the above prove that the proof system H0 is sound i.e. If Γ `H0 ψ then Γ |= ψ.

4. Find the fallacy in the following proof of theorem 14.2. Assume Γ1, Γ2 and Γ3 are as in the proofof theorem 14.2.

Γ3 ` ψ → χDT⇒

Γ2 ` φ→ (ψ → χ)

SΓ2 ` (φ→ (ψ → χ))→ ((φ→ ψ)→ (φ→ χ)) Γ2 ` φ→ ψ

MPΓ2 ` (φ→ (ψ → χ))→ (φ→ χ)

MP Γ2 ` φ→ χDT⇒

Γ1 ` (ψ → χ)→ (φ→ χ)DT⇒ ` (φ→ ψ)→ ((ψ → χ)→ (φ→ χ))




5. Prove the following derived rule of inference (You may use any of the derived rules of inferencein addition to the usual proof rules).

R2⇒ .

Γ ` X → (Y → Z)

Γ ` YΓ ` X → Z

6. Could we have consequently reordered our theorems by first proving R2→ and then provingT→? Discuss whether there is anything fallacious in this approach.




Derived Double Negation RulesDNE.

Γ ` ¬¬XΓ ` X

DNI.Γ ` X

Γ ` ¬¬XThe following proof trees yield proofs of the derived doublenegation elimination and introduction rules respectively.Alternatively we may regard them as derived axiom schemas

DNE→ .Γ ` ¬¬X → X

DNI→ .Γ ` X → ¬¬X




Proof of derived rule DNE and axiom schema DNE→Proof:

K ¬¬φ→ (¬φ→ ¬¬φ)

N(¬φ→ ¬¬φ)→ ((¬φ→ ¬φ)→ φ) R⇒ ¬φ→ ¬φ

R2⇒(¬φ→ ¬¬φ)→ φ

T⇒ ¬¬φ→ φDT⇐ ¬¬φ ` φ

QED

Proof of derived rule DNI and axiom schema DNI→Proof:

Kφ→ (¬¬¬φ→ φ)

N(¬¬¬φ→ ¬φ)→ ((¬¬¬φ→ φ)→ ¬¬φ) DNE ¬¬¬φ→ ¬φ

MP(¬¬¬φ→ φ)→ ¬¬φ

T⇒ φ→ ¬¬φDT⇐ φ ` ¬¬φ

QED




Exercise 14.2

1. Prove the axiom schema

N′.(¬Y → ¬X)→ (X → Y )

A deduction theorem variant of this schema is also called the modus tollens rule or the contra-positive rule.

2. A variant of the system H0 is the system H ′0 obtained by replacing the schema N by N’.

(a) Prove the axiom schema N in the system H ′0 .

(b) Prove the double negation rules DNE and DNI in H ′0 .

3. Prove the following axiom schemas in H0. In each case you are allowed to use any version ofthe theorems previously proven.

(a) ⊥.¬X → (X → Y )

What can you conclude about the system H0 from your proof?

(b) N′′.(X → Y )→ (¬Y → ¬X)




(c) N2.X → (¬Y → ¬(X → Y ))

(d) C.(X → Y )→ ((¬X → Y )→ Y )

Derive the proof by cases rule Cases.

Γ, X ` Y

Γ,¬X ` Y

Γ ` Y(e) Derive the proof by contradiction also called the indirect proof method rule I in the system

H0.

I.

Γ, X ` ¬YΓ, X ` Y

Γ ` ¬X

4. Prove the derived axiom C2.Γ ` (¬X→X)→X




Derived Operators> df

= φ→ φ for all φ

⊥ df= ¬(φ→ φ) for all φ

φ ∨ ψ df= ¬φ→ ψ

φ ∧ ψ df= ¬(φ→ ¬ψ)

φ↔ ψdf= ¬((φ→ ψ)→ ¬(ψ → φ))

Several other binary and other operators of varying arities maybe defined.




Rules for Derived OperatorsCorresponding to each derived operator defined as

O(X1, . . . , Xn)df= ω(X1, . . . , Xn) where ω is constructed only

from the set ¬,→ we have the introduction and eliminationrules.

OE.Γ ` O(X1, . . . , Xn)

Γ ` ω(X1, . . . , Xn)

OI.Γ ` ω(X1, . . . , Xn)

Γ ` O(X1, . . . , Xn)




Gentzen’s SystemNatural Deduction

•Gentzen’s Natural Deduction system is not a minimal system,instead it is somewhat more natural in the sense that it hasexplicit introduction and elimination rules for each operator.•We present sequent version of the system in the following.• Further there is some redundancy in the rules. Not all the

rules may actually be useful, but they possess a pleasingsymmetry.•However, it is necessary to prove both the soundness and

the completeness of the system.•We refer to the system as G0.




Natural Deduction: 1Introduction Elimination

⊥ ⊥I. Γ ` X ∧ ¬XΓ ` ⊥

⊥E. Γ ` ⊥Γ ` X

> >I.Γ ` >

>E. Γ ` >Γ ` X ∨ ¬X





¬ ¬I. Γ, X ` ⊥Γ ` ¬X

¬E. Γ,¬X ` ⊥Γ ` X

¬¬ ¬¬I. Γ ` XΓ ` ¬¬X

¬¬E. Γ ` ¬¬XΓ ` X





∨ ∨I1. Γ ` XΓ ` X ∨ Y

∨I2. Γ ` YΓ ` X ∨ Y

∨E.

Γ ` X ∨ YΓ ` X → ZΓ ` Y → Z

Γ ` Z





∧ ∧I.

Γ ` XΓ ` Y

Γ ` X ∧ Y∧E1. Γ ` X ∧ Y

Γ ` X∧E2. Γ ` X ∧ Y

Γ ` Y




Natural Deduction: 5

Introduction Elimination

→ → I.Γ, X ` Y

Γ ` X → Y→ E.

Γ ` X → YΓ ` X

Γ ` Y

↔ ↔ I.

Γ ` X → YΓ ` Y → X

Γ ` X ↔ Y↔ E1.

Γ ` X ↔ Y

Γ ` X → Y↔ E2.

Γ ` X ↔ Y

Γ ` Y → X




Exercise 14.3

1. Prove the logical equivalences of P0 using the system H0.

2. Prove the non-obvious logical equivalences of P0 in the system G0.

3. Derive each of the rules of G0 from the system H0. You may use the rules OE and OI as andwhen needed for each operator.

4. Derive the axiom schemas K, S and N in G0.







15. Lecture 15: The Hilbert System: Soundness

Lecture 15: The Hilbert System:Soundness

Tuesday 06 September 2011




1. Formal Theory: Issues

2. Formal Theory: Incompleteness

3. Soundness of Formal Theories

4. Soundness of the Hilbert System

5. Soundness of the Hilbert System




Formal Theory: IssuesThe major questions concerning any formal theory are two-fold:Soundness. Is the theory sound? Especially considering that we

may not have well-defined models in which to test the truthor falsehood of statements.

Completeness. Is the theory complete? That is, is every factprovable from the axioms and inference rules of the theory?




Formal Theory: IncompletenessSuppose a given formal theory is incomplete. There are severalpossibilities.Question 1. Can the theory be made complete by adding or

changing some axioms and inference rules (without makingthe theory inconsistent)?

Question 2. Is the theory inherently incomplete? That is, is thereno way of achieving completeness by adding only a finitenumber of new axioms and inference rules?




Soundness of Formal TheoriesGiven that the proof theory may be the only finitary tool avail-able to us in reasoning about some domain we need to definethe notion of consistency of the theory in terms of the proof-theoretic notions.Definition 15.1 A formal theory is unsound if every wff is a the-orem. Otherwise it is said to be sound.




Soundness of the Hilbert SystemLemma 15.21. Every instance of every axiom schema in H0 is a tautology.2. The Modus Ponens rule MP preserves tautologousness.

2A truth table technique would serve the purpose for H0 alonebut would not be possible when H0 is extended to H1.




Proof of lemma 15.2Proof:

1. We prove the case of any instance of the axiom schema K. We needto show that for all φ and ψ, φ→ (ψ → φ) is a tautology. Supposeit is not a tautology. Then there exists a truth assignment τ suchthat T Jφ→ (ψ → φ)Kτ = 0 which is possible only if T JφKτ = 1and T Jψ → φKτ = 0 which in turn is possible only if T JψKτ = 1and T JφKτ = 0 which is impossible. Hence there is no such truthassignment. So φ→ (ψ → φ) must be a tautology.A similar reasoning may be applied to the axiom schemas S and N.

2. Assume for some φ and ψ that φ and φ→ ψ are tautologies but ψis not. It is easy to see that for any truth assignment τ , T JψKτ = 0




implies T JφKτ = 0 contradicting the assumption that φ is a tautol-ogy.

QED




Soundness of the Hilbert SystemTheorem 15.3 Every formal theorem of H0 is a tautology.

The theorem follows by induction on the heights of proof trees,since every leaf would be a tautology and every internal nodepreserves tautologousness.Corollary 15.4 The system H0 is sound.










16. Lecture 16: The Hilbert System: Completeness

Lecture 16: The Hilbert System:Completeness

Wednesday 07 Sep 2011




1. Towards Completeness

2. Towards Truth-tables

3. The Truth-table Lemma

4. The Completeness Theorem




Towards Completeness1. By theorem 15.3 the only theorems of the system H0 are

tautologies.2. By theorems 4.2 and 4.3 the question of completeness of H0

reduces to that of whether every tautology of P0 is provable inH0.

3. If H0 is complete then by exercise 14.3.4, G0 is also com-plete.




Towards Truth-tables1. Restricting ourselves to showing that every tautology is prov-

able in H0 is sufficient.2. But we proceed to show that every truth table can be simu-

lated as a proof in H0, thereby capturing all of the semanticfeatures of the language P0 in its proof theory.




The Truth-table LemmaLemma 16.1 Let φ be a formula with atoms(φ) ⊆ p1, . . . , pk.For each truth assignment τ ,

p∗1, · · · , p∗k ` φ

∗

where for each i, 1 ≤ i ≤ k,

p∗i ≡pi if τ (pi) = 1¬pi otherwise

andφ∗ ≡

φ if T JφKτ = 1¬φ otherwise

2




Proof of lemma 16.1

Proof: By induction on the number n of operators in φ. Let Γ = p∗1, . . . , p∗k.

Basis. n = 0. Then φ is an atom say, φ ≡ p1. The claim then trivially follows since p1∗ ≡ φ∗.


The claim holds for all wffs with less than n ≥ 0 occurrence of the operators.

Induction Step. Suppose φ is a wff with n operators. Then there are two cases to consider.Case φ ≡ ¬ψ, where ψ has less than n operators. Then by the induction hypothesis we have

have a proof tree T1 Γ ` ψ∗ .

Subcase T JψKτ = 1. Then T JφKτ = 0 and ψ∗ ≡ ψ and φ∗ ≡ ¬φ ≡ ¬¬ψ. Then we have thefollowing deduction.

DNI→ Γ ` ψ → ¬¬ψ T1 Γ ` ψ

MP Γ ` ¬¬ψ ≡ φ∗




Subcase T JψKτ = 0. Then T JφKτ = 1 and ψ∗ ≡ ¬ψ and φ∗ ≡ φ ≡ ¬ψ. By the inductionhypothesis we have Γ ` ¬ψ ≡ φ∗.Case φ ≡ ψ → χ. where each of ψ and χ has less than n operators. By the induction hypothesis

there exist proof trees T1 Γ ` ψ∗ and

T2 Γ ` χ∗ . Here again we have three subscases.

Subcase T JψKτ = 0. We have ψ∗ ≡ ¬ψ so tree T1 is T1 Γ ` ¬ψ , T JφKτ = 1 and φ∗ ≡ φ ≡

ψ → χ. We then have the proof tree

⊥Γ ` ¬ψ → (ψ → χ)

T1 Γ ` ¬ψ

MP Γ ` ψ → χ ≡ φ∗

which proves the claim.Subcase T JχKτ = 1. Then χ∗ ≡ χ and T JφKτ = 1 and φ∗ ≡ φ ≡ ψ → χ. Hence tree T2 is T2

Γ ` χ . We may then construct the following proof tree to prove the claim.




KΓ ` χ→ (ψ → χ)

T2 Γ ` χ

MP Γ ` ψ → χ ≡ φ∗

Subcase T JψKτ = 1 and T JχKτ = 0. Then ψ∗ ≡ ψ and χ∗ ≡ ¬χ and T JφKτ = 0 from which

we get φ∗ ≡ ¬(ψ → χ). By induction hypotheses we therefore have the trees T1

Γ ` ψ and

T2 Γ ` ¬χ using which we construct the following proof tree to prove our claim.

N2Γ ` ψ → (¬χ→ ¬(ψ → χ))

T1 Γ ` ψ

MPΓ ` ¬χ→ ¬(ψ → χ)

T2 Γ ` ¬χ

MPΓ ` ¬(ψ → χ) ≡ φ∗

QED




The Completeness TheoremWe are now ready to prove the completeness theorem by re-stricting it to all the tautologies of propositional logic.Theorem 16.2 (The Completeness Theorem). Every tautology isa formal theorem of H0.

2




Proof of the Hilbert Completeness Theorem 16.2

Proof: Let φ be a tautology expressed in the language L0 and let atoms(φ) = p1, . . . , pk.Since every row of the truth-table for φ assigns it a truth value 1 we have φ∗ ≡ φ. Each bit-stringsk ∈ 0, 1k indexes a row of the truth table containing 2k distinct rows. Let Γsk = p∗i | 1 ≤i ≤ k denote the set of assumptions for the sk-th row of the truth table. By the truth table lemma

(lemma 16.1), there exists a distinct proof tree, Tsk Γsk ` φ

for each such sk. For each bit-string

sj ∈ 0, 1j, 0 < j ≤ k, we construct the proof tree Tsj−1

Γsj−1

` φ from the proof trees Tsj−10 Γsj−10 ` φ

and Tsj−11 Γsj−11 ` φ

, where sj−10 and sj−11 are the two bit-strings of length j, which differ only in

the right-most bit.

Note that s0 = ε is the empty string, Γε = ∅ and we need to construct the proof tree Ts0 Γs0 ` φ

from

the 2k distinct proof trees Tsk Γsk ` φ

.

Now consider any two proof trees whose indexes differ only in the rightmost bit. That is, for any




sj−1 ∈ 0, 1j−1, we have the proof trees Tsj−11 Γsj−11 ` φ

and Tsj−10 Γsj−10 ` φ

. We construct the proof

tree Tsj−1

Γsj−1

` φ as follows.

Tsj−10 Γsj−10 ` φ

DT⇒ Γsj−1` ¬pj → φ

Tsj−11 Γsj−11 ` φ

DT⇒ Γsj−1` pj → φ

CΓsj−1

` (pj → φ)→ ((¬pj → φ)→ φ)MP

Γsj−1` (¬pj → φ)→ φ

MP Γsj−1` φ

We can thus eliminate the atom pj from the assumptions by applying the above proof procedure toall pairs of proof trees whose assumptions differ only in the value of p∗j .

Thus the 2k proof trees are combined pairwise to produce 2k−1 proof trees that are independent ofthe atom pk. Proceeding in a like manner we may eliminate all the atoms one by one by using similar

proof constructions so that finally we obtain a single monolithic proof tree Tε Γε ` φ

where Γε = ∅,

thus concluding the proof that the tautology φ is a formal theorem of H0. QED







17. Lecture 17: Introduction to Predicate Logic

Lecture 17: Introduction to PredicateLogic

Friday 09 Sep 2011




1. Predicate Logic: Introduction-1



4. Predicate Logic: Symbols

5. Predicate Logic: Signatures

6. Predicate Logic: Syntax of Terms

7. Predicate Logic: Syntax of Formulae

8. Precedence Conventions

9. Predicates: Abstract Syntax Trees

10. Subterms

11. Variables in a Term

12. Bound Variables And Scope

13. Bound Variables And Scope: Example

14. Scope Trees

15. Free Variables

16. Bound Variables

17. Closure




Predicate Logic: Introduction-1Example 17.1 There are many kinds of arguments which can-not be proven in propositional logic and which require the notionof quantifiers. The most famous one perhaps containing onlyvery basic and simple declarative statements is

All humans are mortal.Socrates is a human.Therefore Socrates is mortal.




Predicate Logic: Introduction-2The validity of this argument does not depend on any proposi-tional connectives since there are none. Hence it is not provablein propositional logic.However it is valid and its validity depends upon• the internal structure of the sentences,• the meaning of certain operative words and phrases such as

“All”• certain properties of objects e.g. “mortal”• the description of certain classes by their properties and

membership or other relations on these classes e.g. “is a”.




Predicate Logic: Introduction-31. The deeper relationships that exist between otherwise simple

propositions require parametrisation of propositions so thatthe inter-relationships become clear.

2. Parametrised propositions are called predicates.3. Each predicate specifies either a property (1-ary predicates)

or a relationship between objects (n-ary predicates).4. The propositions of propositional logic are 0-ary predicates.5. Mathematical theories are often about collections of infinite

objects whose relationships and inter-relationships are finiteexpressions involving functions, relations and expressions in-volving them.




Predicate Logic: Symbols• V: a countably infinite collection of variable symbols;x, y, z, . . . ∈ V.• F: a countably infinite collection of function symbols;f, g, h, . . . ∈ F.• A: a countably infinite collection of atomic predicate symbols;p, q, r, . . . ∈ A.•Grouping symbols: (, ), (, ), [, ].• All of the above sets are pairwise disjoint.




Predicate Logic: SignaturesDefinition 17.2 A signature (or more accurately 1-sorted signa-ture) Σ is a denumerable (finite or countably infinite) collectionof strings of the form

f : sm→ s,m ≥ 0

orp : sn, n ≥ 0

such that there is at most one string for each f ∈ F and eachp ∈ A. m and n are respectively the arity of f and p.

Here s is merely a symbol signifying a sort of elements. A gen-eralization to many-sorted algebras would require the use of asmany such symbols s1, . . . , sm as there are sorts.




Predicate Logic: Syntax of TermsDefinition 17.3 Given a signature Σ, the set T(Σ) of Σ-terms isdefined inductively by the following grammar

s, t, u ::= x ∈ V | f (t1, . . . , tm)

where f : sm → s ∈ Σ and t1, . . . , tm ∈ T(Σ). If m = 0 then f ()is called a constant and simply written f . We usually use thesymbols a, b, c, . . . to denote constants.




Predicate Logic: Syntax of FormulaeDefinition 17.4 Given a signature Σ and the set T(Σ) of Σ-terms.• A Σ-atomic formula or Σ-atom is a string of the formp(t1, . . . , tn) where p : sn ∈ Σ. A(Σ) is the set of Σ-atoms.• Ω1 = Ω0 ∪ ∀x,∃x | x ∈ V• The set of P1(Σ) of Σ-formulas is defined inductively by the

following grammar.

φ, ψ ::= ⊥ | >| p(t1, . . . , tn) ∈ A(Σ) | (¬φ)| (φ ∧ ψ) | (φ ∨ ψ)| (φ→ ψ) | (φ↔ ψ)| ∀x[φ] | ∃x[φ]




Precedence Conventions• The operator precedence conventions are as before.• The two new operators are called the universal quantifier (∀)

and existential quantifier (∃) respectively and are parame-terised by variables.• The scope of the (variable in a) quantified formula is delimited

by the the matching pair of brackets ([ and ]).• If a formula φ is preceded by several quantifiers (e.g.∀x[∃y[∀z[φ]]]) we collapse the scoping brackets where thereis no ambiguity (e.g ∀x∃y∀z[φ]).•We will think of both Σ-terms and Σ-formulae as abstract syn-

tax trees. The brackets delimiting the scope of a quantifedvariable then become redundant.




Predicates: Abstract Syntax TreesExample 17.5 Let p, q and r be unary predicates. The first-order logic formula ∀x[p(x) ∧ ∃x[q(x)]] ∨ r(x)

x

x

x

∨

r∀x

∧

p ∃x

q




SubtermsDefinition 17.6 For each term t, ST (t) denotes the set of sub-terms of t (including t itself). The set of proper subterms of t isthe set ST (t)− t.

t depth size STc() 1 1 tx 1 1 tf (t1, . . . , tn) 1+ 1+ t∪

Maxni=1depth(ti)∑ni=1 size(ti)

⋃ni=1 ST (ti)




Variables in a TermFor any term t, V ar(t) denotes the set of all variables that oc-cur in t. These functions may be defined by induction on thestructure of terms as follows.Definition 17.7

t V ar(t)c() ∅x xf (t1, . . . , tn)

⋃1≤i≤n V ar(ti)




Bound Variables And ScopeDefinition 17.8 In a formula of the form Qx[ψ] the variable x issaid to be bound by the quantifier Q. The brackets [· · · ] delimitthe scope of the binding.

Example 17.9 In the abstract syntax tree of the predicate

∀x[p(x) ∧ ∃x[q(x)]] ∨ r(x)

the scopes of the various bindings are indicated by dashedboxes. Notice that the red scope is a “hole” in the blue scope.




Bound Variables And Scope: Example

x

x

x

∨

r∀x

∧

p ∃x

q




Scope TreesThe abstract syntax tree also determines the scope of the indi-vidual bound variables.

x

x

x

∨

r∀x

∧

p ∃x

q




Free VariablesDefinition 17.10 For any predicate φ the set of free variablesoccurring in it (denoted FV (φ)) and the set of sub-formulae(denoted SF (φ) are defined by induction on the structure ofpredicates.φ FV (φ) SF (φ) Conditionp(t1, . . . , tn)

⋃1≤i≤n

V ar(ti) p(t1, . . . , tn)

¬ψ FV (ψ) ¬ψ ∪ SF (ψ)o(ψ, χ) FV (ψ) ∪ FV (χ) o(ψ, χ) ∪ SF (ψ) o ∈ Ω0 − ¬

∪ SF (χ)

Qx[ψ] FV (ψ)− x Qx[ψ] ∪ SF (ψ) Q∈ ∀,∃




Bound VariablesDefinition 17.11 If Qx[ψ] is a sub-formula of some formula φthen ψ is said to be the scope of the quantifier Qx and everyfree occurrence of the variable x in the formula ψ is said to bebound in the scope of the quantifier Qx in which it occurs.

Notice that if Qx[χ] is a sub-formula of ψ in the definition 17.11then any x ∈ FV (χ) is not a free variable of ψ.We may write φ(x1, . . . , xm) to indicate that FV (φ) ⊆x1, . . . , xm.




ClosureDefinition 17.121. A formula φ is called closed if FV (φ) = ∅.2. The universal closure of φ(x1, . . . , xm) denoted ~∀[φ] is defined

as the formula ∀x1, . . . , xm[φ].

3. The existential closure of φ(x1, . . . , xm) denoted ~∃[φ] is definedas the formula ∃x1, . . . , xm[φ].

4. A literal is an atomic formula or its negation. For any literalλ, λ will denote its negation.




Exercise 17.1

1. Translate the following statements into first-order logic statements. (You may use the functionsymbols that are normally used in mathematics, e.g. “0” for zero, “=” for equality, “+” foraddition etc.). The names x, y etc. stand for variables.

• Every number has a unique successor.

• Not every number has a predecessor.

• The sum of any two odd numbers is even.

• x is a prime.

• There is no largest prime.

• x is a divisor of y.

• x and y are relatively prime.

• Define the notion of greatest common divisor of two numbers as as a ternary predicategcd(x, y, z) in terms of the previous parts. In other words, gcd(x, y, z) stands for the state-ment




z is the greatest common divisor of x and y

2. Symbolize the following arguments in first order logic You may assume in each case that the uni-verse of discourse contains the various relations mentioned as predicates (and nothing more!).

• There is a man whom all men despise. Therefore at least one man despises himself.• All hotels are expensive and depressing. Some hotels are shabby. Therefore some expensive

hotels are shabby.• Anyone who is loved loves everyone. No one loves Charlie Brown. Therefore no one loves

anyone.• Whoever visited the building was observed. Anyone who had observed Ajay, would have

remembered him. Nobody remembered Ajay. Therefore Ajay did not visit the building.• If all drugs are contaminated then all negligent technicians are scoundrels. If there are any

drugs that are contaminated then all drugs are contaminated and unsafe. All pesticides aredrugs. Only the negligent are absent-minded. Therefore if any technician is absent-minded,then if some pesticides are contaminated, then he is a scoundrel.• Some criminal robbed the mansion. Whoever robbed the mansion broke in or had an ac-

complice among the servants. To break in one would have to smash the door or pick the lock.Only an expert locksmith could have picked the lock. Had anyone smashed the door, he would




have been heard. Nobody was heard. If the criminal who robbed the mansion managed tofool the guard, he must have been a convincing actor. Nobody could rob the mansion unlesshe fooled the guard. No criminal could be both an expert locksmith and a convincing actor.Therefore some criminal had an accomplice among the servants.







18. Lecture 18: The Semantics of Predicate Logic

Lecture 18: The Semantics ofPredicate LogicWednesday 14 Sep 2011




1. Structures

2. Notes on Structures

3. Infix Convention

4. Expansions and Reducts

5. Valuations and Interpretations

6. Evaluating Terms

7. Coincidence Lemma for Terms

8. Variants

9. Variant Notation

10. Semantics of Formulae

11. Notes on the Semantics




StructuresGiven a signature Σ, a Σ-structure or Σ-algebra A consists of• a non-empty set A = |A| called the domain (or carrier or

universe) of A,• a function fA : Am → A for each m-ary function symbolf ∈ Σ (including symbols for each constant)• a relation pA ⊆ An for each n-ary atomic predicate symbolp ∈ Σ and• (for completeness) a truth value pA ∈ 2 = 0, 1 for each

(0-ary) atomic proposition p ∈ Σ.When the Σ-algebra is understood or is the only structure underconsideration, we omit the subscript A from the functions andrelations.




Notes on Structures1. The domain has to be non-empty.2. All functions are total.3. One way to deal with partial functions (e.g. division on nat-

ural numbers) of arity m > 1 is to treat them as (m + 1)-aryrelations and define predicate symbols to represent them.




Infix ConventionIf in particular structures, functions or relations are normallywritten in infix form, then we use the infix form in the logicallanguage too.Example 18.1 If N = 〈N; +;<〉 the set of natural numbers underthe binary operation of addition (+) and the binary relation less-than (<) is the structure, then we write predicate formulae usingthe corresponding symbols in the language in infix form. Forexample, the formula

∀x[∃y[x < x + y]]

with the operation in infix form is more easily understood inplace of the more pedantic

∀x[∃y[< (x,+(x, y))]]




Expansions and ReductsDefinition 18.2 Let Σ ⊆ Ω be signatures. A Σ-structure A iscalled a reduct to an Ω-structure B if for all f, p ∈ Σ iff• |A| = |B|,• fA = fB for each f ∈ Σ and• pA = pB for each p ∈ Σ

B is also called an expansion of A and is denoted AB




Valuations and InterpretationsTo be able to define the truth or falsehood of a predicate withfree variables, it is necessary to be able to first define a valua-tion for the variables.

Definition 18.3 Given a Σ-structure A, a valuation is a functionvA : V −→ |A| which assigns to each variable a unique valuein |A|.Definition 18.4 Given a Σ-structure A and a valuationvA : V −→ |A|, (A, vA) is called a Σ-interpretation.

The subscript “A” is often omitted when the context makes clearthe structure that is being referred to.




Evaluating TermsDefinition 18.5 Given Σ-interpretation (A, v) the value of a termt in the interpretation is defined by induction on the structure ofthe term.

VAJxKvdf= v(x), x ∈ V

VAJf (t1, . . . , tm)Kvdf= fA(VJt1Kv, . . . ,VJtmKv), f : sm→ s ∈ Σ

Notational conventions.1. If V ar(t) ⊆ x1, . . . , xm, we sometimes write t(x1, . . . , xm) to

denote this fact.2. The subscript “A” is often omitted when the context makes

clear the structure that is being referred to.




Coincidence Lemma for TermsThe following lemma may be easily proved by induction on thestructure of terms.

Lemma 18.6 Given two Σ-interpretations (A, v) and (A, v′), anda term t, if v(x) = v′(x) for each x ∈ V ar(t), then VJtKv = VJtKv′

Notational convention.If t(x1, . . . , xm), v(x1) = a1, . . . , v(xm) = am for a1, . . . , am ∈ |A|in some Σ-interpretation (A, v), then we write tA(a1, . . . , am) in-stead of VJt(x1, . . . , xm)Kv, since only the values of the variablesoccurring in t are needed to evaluate it.




VariantsDefinition 18.7 Two valuations v, v′ : V −→ |A| are said to beX-variants of each other (denoted v =\X v

′ for any X ⊆ V if forall y ∈ V−X, v(y) = v′(y), i.e. they differ from each other at mostin the values for variables from X.

Fact 18.81. For any X ⊆ V and valuation v, v is an X-variant of itself i.e.v =\X v.

2. =\X is an equivalence relation on valuations.




Variant NotationNotation.1. When X = x is a singleton we refer to v and v′ as x-variants

and denote it by v =\x v′.

2. If vx =\x v and vx(x) = a ∈ |A| we write vx = v[x := a].

3. If X = x1, . . . , xn, vX =\X v and vX(xi) = ai ∈ |A|, for eachi, 1 ≤ i ≤ n, then we write vX = v[x1 := a1, . . . , xn := an].




Semantics of FormulaeLet (A, vA) be a Σ-interpretation. Then T JφKv is defined byinduction on the structure of φ. We omit the propositional con-nectives as being obvious and concentrate only on the otherconstructs.

TAJp(t1, . . . , tn)KvAdf=

1 if (VAJt1KvA, . . . ,VAJtnKvA) ∈ pA0 otherwise

TAJ∀x[φ]KvAdf=∏TAJφKv′A

| v′A =\x vA

TAJ∃x[φ]KvAdf=∑TAJφKv′A

| v′A =\x vADefinitions of

∑,∏

The subscript “A” is often omitted when the context makes clearthe structure that is being referred to.




Notes on the Semantics1. Truth value is now evaluated under a valuation instead of a

truth assignment.2. A valuation defines a truth assignment to parameterised proposi-

tions depending upon the values of the parameters.3. The set TAJφKv′A

| v′A is an x-variant of vA is nonemptysince vA is also an x-variant of itself.

4. The number of x-variants of v equals the cardinality of |A|.5. Even though there may be infinitely many x-variants of a val-

uation vA, the set TAJφKv′A| v′A is an x-variant of vA can-

not have more than two elements. Hence the sum and prod-uct are both finitary.




Exercise 18.1

1. Define interpretations on universes of discourse containing at least 3 elements and give a valu-ation in which the following hold. Assume p and q are atomic predicate symbols of any positivearity of your choice.

(a) ¬(p ∧ ∃x[q]→ ∃x[p ∧ q])(b) ¬(∃x[p→ q]→ (∀x[p]→ q))

2. Prove that the following predicates have no models at all where p and q are atomic predicates(of any positive arity).

(a) ¬(p→ ∀x[q]→ ∀x[p→ q])

(b) ¬((∀x[p]→ q)→ ∃x[p→ q])

3. Consider the universe of discourse to be the set of all nodes of graphs and let the atomic binarypredicate symbol e stand for the edge relation on nodes, i.e. e(x, y) stands for there is an edgefrom x to y. Further let “=” stand for the usual identity relation on nodes. What properties ongraphs do the following first-order predicates define?




(a) ∀x[∃y[¬(x = y) ∧ e(x, y)]]

(b) ∀x[∀y[e(x, y)→ ¬(x = y)]]

(c) ∀x[∀y[¬(x = y)→ (e(x, y)→ e(y, x))]]

(d) ∀x[∀y[∀z[e(x, y) ∧ e(y, z)→ e(x, z)]]]







19. Lecture 19: Substitutions

Lecture 19: SubstitutionsFriday 16 September 2011




1. Coincidence Lemma for Formulae

2. Substitutions

3. Instantiation of Terms

4. The Substitution Lemma for Terms

5. Admissibility

6. Instantiations of Formulae

7. The Substitution Lemma for Formulae




Coincidence Lemma for FormulaeThe following lemma analogous to lemma 18.6 may be provedby induction on the structure of formulae.

Lemma 19.1 Given two Σ-interpretations (A, v) and (A, v′),and a formula φ, if v(x) = v′(x) for each x ∈ FV (φ), thenT JφKv = T JφKv′.

2




Proof of The Coincidence Lemma for formulae (lemma 19.1)

Variant Notation Semantics of Formulae

Proof: By induction on the structure of φ. However the interesting cases are those of atomicpredicates and quantified formulae.

Case φ ≡ p(t1, . . . , tn) where p is an n-ary predicate symbol. For each x ∈ FV (p(t1, . . . , tn)) =⋃1≤i≤n

V ar(ti) we have v(x) = v′(x). Hence for each ti we have VJtiKv = VJtiKv′ from which we get

T Jp(t1, . . . , tn)Kv = T Jp(t1, . . . , tn)Kv′.

Case φ ≡ Qx[ψ], Q∈ ∀,∃. Assume ψ ≡ ψ(x1, . . . , xn). We consider two cases.

Sub-case x 6∈ FV (ψ). Then x 6∈ x1, . . . , xn and hence for every vx and v′x which are x-variants ofv and v′ respectively we have T Jψ(x1, . . . , xn)Kvx = T Jψ(x1, . . . , xn)Kv′x from which we obtain∏

T JψKvx | vx =\x v =∏T JψKv′x | v

′x =\x v

′and∑

T JψKvx | vx =\x v =∑T JψKv′x | v

′x =\x v

′

which implies T JφKvx = T JφKv′x.




Sub-case x ∈ FV (ψ). Then FV (ψ) = x, x1, . . . , xn. Let T (x) = T JψKvx | vx =\x v andT ′(x) = T JψKv′x | v

′x =\x v

′. Note that T (x), T ′(x) ∈ 0, 1, 0, 1. Assume for some⊙∈ ∏,∑, T JφKvx 6= T JφKv′x. Then T (x) 6= T ′(x) which implies there exists a ∈ A = |A|

such that for vx = v[x := a] and v′x = v′[x := a], T JψKvx 6= T JψKv′x. But this is impossiblesince FV (ψ) = x, x1, . . . , xn, vx(x) = a = v′x(x) and for each xi ∈ x1, . . . , xn, we havevx(xi) = v′x(xi). Hence T JφKvx = T JφKv′x. QED




SubstitutionsDefinition 19.2• A substitution θ is a (total) function θ : V → T(Σ) which is

almost everywhere the identity.• SΩ1

(V) is the set of all substitutions.• θ is called a ground substitution if FV (θ(x)) = ∅.• The domain of a substitution θ is the finite set dom(θ) =x | x 6≡ θ(x). θ acts on the variables in dom(θ).

Notes and notation.• Equivalently a substitution θ may be represented as a finite

(possibly empty) set θ = s/x | θ(x) = s 6≡ x containingonly the non-identical elements and their images under θ.




Instantiation of TermsDefinition 19.3 Let θ be a substitution.• The application of θ to a term t ∈ T(Σ) is denoted θt and

defined inductively as follows.

θy = y, y 6∈ dom(θ)θx = θ(x), x ∈ dom(θ)θf (t1, . . . , tm) = f (θt1, . . . , θtm), f : sm→ s ∈ Σ

• θt is called a (substitution) instance of t.




The Substitution Lemma for TermsLemma 19.4 Given a Σ-interpretation (A, v), a term t and asubstitution s/x, let VJsKv = a ∈ |A|. Then

VJs/xtKv = VJtKv[x:=a]

2




Proof of The Substitution Lemma for Terms 19.4

Proof: By induction on the structure of t.

Case t ≡ x. Then s/xx ≡ s and we have VJs/xtKv = VJsKv = a = VJtKv[x:=a].

Case t ≡ y 6≡ x. Then s/xy ≡ y and VJs/xtKv = VJyKv = VJtKv[x:=a] since v(y) = v[x := a](y.

Case t ≡ f (t1, . . . , tm). Then

VJs/xtKv= VJs/xf (t1, . . . , tm)Kv= VJf (s/xt1, . . . , s/xtm)Kv= fA(VJs/xt1Kv, . . . ,VJs/xtmKv)= fA(VJt1Kv[x:=a], . . . ,VJtmKv[x:=a]) By the induction hypothesis= VJf (t1, . . . , tm)Kv[x:=a]

QED




AdmissibilityThe occurrence of bound variables in formulae requires carefulhandling when substitutions are applied to formulae. Intuitively,an element s/x ∈ θ is admissible in a formula φ if the variablesof s remain free after instantiating the formula.Definition 19.5 Let θ be a substitution• An element s/x ∈ θ is admissible in

– p(t1, . . . , tn),– ¬φ if it is admissible in φ,– (φ ψ) if it is admissible in both φ and ψ– Qx[φ],– Qy[φ] if x 6≡ y, y 6∈ FV (s) and s/x is admissible in φ.• θ is admissible in φ if every element of θ is admissible in φ.




Instantiations of FormulaeDefinition 19.6 Let θ be a substitution.• The application of θ to a formula φ ∈ P1(Σ) is denoted θφ and

defined inductively as follows.

θp(t1, . . . , tn) = p(θt1 . . . , θtn), p : sn ∈ Σθ¬ψ = ¬(θφ),θ(ψ χ) = (θψ θχ), ∈ ∧,∨,→,↔θ Qx[ψ] = Qx[θ′ψ], θ′ = θ − θ(x)/x, Q∈ ∀,∃




The Substitution Lemma for FormulaeLemma 19.7 Given a Σ-interpretation (A, v), a formula φ andan admissible substitution s/x, let VJsKv = a ∈ |A|. ThenT Js/xφKv = T JφKv[x:=a].

2




Proof of The Substitution Lemma for Formulae 19.7

Proof: Case x 6∈ FV (φ). Then it trivially follows that s/xφ ≡ φ and T Js/xφKv = T JφKvx forall x-variants of v.

Case x ∈ FV (φ). We proceed by induction on the structure of φ.

Sub-case φ ≡ p(t1, . . . , tn). We have

T Js/xφKv= T Js/xp(t1, . . . , tn)Kv= T Jp(s/xt1, . . . , s/xtn)Kv= T Jp(t1, . . . , tn)Kv[x:=a] By lemma 19.4

The sub-cases involving the propositional connectives are trivial and the only interesting cases leftare those of quantified formulae.

Sub-case φ ≡ Qy[ψ]. Since x ∈ FV (φ) clearly x 6≡ y and hence x ∈ FV (ψ). Further since s/xis admissible in φ, y 6∈ V ar(s). This implies that

1. s/x is admissible in ψ,

2. s/xφ ≡ Qy[s/xψ] and




3. a = VJsKv = VJsKv[y:=b] for arbitrary b ∈ |A| since y 6∈ V ar(s).

By the induction hypothesis for any b ∈ |A| we have

T Js/xψKv[y:=b] = T JψKv[y:=b][x:=a] = T JψKv[x:=a][y:=b]

from which we get

T Js/xψKvy | vy =\y v = T JψKv[x:=a]y | v[x := a]y =\y v[x := a]

which implies for any quantifier T Js/xφKv = T JφKv[x:=a] regardless of the quantifier involved.QED

Exercise 19.1

1. Let (A1, v1) be a Σ1-interpretation and (A2, v2) a Σ2-interpretation such that |A1| = |A2|. LetΣ = Σ1 ∩ Σ2.

(a) Let t be a Σ-term. Prove that if (A1, v1) and (A2, v2) agree on the symbols occurring in t,then VA1JtKv1 = VA2JtKv2.

(b) Let φ be a Σ-formula. Prove that (A1, v1) and (A2, v2) agree on the symbols occurring in φ,then TA1JφKv1 = TA2JφKv2.







20. Lecture 20: Models, Satisfiability and Validity

Lecture 20: Models, Satisfiability andValidity





1. Satisfiability

2. Models and Consistency

3. Examples of Models:1



6. Logical Consequence

7. Validity

8. Validity of Sets of Formulae

9. Negations of Semantical Concepts




SatisfiabilityDefinition 20.1 (Satisfaction).• A Σ-interpretation (A, v) satisfies a Σ-formula φ denoted

(A, v) φ if and only if T JφKv = 1.• φ is said to be satisfiable in A if there is a valuation v such

that (A, v) φ.• A Σ-formula φ is satisfiable if there exists a Σ-interpretation

that satisfies it.




Models and ConsistencyDefinition 20.2 (Models).• A Σ-structure A is a model (or more accurately a Σ-model) of

a Σ-formula φ ∈ P1(Σ) (denoted A φ) if and only if for allvaluations v, (A, v) φ.• For a nonempty set Φ of Σ-formulas,

– a Σ-structure A is a Σ-model of Φ, (denoted A Φ) if andonly if it is a model of every formula in Φ.

– Φ is said to be an axiom system for all models of Φ.– Φ is said to be consistent iff it has a Σ-model.




Examples of Models:1Example 20.3 Let Σ = 0 :−→ s,+1 : s −→ s,= : s2, < : s2, let

N = 〈N; 0,+1; =, <〉

be the Σ-structure whereN is the set of naturals, +1 is the unarysuccessor function, = is the atomic binary equality predicateand < is the binary “less-than” predicate. Let

φ1df= ¬(+1(x) = 0) φ2

df= (+1(x) = +1(y))→ (x = y)

φ3df= ∀x∃y[y = +1(x)] φ4

df= ∀x[¬(x = 0)→ ∃y[x = +1(y)]]

We have N Φ where Φ = φ1, φ2, φ3, φ4




Examples of Models:2Example 20.4 Let Σ = 0 :−→ s,+ : s2 −→ s,= : s2 and let Zbe the Σ-structure

Z = 〈Z; 0,+; =〉where Z is the set of integers, 0 and + represent the integerzero and the binary addition operation respectively, and = is theatomic binary equality predicate. Z is a model of the followingset Φ of formulae

φassociativedf= ∀x, y, z[(x + y) + z = x + (y + z)] (10)

φidentitydf= ∀x[x + 0 = x] (11)

φright−inversedf= ∀x∃y[x + y = 0] (12)




Examples of Models:3Example 20.5 The set Φ defined in the previous example is theset of axioms which defines the notion of a group in algebra.The addition of an extra axiom

φcommutativedf= ∀x, y[x + y = y + x] (13)

i.e. Φ′ = Φ ∪ φcommutative excludes all non-commutativegroups from the models of the set Φ′.




Logical ConsequenceDefinition 20.6 A Σ-formula ψ is a logical consequence of a setΦ of Σ-formulae, denoted Φ |= ψ, if and only if every model Aof Φ is also a model of ψ. If Φ is empty then ψ is said to belogically valid (denoted |= ψ).

Example 20.7 Let Σ be the signature in example 20.4 and let

the formula φleft−inversedf= ∀x∃y[y + x = 0]. A typical proof in a

mathematics text that φleft−inverse is a logical consequence ofthe axioms of group theory (example 20.4) might go as follows.




Proof: Let x be any element. Then by axiom φright−inverse there exists y such that

x + y = 0 (14)

Again by φright−inverse for some z we get

y + z = 0 (15)

We then havey + x = (y + x) + 0 (φidentity)

= (y + x) + (y + z) (15)

= y + (x + (y + z)) (φassociative)

= y + ((x + y) + z) (φassociative)

= y + (0 + z) (14)

= (y + 0) + z (φassociative)

= y + z (φidentity)

= 0 (15)

QED

Effectively from the group axioms we have extracted fresh knowledge about groups in generalnamely that, the existence of a right inverse for each element of the group implies the existenceof a left-inverse. In fact, by replacing the opening line of the proof by




Let x and y be elements such that

we could claim that the following formula

φleft−right−inversedf= ∀x, y[(x + y = 0)→ (y + x = 0)]

is also a logical consequence of the group axioms.




ValidityDefinition 20.8 (Validity). Let A be a Σ-structure, φ and ψ beΣ-formulae.• (A φ). φ is valid in A if and only if A is a model of φ.• (A φ). φ is valid in a class A of Σ-structures if it is valid in

each structure A ∈ A.• ( φ). φ is (logically) valid if and only if every Σ-structure is a

model of φ.• (φ⇔ ψ). φ is (logically) equivalent to ψ if φ↔ ψ.




Validity of Sets of FormulaeDefinition 20.9 (Validity of a set of formulae). Let Φ be a set ofΣ-formulae.• (A Φ). Φ is valid in A if and only if A is a model of eachφ ∈ Φ.• (A Φ). Φ is valid in a class A of Σ-structures if A Φ for

each A ∈ A.• ( Φ). Φ is valid if and only if every Σ-structure is a model of

each φ ∈ Φ.




We have deviated in our meta-logical notation from standard textbooks. Notice from the definitionsof logical consequence, logical validity and validity of sets of formulae that Φ |= ψ if and only if forevery structure Σ-structure A, A Φ implies A ψ. Most textbooks on first-order logic actuallyuse only one overloaded symbol |= for both concepts. However we have decided to keep themseparate since the concept of logical consequence involves sets of formulae of a formal languagewhereas the other refers specifically to models.

Notice also that for any formula φ, both φ and |= φ denote that φ is (logically) valid. By extension,therefore logical equivalence defined as φ↔ ψ may equally well be defined as |= φ↔ ψ.




Negations of Semantical Concepts•We use the symbols 6 , 6|=, 6⇔ to denote the negations of the

corresponding relations.




Exercise 20.1

1. Prove that A is a model of φ iff A is a model of ~∀[φ].

2. Prove that Φ |= ψ iff ~∀[Φ] |= ~∀[φ], where ~∀[Φ] denotes the universal closure of each formula inΦ.

3. Prove that φ if and only if |= φ (i.e. ∅ |= φ). Hence in most books on logic, the same symbol|= is used for both logical consequence and for validity in models).

4. Prove that φ if and only if ~∀[φ].

5. Prove that φ is satisfiable in A if and only if ~∃[φ] is satisfiable in A.

6. Show that φ ∨ ¬φ is valid for any formula φ.

7. In general every first-order logic formula which has a tautological “shape” in propositionallogic is a valid formula. Formalize this notion and prove it.

8. Prove that ∀ and ∃ are “duals” i.e. show that

(a) ∀x[φ]↔ ¬∃x[¬φ]




(b) ∃x[φ]↔ ¬∀x[¬φ]

9. Let Qx[ψ] ∈ SF (φ) and y 6∈ FV (ψ). Let φ′ be obtained from φ by replacing Qx[ψ] by

Qy[y/xψ]. Then φ and φ′ are said to be α-equivalent and denoted by φ ≡α φ′. Prove that twoα-equivalent formulae are equivalent.

10. If x 6∈ FV (ψ) then Qx[ψ] is equvalent to ψ.

11. Give examples of interpretations (A, v) to show that the following formulae are not valid.

(a) ∃x[ψ]→ ∀x[ψ]

(b) ∀x∃y[ψ(x, y)]→ ∃y∀x[ψ(x, y)]

12. Prove that for any binary predicate ψ,

∀x, y[ψ(x, y)→ ψ(y, x)]⇔ ∀x, y[ψ(x, y)↔ ψ(y, x)]

13. Prove that the following formulae are valid.

(a) ∃x[ψ(x)→ ∀x[ψ(x)]].

(b) ∃x∀y[ψ(x, y)]→ ∀y∃x[ψ(x, y)]

14. Prove that φ⇔ ψ iff φ |= ψ and ψ |= φ.




15. A student claimed that the following definition is a stronger definition of logical consequencethan definition 20.6. Justify or refute his claim.

Definition 20.10 A Σ-formula ψ is a logical consequence’ of a set Φ of Σ-formulae, denotedΦ |=′ ψ, if and only if for every interpretation (A, v), (A, v) φ for each φ ∈ Φ implies(A, v) ψ.




20.1. Some Model Theory

When we consider the notion of a model, we come across various notions which express propertiesof the models. For example, if < denotes an irreflexive ordering relation, then a sentence such as∀x∃y[x < y] specifies that there is no greatest element in the ordering. Such a sentence therefore hasno finite models. Its negation however has only finite models. Another sentence such as ∀x∀y[x = y]

in the language of first-order logic with equality has only singleton models.

If all models of a set Φ of sentences are finite then there must be a finite bound on the size of themodels (i.e. the carrier set of the structure must be of a fixed finite cardinality). Otherwise, as thefollowing theorem shows there would also be infinite models. For instance as Enderton [3] states:

It is a priori conceivable that there might be some very subtle equation of group theory thatwas true in every finite group but false in every infinite group.

But theorem 20.11 assures us that such a possibility does not exist.

Theorem 20.11 If a set Φ of Σ-formulae has arbitrarily large finite models, then it has an infinitemodel.




Proof: Assume Φ has arbirarily large finite models. Now consider the following formulae ψk foreach k ≥ 2,

ψ2df= ∃x1, x2[¬(x1 = x2)]

ψ3df= ∃x1, x2, x3[¬(x1 = x2) ∧ ¬(x1 = x3) ∧ ¬(x2 = x3)]

... ... ...where each ψk states that there are at least k distinct elements in the model. Now consider the setΨ = Φ ∪ ψk | k ≥ 2. By hypothesis, every finite subset of Ψ has a model. By the compactnesstheorem therefore Ψ also has a model. However clearly no model of Ψ can be finite. Hence Ψ musthave an infinite model. This infinite model is also a model of Φ since Φ ⊂ Ψ. QED







21. Lecture 21: Structures and Substructures

Lecture 21: Structures andSubstructures

Wednesday 21 Septemeber 2011




1. Satisfiability and Expansions

2. Distinguishability

3. Evaluations under Different Structures

4. Isomorphic Structures

5. The Isomorphism Lemma

6. Substructures

7. Substructure Examples

8. Quantifier-free Formulae

9. Lemma on Quantifier-free Formulae

10. Universal and Existential Formulae

11. The Substructure Lemma




Satisfiability and ExpansionsOur notions of satisfiability, consequence and validity are allwith respect to a specific signature.Lemma 21.1 Let Σ1 ⊆ Σ2. For any set Φ of Σ1-formulae, Φ issatisfiable with respect to Σ1 iff Φ is satisfiable with respect toΣ2.Proof: Essentially the interpretations of the common symbolsshould coincide, whereas the interpretations of the symbols inΣ2 − Σ1 may be arbitrary. It then follows from problem 1 ofexercise 19.1. QED




DistinguishabilityExample 21.2 For Σ = = : s2, < : s2 consider the Σ-structures Z = 〈Z; ; =, <〉 and Q = 〈Q; ; =, <〉 where Z is theset of integers, Q is the set of rational numbers, < are respec-tively the “less-than” relations on the two sets respectively. Nowconsider the formula

φdensitydf= ∀x, z[x < z → ∃y[(x < y) ∧ (y < z)]

Clearly Q φdensity whereas Z 6 φdensity.φdensity distinguishes the two structures.




Evaluations under Different StructuresWhen evaluating terms and formulae under different structuressay A and B using valuations vA : V −→ A and vB : V −→ Bwhere A = |A| and B = |B| we use VA, TA and VB, TB re-spectively to distinguish the possibly different values and truthvalues.




Isomorphic StructuresExample 21.3 Let Σ = 0 :−→ s,+1 : s −→ s,= : s2, < : s2.Now consider the structures

N = 〈N; 0,+1; =, <〉2N = 〈2N; 0,+2; =, <〉

where 2N is the set of even natural numbers, +1, +2 denotethe respective “successor” functions, < is the usual “less-than”relation on both structures. The two structures are clearly iso-morphic and there is an isomorphism which π : N −→ 2N suchthat π(n) = 2n which along with the inverse map π−1(2n) = nmaps one structure exactly onto the other.

Isomorphic Σ-structures cannot be distinguished by P1(Σ).




The Isomorphism LemmaLemma 21.4 (The Isomorphism Lemma). If A and B are iso-morphic Σ-structures then for all formulae φ, A φ if and onlyif B φ.

2

Corollary 21.5 If π : A ∼= B then for any formula φ(x1, . . . , xn)and any valuation vA, vB and values a1, . . . , an ∈ |A|,

(A, vA[x1 := a1 . . . , xn := an]) φiff

(B, vB[x1 := π(a1) . . . , xn := π(an)]) φ




Proof of the Isomorphism Lemma 21.4

Proof: Assume there is an isomorphism π : A ∼= B. Since π is an isomorphism, π : A1-1−−→

ontoB

is a bijection that preserves structure, where A = |A| and B = |B|. Hence π−1 : B1-1−−→

ontoA also

exists and preserves strucure. Further we have

1. π(fA(a1, . . . , am)) = fB(π(a1), . . . , π(am)) for every m-ary function fA,

2. π−1(fB(b1, . . . , bm)) = fA(π−1(b1), . . . , π−1(bm)) for every m-ary function fB,

3. (a1, . . . , an) ∈ pA iff (π(a1), . . . , π(an)) ∈ pB for each n-ary relation pA.

4. (b1, . . . , bn) ∈ pB iff (π−1(b1), . . . , π−1(bn)) ∈ pA for each n-ary relation pB.

Further for each vA : V −→ A, π vA : V −→ B and for each vB : V −→ B, π−1 vB : V −→ A.

For every formula φ, we may prove the stronger claims,

TAJφKvA = TBJφKπvATBJφKvB = TAJφKπ−1vB

The proof is by induction on the structure of φ and is left as an exercise to the interested reader.QED




The Isomorphism Lemma then raises the following interesting question which we will answer later.

Question. Are Σ-structures which satisfy the same Σ-formulae isomorphic?

But regardless of the answer to the above question the isomorphism lemma also brings the realiza-tion that for a any given set of Σ-formulae there could be more than one model, in fact a class ofmodels, for a given nonempty set Φ of Σ-formulae. Our notions of validity may therefore includethe following besides those given in definition 20.8.




SubstructuresDefinition 21.6 A Σ-structure A is a substructure of another Σ-structure B (denoted A ⊆ B) if• ∅ 6= |A| ⊆ |B|,• for each f : sm −→ s ∈ Σ, fA = fB |A|m,• for each p : sn ∈ Σ, pA = pB ∩ |A|n.

where fB |A|m denotes the restriction of fB to elements of|A|m.

Facts 21.7 If A ⊆ B, then1. |A| is Σ-closed, i.e. for each f : sm −→ s ∈ Σ and each

(a1, . . . , am) ∈ |A|m, fB(a1, . . . , am) ∈ |A|.2. Conversely for each X ⊆ |B|, such that X is Σ-closed there

exists a unique Σ-substructure X ⊆ B.




Substructure ExamplesExample 21.81. N = 〈N; 0,+; =〉 is a substructure of Z = 〈Z; 0,+; =〉 which in

turn is a substructure of Q = 〈Q; 0,+; =〉 which in turn is asubstructure of R = 〈R; 0,+; =〉.

2. 2N = 〈2N; 0,+; =〉 is a substructure of N = 〈N; 0,+; =〉.3. However the odd numbers are not closed under addition and

hence they do not form a substructure of N under addition.




Quantifier-free FormulaeDefinition 21.9 For any signature Σ, the set QF1(Σ) ofquantifier-free formulae is given by the following grammar

χ, ξ ::= ⊥ | >| p(t1, . . . , tn) ∈ A(Σ) | (¬χ)| (χ ∧ ξ) | (χ ∨ ξ)| (χ→ ξ) | (χ↔ ξ)




Lemma on Quantifier-free FormulaeLemma 21.10 Let A ⊆ B be Σ-structures and let vA be anyvaluation in A. Then for every Σ-term t and every quantifier-free formula χ

VAJtKvA = VBJtKvATAJχKvA = TBJχKvA

Proof: Use the Isomorphism lemma 21.4 on the commonsubset of the two carrier sets using the identity isomorphism.QED




Universal and Existential FormulaeDefinition 21.11 For any signature Σ, the set Q

1(Σ) of Q-formulae for each Q∈ ∀,∃ is defined inductively by the fol-lowing grammar

φ, ψ ::= χ ∈ QF1(Σ)| (φ ∧ ψ) | (φ ∨ ψ)| Qx[φ]

∀1(Σ) is the set of universal Σ-formulae and ∃1(Σ) is the set ofexistential Σ-formulae.




The Substructure LemmaLemma 21.12 (The Substructure Lemma). Let A ⊆ B be Σ-structures and let φ(x1, . . . , xn) ∈ ∀1(Σ). Then for any valuationsvA, vB and a1, . . . , an ∈ A,

if (B, vB[x1 := a1, · · · , xn := an]) φthen (A, vA[x1 := a1, · · · , xn := an]) φ

Proof: By induction on the structure of universal formulae.QED




Exercise 21.1

1. Theorem 21.13 (The Homomorphism Theorem). Let $ : A −→ B be a homomorphism froma structure A into a structure B and let vA be a valuation.

(a) For any term t, VBJtK$vA = $(VAJtKvA)

(b) If $ is injective then for any quantifier-free formula χ, TBJχK$vA = $(TAJχKvA)

Prove theorem 21.13.

2. Prove that if $ is not surjective, the two structures may be distinguished by a formula.

3. In part 1b of theorem 21.13 why is the condition of injectivity necessary? (Hint. Let = be theatomic binary equality predicate whose semantics is defined in an obvious fashion. Now showthat if $ is not injective then the two structures can be distinguished using equality.)

4. Prove that if $ is surjective but not necessarily injective, then in the absence of any predicatefrom which equality or inequality of terms may be expressed or derived, part 1b of theorem 21.13may be extended to quantified formulae.







22. Lecture 22: Predicate Logic: Proof Theory

Lecture 22: Predicate Logic: ProofTheory

Friday 23 September 2011




1. Proof Theory: First-Order Logic

2. Proof Rules: Hilbert-Style

3. The Mortality of Socrates

4. The Mortality of the Greeks

5. Faulty Proof:2

6. A Correct Proof

7. The Sequent Forms

8. The Case of Equality

9. Semantics of Equality

10. Axioms for Equality

11. Symmetry and Transitivity

12. Symmetry of Equality

13. Transitivity of Equality




Proof Theory: First-Order Logic1. A first-order proof system for a mathematical theory with sig-

nature Σ consists of two partsLogical Axioms and Rules These are extensions of existing

propositional proof systems to deal with quantification.However, they are still parameterised on Σ since the useof terms and substitution is involved in them.

Non-logical axioms These are the axioms which define themodels to which they are applicable.

2. (First-order) Predicate Calculus: The first-order theory withno non-logical axioms.




Proof Rules: Hilbert-StyleDefinition 22.1 H1(Σ), the Hilbert-style proof system for Pred-icate logic consists of• The set L1(Σ) generated from A and ¬,→,∀• The three logical axiom schemas K, S and N,• The two axiom schemas

∀E.∀x[X ]→ t/xX

, t ≡ x or t/x admissible in X

∀D.∀x[X → Y ]→ (X → ∀x[Y ])

, x 6∈ FV (X)

• The modus ponens (MP) rule and

∀I. y/xX∀x[X ]

, y ≡ x or y 6∈ FV (X)




Admissibility of Substitutions

Some explanations are perhaps in order regarding the “side conditions” attached to the axioms above.

∀E. The side condition “t ≡ x” allows the subsituted term to be the variable x, which is not surpris-ing. But this is explicitly mentioned only because the side condition “t/x admissible in X”does not include the possibility of replacing x by itself.But more important is the side condition “t/x admissible in X”. If this condition were notpresent then we could have the following situation which would be clearly unsound.

Let p(x, y) be a binary predicate and let φ(x)df= ¬∀y[p(x, y)].

Now clearly y/x is not admissible in φ(x). If the condition of admissibility were absent thenwe could have replaced the free variable x with y yielding the following instance of axiomschema ∀E

∀x[¬∀y[p(x, y)]]→ ¬∀y[p(y, y)] (16)

Now consider any model consisting of at least two distinct elements and let p(x, y) denote theidentity relation on the carrier set. Clearly in such a model while the antecedent ∀x[¬∀y[p(x, y)]]

is clearly valid (true for each valuation), the consequent ¬∀y[p(y, y)] would be invalid (false for




certain valuations).

∀D. In this axiom schema if we were to relax the side-condition and allow the quantifier to be movedin even if x ∈ FV (X) it would result in the following situation when X = φ(x) = Y .

∀x[φ(x)→ φ(x)]→ (φ(x)→ ∀x[φ(x)]) (17)

Whereas the antecedent of the formula in (17) is logically valid, the truth of the consequent isnot guaranteed in any interpretation in which φ(x) is true only for some elements of the domainand false for others.

∀I. The hypothesis of the rule asserts that a formula φ(· · ·, x, · · ·) holds when the free variable x isreplaced uniformly throughout the formula by any variable that does not occur free in φ. Thenclearly the formula holds for any “arbitrary” value that x may take, and hence the variable x maybe universally quantified.




The Mortality of SocratesThe Mortality of Socrates

A translation into predicate logic involves• including two unary atomic predicates h and m denoting

properties of objects (human and mortal respectively),• including a constant s (for Socrates).

∀x[h(x)→ m(x)], h(s) ` m(s)

∀x[h(x)→ m(x)]∀E h(s)→ m(s) h(s)MP m(s)




The Mortality of the GreeksAll humans are mortal.All Greeks are human.Therefore all Greeks are mortal.

∀x[h(x)→ m(x)],∀x[g(x)→ h(x)] ` ∀x[g(x)→ m(x)]

A faulty proof:

∀x[g(x)→ h(x)] ∀x[h(x)→ m(x)]T⇒ ∀x[g(x)→ m(x)]




Faulty Proof:2∀x[g(x)→ h(x)]

∀E g(y)→ h(y)∀x[h(x)→ m(x)]

∀E h(z)→ m(z)?? g(y)→ m(y)

∀I ∀x[g(x)→ m(x)]




A Correct Proof∀x[g(x)→ h(x)]

∀E g(y)→ h(y)∀x[h(x)→ m(x)]

∀E h(y)→ m(y)T⇒ g(y)→ m(y)

∀I ∀x[g(x)→ m(x)]




The Sequent FormsThe sequent forms of K, S, N, MP are as before. Thesequent forms of the quantification rules are as follows:

∀E.Γ ` ∀x[X ]→ t/xX

, t ≡ x or t/x admissible in X

∀D.Γ ` ∀x[X → Y ]→ (X → ∀x[Y ])

, x 6∈ FV (X)

∀I. Γ ` y/xXΓ ` ∀x[X ]

, y 6∈ FV (X) ∪ FV (Γ)

Note that the variable y being quantified should not occur freein any of the assumptions Γ.




The Case of Equality1. In most algebraic formulations equality is a necessary binary

predicate of the signature.2. In other cases where equality may not otherwise play a

prominent role, it becomes necessary because of one ormore of the following reasons.

(a) Syntactically distinct terms may represent the same value ina structure either because of some axioms or because ofsome identifications made in a valuation i.e. it is possiblethat even though s 6≡ t, VAJsKvA = VAJtKvA. (see alsoexercise 21.1).

(b) Two differently named entities are proven to be the sameentity (generally in proofs of uniqueness or in proofs bycontradiction).




Generally in mathematics, if there are two named entities x and y and nothing is stated about therelationship between them, we can neither assume they stand for distinct entities nor can we excludethe possibility that they both stand for the same entity.

For instance, it becomes essential when speaking of models which contain at least two elements, tomake a first-order statement like ∃x∃y[¬(x = y)] which states that there exist at least two distinctobjects.

Example 22.2 Consider any first-order theory of boolean algebra. If we allow for the possibilitythat 1 = 0 every equation of boolean algebra would still be satsified and the boolean identitieswould still continue to hold. But in addition we would also have 0 ≤. 1 ≤. 0. If we were to adoptthis as a model for truth, then every formula would be implied by every other formula and the wholeedifice of mathmatical logic as we know it would collapse to a triviality without the assumption thatthe value 1 is distinct from the value 0.

On the other hand, it may so happen that one may define properties of objects and prove a theoremstating in effect that if the two objects x and y both satisfy a certain property φ, then they are boththe same object. This is usually expressed as (φ(x) ∧ φ(y))→ (x = y).

Example 22.3 The definition of injective functions in any standard mathematics text uses equality




or inequality in its definition. A function f : A −→ B is injective if for any two distinct elementsa, a′ such that a 6= a′, f (a) 6= f (a′). Alternatively, f is injective if f (a) = f (a′) implies a = a′.

The use of equality or inequality in all such situations is inescapable. Hence equality has a specialplace in mathematics and logic and is usually required as a basic relation between named entities,even when it is not primarily a relation of interest in the models that are being studied.




Semantics of EqualityThe semantics of the binary infix atomic predicate = is definedas follows:

TAJs = tKvAdf=

1 if VAJsKvA=VAJtKvA0 otherwise

In the sequel we do not explicitly include equality in the signa-ture, but assume that it is present as part of the language ofFirst-order Predicate Logic with Equality.(First-order Predicate Calculus with Equality) The first-ordertheory with no non-logical axioms except the axioms for equal-ity.




Axioms for EqualityEquality usually is a reflexive, symmetric, transitive and substi-tutive relation on structures. However the following axioms aresufficient.

= R.t = t

= C.(s = t)→ (s/xu = t/xu)

= S.(s = t)→ (s/xX → t/xX)

, s/x, t/x admissible in X




Explanations.

=R This is the reflexivity axiom schema, which asserts that any term equals itself.

=C This is the congruence axiom schema and it asserts that equals may be substituted for equals inany term context.

=S This is the substitutivity axiom schema which again asserts that the replacement of equals byequals does not alter the truth value of formulae. However due to the presence of quantifiers andbound variables one must ensure that the substitution of equals by equals does not result in thecapture of free variables of either s or t.




Symmetry and TransitivityThe rule of substitutivity (=S) is sufficiently powerful to force theproperties of symmetry and transitivity with the help of reflexiv-ity and modus ponens.

= Sym.Γ ` s=tΓ ` t=s

= T.

Γ ` s=tΓ ` t=uΓ ` s=u




Symmetry of EqualityProof of derived rule =Sym

Let Γ = s = t and Let φdf= x = s. Then we have

s/xφ ≡ s = st/xφ ≡ t = s

Γ ` s = t =S Γ ` s = t→ (s = s→ t = s)MP Γ ` s = s→ t = s =R Γ ` s = sMP Γ ` t = s




Transitivity of EqualityProof of derived rule =TLet ∆ = s = t, t = u and ψ ≡ s = x. Then

=S ∆ ` t = u→ (s = t→ s = u) ∆ ` t = uMP ∆ ` s = t→ s = u ∆ ` s = tMP ∆ ` s = u










23. Lecture 23: Predicate Logic: Proof Theory (Contd.)

Lecture 23: Predicate Logic: ProofTheory (Contd.)





1. Alpha Conversion

2. The Deduction Theorem for Predicate Calculus

3. Useful Corollaries

4. Soundness of Predicate Calculus

5. Soundness of The Hilbert System




Alpha ConversionNotation We use “φ a` ψ” as an abbreviation for the two state-ments “φ ` ψ” and “ψ ` φ”.

Lemma 23.1 For every formula φ for which y/x is admissible∀x[φ] a` ∀y[y/xφ]

Proof: If y/x is admissible in φ then we may readily see thatx/y is also admissible in y/xφ since x 6∈ FV (y/xφ) and

x/yy/xφ ≡ φ

Further since x 6∈ FV (∀x[φ]) we have by axiom schema ∀E∀x[φ]→ φ i.e.

∀x[φ]→ x/yy/xφWe then have the following proofs. QED




Proof trees of ∀x[φ] a` ∀y[y/xφ]

∀x[φ] ` ∀x[φ]∀E ∀x[φ] ` ∀x[φ]→ x/yy/xφ

MP ∀x[φ] ` x/yy/xφ∀I ∀x[φ] ` ∀y[y/xφ]

∀y[y/xφ] ` ∀y[y/xφ]∀E ∀y[y/xφ] ` ∀y[y/xφ]→ y/xφ

MP ∀y[y/xφ] ` y/xφ∀I ∀y[y/xφ] ` ∀x[φ]




The Deduction Theorem for PredicateCalculus

Theorem 23.2 (The Deduction Theorem for Predicate Calculus).Let Γ ⊆f L1 and φ, ψ ∈ L1.1. (DT⇐). If Γ ` φ→ ψ then Γ, φ ` ψ.2. (DT⇒). Let Γ, φ ` ψ.Then Γ ` φ→ ψ if no variable in FV (φ)

is generalized (by an application of ∀ I) in the proof.

2




Proof of the Deduction Theorem for Predicate Calculus (theorem 23.2)

Proof:

1. (DT⇐). The proof is identical to that of the corresponding propositional case.

2. (DT⇒). Assume Γ, φ ` ψ. Then there exists a proof tree T rooted at ψ with nodes ψ1, . . . , ψm ≡ψ. Then the stronger claim that each step of the proof of ψi can be matched by a proof of φ→ ψiis again proven here by induction on k = `(ψ1)− `(ψi). The proof proceeds in a manner similarto the corresponding propositional case for all applications of the propositional axioms and theinference rule (MP). We consider only the cases of quantification.If ψj for 0 < j ≤ m is an axiom (including an instance of ∀E or ∀D) or ψj ∈ Γ, then the prooftree T ′

j rooted at φ→ ψj is constructed from the proof tree Tj rooted at ψj as follows.

Tj j′

ψjj′ + 1

ψj → (φ→ ψj)j′ + 2

φ→ ψj

Suppose ψj was obtained by the application of the axiom schema ∀I on some ψi such that `(ψi) >`(ψj). Then ψj ≡ ∀x[ψi]. By the induction hypothesis there exists a proof tree T ′

i rooted at




φ→ ψi and such that no free free variable of φ has been generalized in the application. Furtherx 6∈ FV (φ). We may now extend T ′

i to a proof tree T ′j as follows.

T ′i

i′ φ→ ψii′ + 1 ∀x[φ→ ψi]

i′ + 2 ∀x[φ→ ψi]→ (φ→ ∀x[ψi])i′ + 3

φ→ ∀x[ψi] ≡ φ→ ψj

QED




Useful CorollariesCorollary 23.3 If the proof of Γ, φ ` ψ involves no generalizationof any free variable of φ then Γ ` φ→ ψ.

Corollary 23.4 If φ is a closed formula and Γ, φ ` ψ then Γ `φ→ ψ.

Corollary 23.5 If no free variable of Γ = φ1, . . . , φm is gener-alized in a proof of Γ ` ψ, then ` φ1→ · · · → φm→ ψ.




Soundness of Predicate CalculusProposition 23.6 Every wff of L1 which is an instance of a tau-tology of Propositional logic is a theorem of PC and may beobtained using only the axiom schemas K, S, N and the ruleMP.

Theorem 23.7 (Consistency of Th(PC)). The theory PC is con-sistent i.e. the set Th(PC) of theorems of PC form a consistentset.

2




Proof of theorem 23.7

Proof: Define the erasure of a formula e(φ) as the propositional formula obtained by deleting allterms and all quantifiers and retaining only the atomic predicate symbols and propositional connec-tives. The function e may be defined by induction on the structure of the formula φ.

• Claim 0. For each instance of the axioms of H1, the erasure of the formulae yields tautologies(of propositional logic)

• Claim 1. The erasure of each application of the rules of H1, preserves tautologous-ness, i.e. ifthe erasure of the premises of a rule are all propositional tautologies then so is the erasure of theconclusion.

• Claim 2. The erasure of every formula in Th(PC) is a tautology.

If Th(PC) were inconsistent then Th(PC) = L1. In particular, there exist formulae φ,¬φ ∈Th(PC). But then by the definition of erasure we have e(¬φ) ≡ ¬e(φ) which contradicts Claim 2.QED




Soundness of The Hilbert SystemTheorem 23.8 (Soundness of H1). Every theorem of H1 is logi-cally valid.

2





Proof: Here are some easily proven claims which hold for any interpretation of Σ-formulae andhence also hold for the predicate calculus.

• Claim 1 Every instance of a (propositional) tautology is true for every interpretation.

• Claim 2 All instances of the axioms ∀E and ∀D are logically valid.

• Claim 3 The rules MP and ∀I preserve logical validity i.e. if the premises of the rules are logicallyvalid formulae under every Σ-interpretation then so are the conclusions.

It then follows by induction on the structure of the proof tree that every theorem is logically valid.QED

Exercise 23.1

1. Prove the arguments in Problem 2 of exercise 17.1 using the system H1.

2. Prove the conclusions of Problem 2 in exercise 25.1 using only the proof rules of H1.

3. Let ψ ∈ SF (φ). If φ′ is obtained from φ by replacing zero or more occurrences of ψ by ψ′. Thenprove the following.




(a) ~∀[ψ ↔ ψ′] ` φ↔ φ′

(b) ψ ↔ ψ′ ` φ↔ φ′

(c) We also have the derived rule of inference ⇔ .

Γ ` ψ ↔ ψ′

Γ ` φΓ ` φ′

which is the only rule that

allows the use of a rule that applies to proper sub-formulae of a formula.

4. If φ and ψ are formulae such that x 6∈ FV (φ), then the following are theorems of H1 (∃x[φ] isan abbreviation of ¬∀x[¬φ]).

(a) ` φ→ ∀x[φ] and hence ` φ↔ ∀x[φ]

(b) ` ∃x[φ]→ φ and hence by rule ∃I ` ∃x[φ]↔ φ

(c) ` ∀x[φ→ ψ]↔ (φ→ ∀x[ψ])

(d) ` ∀x[ψ → φ]↔ (∃x[ψ]→ φ)







24. Lecture 24: Existential Quantification

Lecture 24: Existential QuantificationWednesday 28 September 2011




1. Existential Generalisation

2. Existential Elimination

3. Remarks on Existential Elimination

4. Restrictions on Existential Elimination

5. Equivalence of Proofs




Existential GeneralisationExistential quantification is the derived operator defined by

∃x[φ]df= ¬∀x[¬φ]

We then have the rules

∃I. Γ ` t/xφΓ ` ∃x[φ]

, t/x admissible in φ

∃E. Γ ` ∃x[φ]

Γ ` a/xφ, a 6∈ FV (Γ) ∪ FV (∃x[φ]) is fresh

• ∃I is a derived rule•However ∃E is not a derived rule in the Hilbert-System.




Proof of Derived rule ∃I

Assuming t/x is admissible in φ we have

DNIΓ ` t/xφ→ ¬¬t/xφ

∀EΓ ` ∀x[¬φ]→ ¬t/xφ N”

Γ ` (∀x[¬φ]→ ¬t/xφ)→ (¬¬t/xφ→ ¬∀x[¬φ])MP

Γ ` ¬¬t/xφ→ ¬∀x[¬φ]T→

Γ ` t/xφ→ ¬∀x[¬φ]

Assuming Γ ` t/xφ, the last step gives

Γ ` t/xφ→ ¬∀x[¬φ] Γ ` t/xφMP

Γ ` ¬∀x[¬φ] ≡ ∃x[φ]




Existential EliminationExistential generalisation from a constant introduced somehowinto the proof is very common in mathematics.Example 24.1 Consider a purported proof of

∃x[φ→ ψ],∀x[φ] ` ∃x[ψ]

where x may be a free variable of both φ and ψ. The standardpractice is to assume the existence of some “constant symbol”a and proceed with it to eventually generalize.




Let Γ = ∃x[φ→ ψ],∀x[φ].

A Γ `∃E ∃x[ψ] proof is as follows.

A proof using ∃E

Γ ` ∃x[φ→ ψ]∃E

Γ ` a/x(φ→ ψ) ≡ a/xφ→ a/xψΓ ` ∀x[φ]

∀EΓ ` a/xφ

MPΓ ` a/xψ

∃IΓ ` ∃x[ψ]

Unless the same “constant symbol” is used both in the application of ∃E and ∀E the proof willnot go through. Here is a proof not involving the use of the constant (which is reminiscent of aproof by contradiction) which assumes there is no value for x which will make ψ true. Let ∆ =

∀x[φ],∀x[¬ψ]




∆ ` ∀x[φ]∀E ∆ ` φ

∆ ` ∀x[¬ψ]∀E ∆ ` ¬ψ∧I (¬(φ→ ψ) ≡ φ ∧ ¬ψ)

∆ ` ¬(φ→ ψ)∀I

∆ ` ∀x[¬(φ→ ψ)]DT⇒ ∀x[φ] ` ∀x[¬ψ]→ ∀x[¬(φ→ ψ)]

Note that the application of DT⇒ is correct since no free variable of ∆ has been generalised in theproof so far. We may now proceed as follows. First from rule N’ we get

N’ ∀x[φ] ` (∀x[¬ψ]→ ∀x[¬(φ→ ψ)])→ (¬∀x[¬(φ→ ψ)]→ ¬∀x[¬ψ])

An application of MP then yields

∀x[φ] ` ¬∀x[¬(φ→ ψ)]→ ¬∀x[¬ψ]DT⇐ ∀x[φ],¬∀x[¬(φ→ ψ)] ≡ ∃x[φ→ ψ] ` ¬∀x[¬ψ] ≡ ∃x[ψ]




Remarks on Existential Elimination1. ∃E is not a derived rule.2. Proofs which utilize ∃E are denoted `∃E.3. There are some restrictions on the use of rules in `∃E proofs.4. Proofs involving existential formulae which utilize ∃E are

likely to be more direct and intuitively simpler than proofswhich avoid all uses of ∃E even for existential formulae.




Restrictions on Existential EliminationDefinition 24.2 A proof `∃E is correct provided1. Each application of ∃E should use a “fresh constant” symbol

not used previously in the proof.2. If constant symbol a (earlier introduced by an application

of ∃E to a formula ∃y[ψ]) appears in a formula a/xφ in aproof, then ∀z[a/xφ] cannot be deduced for any variablez ∈ FV (∃y[ψ]) ∩ FV (a/xφ) (by applying ∀I to a/xφ).

3. A formula a/xφ, where a is a constant symbol and x ∈FV (φ) can only be generalised to ∃x[φ].




Equivalence of ProofsTheorem 24.3 (Existential-Elimination Elimination Theorem).If Γ `∃E φ is a correct proof then Γ ` φ provided no constantsintroduced in the proof Γ `∃E φ occur in φ. i.e. if φ is provablefrom Γ by use of the ∃E rule then φ is provable from Γ withoutmaking use of the ∃E rule.

2However this theorem is not applicable if φ does contain any ofthe constants introduced by the proof Γ `∃E φ.





Proof: Let T rooted at Γ `∃E φ be a correct proof which involves one or more applications of rule

∃E. Assume there are k applications of rule ∃E in T Γ `∃E φ

.

Claim 1. For each i, 1 ≤ i ≤ k there exist proof trees

T ′i

Γ, a1/y1ψ1, · · · , ai−1/yi−1ψi−1 ` ∃yi[ψi]which are completely free from any application of rule ∃E and

T i Γ, a1/y1ψ1, · · · , ai/yiψi `∃E φ

which have (k − i) applications of rule ∃E.

Proof of claim 1. Starting from the highest level of T there exists a subtree

T1 Γ ` ∃y1[ψ1]

∃EΓ `∃E a1/y1ψ1




such that the proof T1

Γ ` ∃y1[ψ1]does not involve any application of rule ∃E. For i = 1, this is also

the required tree T ′

1 Γ ` ∃y1[ψ1]

.

By adding the formula a1/y1ψ1 to the set of assumptions Γ and removing the subtree T1

Γ ` ∃y1[ψ1]

from T we get a proof tree T 1

Γ, a1/y1ψ1 `∃E φin which there are only k − 1 applications of rule

∃E and Γ, a1/y1ψ1 `∃E a1/y1ψ1 is a leaf node of T 1.

Starting with T 1 we again remove the next application of rule ∃E viz.

T2 Γ, a1/y1ψ1 ` ∃y2[ψ2]

∃EΓ, a1/y1ψ1 `∃E a2/y2ψ2

to obtain T 2

Γ, a1/y1ψ1, a2/y2ψ2 `∃E φ




It is again clear that T2

Γ, a1/y1ψ1 ` ∃y2[ψ2]

does not involve any application of rule ∃E and is the required tree

T ′2

Γ, a1/y1ψ1 ` ∃y2[ψ2]

Proceeding in this fashion we get proof trees

T i Γ, a1/y1ψ1, · · · ai/yiψi `∃E φ

in which there are exactly k − i applications of rule ∃E and for i = k we get

T k Γ, a1/y1ψ1, · · · ak/ykψk ` φ

which is completely free from all applications of rule ∃E.

Further it is also clear that for each i, 1 ≤ i ≤ k there exist proof trees

T ′i

Γ, a1/y1ψ1, · · · ai−1/yi−1ψi−1 ` ∃yi[ψi]




which are also completely free from any application of rule ∃E.

End of proof of claim 1

By part 2 of definition 24.2 no variable z ∈⋃

1≤i≤k FV (∃yi[ψi]) is generalized anywhere in theproof of Γ `∃E φ. Hence the conditions for the application of DT⇒ hold.

By DT⇒ we get

Γ, a1/y1ψ1, . . . , ak−1/yk−1ψk−1 ` ak/ykψk → φ (18)

Since the original proof is correct by definition 24.2, there is no occurrence of any of the constantsai, 1 ≤ i ≤ k, in φ. Further the proof (18) does not utilise the constant ak as anything more thana symbol. It also does not generalise on ak anywhere within the proof. Clearly then replacing thisconstant ak by a “fresh” variable symbol zk (which occurs nowhere in any of the proofs includingproof (18)) does not affect the correctness of the proof.

Take a fresh variable zk which does not occur anywhere in the proof (18) and replace all occurrencesof ak by zk to obtain proof (19).

Γ, a1/y1ψ1, . . . , ak−1/yk−1ψk−1 ` zk/ykψk → φ (19)




Proof (19) may now be extended as follows.

Γ, a1/y1ψ1, . . . , ak−1/yk−1ψk−1 ` zk/ykψk → φ∀I

Γ, a1/y1ψ1, . . . , ak−1/yk−1ψk−1 ` ∀zk[zk/ykψk → φ]

which is again a correct proof.

Applying exercise 23.1.4d to the last step gives us

Γ, a1/y1ψ1, . . . , ak−1/yk−1ψk−1 ` ∃yk[ψk]→ φ

By claim 1 we know there exists a proof of

Γ, a1/y1ψ1, . . . , ak−1/yk−1ψk−1 ` ∃yk[ψk]

which is completely free of any application of rule ∃E. By applying rule MP we therefore get

Γ, a1/y1ψ1, . . . , ak−1/yk−1ψk−1 ` φ

By a similar process we may eliminate each of the constants ak−1 down to a1 to eventually obtain aproof Γ ` φ.

QED




Exercise 24.1

1. Use the method outlined in the proof of theorem 24.3 to transform the proof Γ `∃E ∃x[ψ] ofexample 24.1 to one without the use of rule ∃E.







25. Lecture 25: Normal Forms

Lecture 25: Normal FormsFriday 30 September 2011




1. Natural Deduction: 6

2. Moving Quantifiers

3. Quantifier Movement

4. More on Quantifier Movement

5. Prenex Normal Forms

6. The Prenex Normal Form Theorem

7. Prenex Conjunctive Normal Form

8. The Herbrand Algebra

9. Terms in a Herbrand Algebra

10. Herbrand Interpretations

11. Herbrand Models

12. Ground Quantifier-free Formulae




Natural Deduction: 6The introduction and elimination rules for the propositional op-erators along with the rules ∀I, ∀E, ∃I and ∃E comprise the sys-tem G1.

Introduction Elimination

∀ ∀I. Γ ` t/xXΓ ` ∀x[X ]

∀E. Γ ` ∀x[X ]

Γ ` t/xXt/x admissible in X

∃ ∃I. Γ ` t/xXΓ ` ∀x[X ]

∃E. Γ ` ∃x[φ]

Γ ` a/xφa 6∈ FV (Γ) ∪ FV (∃x[φ]) is fresh




Exercise 25.1

1. Prove the arguments in Problem 2 of exercise 17.1 using Natural Deduction.

2. There have been frequent complaints that Logic (of any order) is cold-blooded of the first order.Let’s dispel this notion. Consider the following premises.

All the world loves a lover. Romeo loves Juliet.

Now prove the following conclusions using Natural Deduction.

(a) Therefore I love you.

(b) Therefore Love loves Love.11

(c) Therefore if I love you, then you love me.

(d) Therefore you love yourself.

(e) Therefore everyone loves everyone.11This is of course a dirty trick!




3. Refer to the premises in Problem 2 above. Which of the conclusions becomes invalid if thepremise Romeo loves Juliet is removed? Further, does it follow that love is an equivalencerelation?




Moving QuantifiersNotation.1.−→Qx denotes a sequence of quantifiers

Q1x1

Q2x2 . . .

Qmxm

where m ≥ 0 and each Q

i ∈ ∀,∃, for 1 ≤ i ≤ m.2. For any quantifier Q∈ ∀,∃, Qdenotes its dual That is, if

Q= ∀, then Q= ∃ and if Q= ∃ then Q= ∀.




Quantifier MovementLemma 25.1 Let z 6∈ FV (φ) ∪ FV (ψ) ∪ x1, . . . , xn. Then thefollowing logical equivalences hold for Q′ ∈ ∀,∃.

1.−→Qx¬ Q′y[φ] ⇔

−→Qx Q′y[¬φ]

2.−→Qx[ Q′y[φ] ∨ ψ] ⇔

−→Qx Q′z[z/yφ ∨ ψ]

3.−→Qx[φ ∨ Q′y[ψ]] ⇔

−→Qx Q′z[φ ∨ z/yψ]




More on Quantifier MovementWe may use the above lemma to obtain prenexing rules for thepropositional connectives ∧ and → as well, as shown in thefollowing corollary. However, note the change of quantifier thatmarks the transformation of→ in the last equivalence.Corollary 25.2

4.−→Qx[ Q′y[φ] ∧ ψ] ⇔

−→Qx Q′z[z/yφ ∧ ψ]

5.−→Qx[φ ∧ Q′y[ψ]] ⇔

−→Qx Q′z[φ ∧ z/yψ]6.−→Qx[φ→ Q′y[ψ]] ⇔

−→Qx Q′z[φ→ z/yψ]7.−→Qx[ Q′y[φ]→ ψ] ⇔

−→Qx Q′z[z/yφ→ ψ]




Exercise 25.2

1. Prove lemma 25.1.

2. Prove corollary 25.2.

Having obtained the above results we are now ready to prove the Prenex normal form theorem. But first let’s define theform precisely.




Prenex Normal FormsDefinition 25.3 The set PNF1(Σ) of prenex normal forms isgenerated by the grammar

φ, ψ ::= χ ∈ QF1(Σ) | ∀x[ψ] | ∃x[ψ]

A formula is in prenex normal form if all the quantifiers occur-ring in the formula appear as a prefix

−→Qx (called the prenex)of a body that is “quantifier-free” and consists only of atomicpredicates with propositional connectives.




The Prenex Normal Form TheoremTheorem 25.4 (Prenex Normal Forms). For any formula φ thereexists a logically equivalent formula ψ in prenex normal form(PNF).

2




Proof of theorem 25.4Proof: Given a formula φ we go through the following steps.

1. Replace all subformulas of the form θ ↔ χ by subformulas

(θ → χ) ∧ (χ→ θ)

respectively to yield a new formula φ′ which is free of all occurrences of the connective↔.

2. Use α-conversion to obtain unique names for all bound and free variables12.

3. Now proceed by induction on the structure of φ′ by systematically applying the results obtainedfrom lemma 25.1 and corollary 25.2. This would yield a formula ψ in prenex normal form.

QED 12That is, ensure that no two quantifiers use the same bound variable and no variable occurs both free and bound in the formula.




Prenex Conjunctive Normal FormGiven a formula in prenex normal form, its body consists en-tirely of propositional connectives atomic predicates. By theo-rem 5.10 every propositional form may be converted into CNF.We may apply the same method to the body of a formula inPNF to obtain a Prenex Conjunctive Normal Form (PCNF). Sowe haveCorollary 25.5 (PCNF). For any formula φ there exists a logi-cally equivalent formula ψ in prenex conjunctive normal form(PCNF).




The Herbrand AlgebraDefinition 25.6 Let Σ be a signature containing at least oneconstant symbol a. A term t ∈ T(Σ) is said to be ground ifV ar(t) = ∅. T0(Σ) ⊆ T(Σ) is the set of ground terms (also calledthe Herbrand Universe). A literal p(t1, . . . , tn) or ¬p(t1, . . . , tn)containing no variables is called a ground literal.Definition 25.7 A Σ-algebra H(Σ) where Σ has at least oneconstant symbol, is called a Herbrand algebra iff |H(Σ)| =T0(Σ).




Terms in a Herbrand AlgebraIn a Herbrand algebra H(Σ)

• every function symbol represents itself. That is for each f :sm −→ s ∈ Σ, fH(Σ) = f

• a valuation is simply a function vH(Σ) : V −→ T0(Σ)

•However the predicate symbols require to be associated withrelations on ground terms in some way.




Herbrand InterpretationsLemma 25.8 Given a Herbrand interpretation (H, vH) where foreach variable x, v(x) = sx ∈ T0(Σ). For any term t with V ar(t) =x1, . . . , xk

VHJtKvH = sx1/x1, . . . , sxk/xkt

That is every valuation defines a substitution of variables byground terms.

Here valuations represent merely substitutions on terms.




Herbrand ModelsDefinition 25.9 A Herbrand model of a set Φ of Σ-formulae ismerely a valuation vH such that every formula in Φ is true underthe substitution defined by vH FV (Φ).




Ground Quantifier-free FormulaeTheorem 25.10 Let Σ be a signature containing at least oneconstant and let Λ = λ1, . . . , λk be a nonempty set of groundliterals. Then1.

∧1≤i≤k

λi has a model iff Λ does not contain a complementary

pair.

2.∧

1≤i≤kλi is never logically valid

3.∨

1≤i≤kλi always has a model

4.∨

1≤i≤kλi is logically valid iff it has a complementary pair.

2





Proof:

1. Clearly if Λ contains a complementary pair it does not have a model. Conversely assume it doesnot contain a complementary pair. We may define a Herbrand algebra HΛ as follows: For eachatomic predicate symbol p : sn define pHΛ

= (t1, . . . , tn) ∈ T0(Σ) | p(t1, . . . , tn) ∈ ΛClearly HΛ λi for each λi ∈ Λ since if λi ≡ p(t1, . . . , tn) and then p(t1, . . . , tn) ∈ Λ and(t1, . . . , tn) ∈ pHΛ

. On the other hand if λi ≡ ¬p(t1, . . . , tn) then p(t1, . . . , tn) 6∈ Λ and hence(t1, . . . , tn) 6∈ pHΛ

otherwise it would contradict the assumption that Λ contains no complemen-tary pair. Hence HΛ λi for each λi. Hence

∧1≤i≤k

λi has a model.

2.∧

1≤i≤k

λi cannot be valid since from the previous part we know that λi has a model, where λi is

the complement of λi.

3.∨

1≤i≤k

λi has a model because λi has a model.




4.∨

1≤i≤k

λi is valid iff∧

1≤i≤k

λi has no model iff Λ = λi | 1 ≤ i ≤ k contains a complementary

pair iff Λ contains a complementary pair.

QED







26. Lecture 26: Skolemization

Lecture 26: SkolemizationTuesday 04 Oct 2011




1. Skolemization

2. Skolem Normal Forms

3. SCNF

4. Ground Instance

5. Herbrand’s Theorem

6. The Herbrand Tree of Interpretations

7. Compactness of Sets of Ground Formulae

8. Compactness of Closed Formulae

9. The Lowenheim-Skolem Theorem




SkolemizationTheorem 26.1 (Skolem Normal Form Theorem.) Let φ ≡−→∀x∃y[ψ] where x = x1, . . . , xn and y are all distinct variablesand ψ does not contain any occurrence of any of the quanti-fiers Qxi. Let Σg = Σ ∪ g : sn −→ s be an expansion of thesignature Σ. Then

1. every model of φ′ ≡−→∀x[g(x1, . . . , xn)/yψ] is a model of φ.

2. every model of φ can be expanded to a model of φ′.

2

Corollary 26.2 Let φ and φ′ be as in theorem 26.1. Then1. there exists a model of φ iff there exists a model of φ′.2. φ is unsatisfiable iff φ′ is unsatisfiable.





Proof: We first of all note that g(x1, . . . , xn)/y is admissible in ψ.

1. We have|= φ′ → φ

and hence for every Σg-model B φ′, B φ holds too.

2. Conversely for every Σ-structure A, such that A φ, for each (a1, . . . , an) ∈ An there exists atleast one element a ∈ |A| such that

T JψKv[x1:=a1,...,xn:=an][y:=a] = 1

Define a function g which for each n-tuple (a1, . . . , an) ∈ |An| provides a single element a ∈ |A|such that

T JψKv[x1:=a1,...,xn:=an][y:=a] = 1

Then clearly for every (a1, . . . , an) ∈ |An| we have

T JψKv[x1:=a1,...,xn:=an][y:=gA(a1,...,an)] = 1

Now let AB where B is a Σg-algebra with gB = g. It is then clear that for every valuation vB,

T Jg(x1, . . . , xn)/yψKvB = 1




and hence B φ′.

QED




Skolem Normal FormsDefinition 26.31. The set SNF1(Σ) of Skolem normal forms is the set of univer-

sal closures of quantifier-free formulae. The quantifier-freeformula is called the body of the SNF.

2. A formula is in Skolem conjunctive normal form (SCNF) if itis a SNF whose body is in CNF. SCNF1(Σ) is the set of Σ-formulae in SCNF.




SCNFTheorem 26.4 For every sentence (closed formula) φ ∈ P1(Σ)there is an algorithm sko to construct a closed universal formulaψ ∈ SCNF1(Σ) such that φ has a model iff ψ has a model.

2

Definition 26.51. The function g in theorem 26.1 is called a Skolem function2. The process of constructing the function g in theorem 26.1 is

called Skolemization.3. φ and sko(φ) ≡ ψ are said to be equi-satisfiable.





Proof: The procedure may be briefly outlined as follows.

1. Clearly if φ is a closed formula, by theorem 25.5 we can construct a logically equivalent formulaφ0 ∈ PCNF1(Σ).

2. If there is no existential quantifier φ0 then the formula φ0 is in SCNF. Otherwise Skolemize theleft-most occurrence of an existential quantifer to obtain a formula φ1.

3. φ1 has one existential quantifier less than φ0. Perform step 2 on φ1.

Each execution of step 2 of the above procedure results in a formula which satisfies the conclusionsof theorem 26.1. QED

Exercise 26.1

1. Prove that |= φ′ → φ in theorem 26.1.

2. Prove that ` φ′ → φ in theorem 26.1.

3. Skolemization does not produce a SNF that is unique upto logical equivalence. Construct an




example of a formula φ which has two (or even more) different SNFS, φ′, φ′′ such that

φ 6⇔ φ′ 6⇔ φ′′ 6⇔ φ




Ground InstanceDefinition 26.6 Let Σ be a signature containing at leastone constant and let Φ be a nonempty set of closeduniversal Σ-formulae. For any φ ≡ ~∀[χ] where χ ∈QF1(Σ), the ground-instances of φ denoted g(φ) is the sett1/x1, · · · , tn/xnχ | FV (χ) = x1, . . . , xn, t1, . . . , tn ∈ T0(Σ)and g(Φ) =

⋃φ∈Φ

g(φ).




Herbrand’s TheoremTheorem 26.7 (Herbrand’s Theorem). Let Σ and Φ be as indefinition 26.6. Then the following statements are equivalent.1. Φ has a model.2. Φ has a Herbrand model.3. g(Φ) has a model.4. g(Φ) has a Herbrand model.

2





Proof: Clearly the following implications are trivial.

Statement 2: ”Φ has a Herbrand model” ⇒ Statement 1: ”Φ has a model”⇓ ⇓

Statement 4: ”g(Φ) has a Herbrand model” ⇒ Statement 3: ”g(Φ) has a model”

It suffices therefore to prove only the following claim.

Claim. Statement 3⇒ Statement 2.

Proof of claim. Let A g(Φ). We define a Herbrand interpretation H as follows. For each p : sn ∈Σ let

pH = (t1, . . . , tn) ∈ T0(Σ) | A p(t1, . . . , tn)In particular if p is an atomic proposition then pH = pA. With this construction exactly the sameatomic formulae are valid in both A and H. This result may be extended by structural inductionto arbitrary universally closed quantifier-free formulae. It is then easy to see that if A g(Φ) thenH Φ. QED




The Herbrand Tree of InterpretationsLet Σ be a signature containing at least one constant symbol.Let P0, P1, P2, . . . be an enumeration of all the ground atomic for-mulae of P1(Σ). The Herbrand Tree of interpretations is the infi-nite tree shown schematically below. Each infinite path (calleda Herbrand Base) of the tree represents a Herbrand interpreta-tion.




P1¬P1 ¬P1

¬P2¬P2 ¬P2P2 ¬P2 P2

P2

P0 ¬P0

P1

P2

Figure 8: The Herbrand tree of interpretations




Compactness of Sets of GroundFormulae

Lemma 26.8 (Compactness of a set of closed quantifier-free for-mulae). Let Θ be a (finite or infinite) set of ground quantifier-freeformulae. Then Θ has a model iff every finite subset of Θ has amodel.

2




Proof of lemma 26.8

Proof: (⇒) Clearly if Θ has a model then every finite subset of Θ also has a model.

(⇐) Assume every finite subset of Θ has a model but Θ itself does not have a model. By theorem26.7 each finite subset of Θ has a Herbrand model. We identify each path π in the Herbrand treewith a valuation vπ. Then since Θ does not have a model, it does not have a Herbrand model. Hencefor every path π there exists a formula χπ ∈ Θ such that (H, vπ) 6 χπ. In fact there exists a finitepoint `χπ in each path π at which (H, vπ) 6 χπ since χπ is made up of only a finite number of groundatoms.

Claim. χπ | (H, vπ) 6 χπ is a finite set.

Proof of claim.

Consider the tree TH obtained from the Herbrand tree such that from each path π the subtree rootedat `χπ + 1 has been removed. Hence TH is a finitely branching tree with only finite-length paths.Hence by (the contra-positive of) Konig’s lemma (lemma 2.17: any finitely-branching tree with onlyfinite-length paths must be finite) TH must be a finite tree where each path π′ is an initial segment ofan infinite path π from the Herbrand tree of interpretations. The leaf-nodes of each of these paths π′

determines a formula χπ that is not satisfied. Clearly then the set consisting of these formulae viz.




χπ | (H, vπ) 6 χπ is then a finite set.

End of proof of claim.

It is clear that the finite set χπ | (H, vπ) 6 χπ is a finite subset of Θ that does not possess aHerbrand model, contradicting the assumption that all finite subsets of Θ possess a Herbrand model.QED




Compactness of Closed FormulaeTheorem 26.9 (Compactness of closed formulae) A set Φ ofclosed Σ-formulae has a model iff every finite subset of Φ hasa model.

2

Corollary 26.10 (Finite Unsatisfiability). A set Φ of closed Σ-formulae is unsatisfiable iff there is a non-empty finite unsatis-fiable subset of Φ.





Proof: (⇒) is trivial.

(⇐) Assume Φ does not possess a model but every finite subset of Φ has a model. Transform eachformula into SNF. Since Φ has no model sko(Φ) = sko(φ) | φ ∈ Φ has no model either (bytheorem 26.4). By Herbrand’s theorem (theorem 26.7) the set g(sko(Φ)) also does not possess amodel. By lemma 26.8 we can find a finite subset of g(sko(Φ)) which does not have a Herbrandmodel. This finite set is a subset of a finite subset of Φ that does not possess a model. Hence there isa finite subset of Φ which does not have model, contradicting the assumption that every finite subsetof Φ has a model. QED




The Lowenheim-Skolem TheoremTheorem 26.11 (The Lowenheim-Skolem Theorem). If a set Φof closed formulae has a model, then it has a model with adomain which is at most countable.

Proof: Assume Φ has a model. Then sko(Φ) has a modeltoo. By theorem 26.7 sko(Φ) has a Herbrand model. Since aHerbrand model has a domain which is at most countable andsince every model of sko(Φ) is also a model of Φ, it follows thatΦ has a model with at most a countable domain. QED










27. Lecture 27: Substitutions and Instantiations

Lecture 27: Substitutions andInstantiations

Tuesday 11 October 2011




1. Substitutions Revisited

2. Some Simple Facts

3. Ground Substitutions

4. Composition of Substitutions

5. Substitutions: A Monoid




Substitutions RevisitedWe have defined substitutions and instantiations earlier. In lightof the Lowenheim-Skolem theorem 26.11, we1. require more powerful operations on syntactic substitutions

to exploit the construction of Herbrand models,2. need to extend the theory of substitutions to include a com-

position operator for substitutions.3. need to give a programing interpretation to First-order logic.




Some Simple FactsDefinition 27.11. 1 is the empty or identity substitution which replaces no vari-

able when applied to any term and dom(1) = ∅ = ran(1).2. SΩ(V ) is the class of all substitutions over a set of variablesV for a given signature Ω.

Fact 27.2 Let θ be a substitution and t a term. Then1. depth(θt) ≥ depth(t)

2. size(θt) ≥ size(t).




Ground SubstitutionsDefinition 27.3• θ = ti/xi | ti 6≡ xi, 1 ≤ i ≤ n is a ground substitution if eachti, 1 ≤ i ≤ n, is a ground term.• A term u is called an instance of a term t if there exists a

substitution θ such that u ≡ θt.• u is a ground instance of t if u is an instance of t and is ground.• u is a common instance of two or more terms t1, . . . , tn if there

exist substitutions θ1, . . . , θn such that

u ≡ θ1t1 ≡ · · · ≡ θntn

• Terms t and u are called variants of each other if there existsubstitutions θ and τ such that θt ≡ u and τu = t.




Composing Substitutions

We will often require to perform substitutions in sequence i.e. it may be necessary to first apply a substitution θ on aterm t yielding a term θt to which another substitution τ may be applied to yield a term τ(θt). We would like to answerthe question of how to define a single substitution ρ such that for every term u,

τ(θu) ≡ ρu (20)

Then ρ is the composition of τ with θ. Before presenting the formal definition of composition we try to understandhow such a composition must be defined to ensure that equation (20) holds. Let θ = s1/x1, · · · , sk/xk and τ =t1/y1, · · · , tm/ym. We have dom(θ) = X = x1, · · · , xk and dom(τ) = Y = y1, · · · , ym. The effect of θ onany term u is to replace each free occurrence of each variable xi by the term si simultaneously for 1 ≤ i ≤ k. Theterms si could contain (free) variables drawn from X and Y . It could also happen that some of the terms si may simplybe variables themselves. Consider a single variable xi. If si ≡ z for some variable z, then θu would simply have zoccurring free in all those positions of u where xi occurs free. Of course, free occurrences of xi could be present in θubecause of some other variable substitution (say si′/xi′ for some i′ 6= i). Hence it is clear that all free occurrences of anyx ∈ X in θt are due to the application of the substitution θ. Further, for any yj ∈ Y , we have the following possibilities.

1. Case yj ∈ Y −X and yj ∈ FV (u). All such free occurrences of yj in u will be present in the same positions in θuas well. The effect of τ would be to replace them all with tj.

2. Case yj ∈ Y −X and yj 6∈ FV (u). New free occurrences may arise due to the substitution θ. The effect of theapplication of τ will replace all of them by tj.




3. Case yj ≡ xi for some xi ∈ X . In this case the only free occurrences of yj possible are those which occur afterapplying θ.

To summarise

1. Case 1 requires τ to be applied separately.

2. The effect of τ on cases 2 and 3 may be captured by applying τ to the range of θ. Once that is done one may evenremove the element tj/yj from the substitution, since it would have no effect.

3. Further all elements such that τsi ≡ xi are removed from ρ, since we are interested in specifying the substitution asa finite set of non-identical replacements.

With this understanding we are ready to tackle our definition of composition.




Composition of SubstitutionsDefinition 27.4 Given substitutions θ = s1/x1, · · · , sk/xk andτ = t1/y1, · · · , tm/ym, their composition τ θ is a new substi-tution ρ such that

ρ = τsi/xi | 1 ≤ i ≤ k, τsi 6≡ xi∪tj/yj | 1 ≤ j ≤ m, yj 6∈ dom(θ)




Substitutions: A MonoidLemma 27.5 Given substitutions θ, τ , ρ and a term t, we have1. θ 1 = 1 θ = θ

2. (τ θ)t ≡ τ (θt)

3. ρ (τ θ) = (ρ τ ) θ2




Proof of lemma 27.5

Proof: We assume θ = s1/x1, · · · , sk/xk and τ = t1/y1, · · · , tm/ym and ρ = τ θ as indefinition 27.4. Then

1. Trivial.

2. We prove this by induction on the structure of terms. The case of constants is trivial. Theinduction case will also follow once the cases of simple variables has been proven. So wesimply prove this case for simple variables. For any variable x we have the following cases.Case x 6∈ dom(θ). Then clearly (θx) ≡ x and τ (θx) ≡ τx. Since the first component of theunion in the definition of ρ does not apply, we have ρx ≡ τx.Case x ≡ xi ∈ dom(θ) for some i, 1 ≤ i ≤ m. In this case ρxi ≡ τsi and since θxi ≡ si we haveτ (θxi) ≡ τsi.

3. For any term t we have from the previous proof

(ρ (τ θ))t ≡ ρ((τ θ)t) ≡ ρ(τ (θt)) ≡ (ρ τ )(θt) ≡ ((ρ τ ) θ)t

QED

Exercise 27.1




1. Let u ≡ θt. Give examples of t, u and θ such that FV (t) 6= ∅, u is ground but θ is not a groundsubstitution.

2. Prove that for any substitutions θ and τ , τ θ = τ ∪ θ iff dom(θ) ∩ dom(τ ) = ∅ and dom(τ ) ∩⋃t∈ran(θ) FV (t) = ∅

3. A substitution θ is called idempotent if θ θ = θ. Now complete the statement of the followinglemma and prove it.

Lemma 27.6 A substitution θ = ti/xi | 1 ≤ i ≤ m for some m ≥ 0 is idempotent iffdom(θ) · · · .







28. Lecture 28: Unification

Lecture 28: UnificationWednesday 12 October 2011




1. Unifiability

2. Unification Examples:1

3. Unification Examples:2

4. Generality of Unifiers

5. Generality: Facts

6. Most General Unifiers

7. More on Positions

8. Disagreement Set

9. Example: Disagreement 1

10. Example: Occurs Check



13. Disagreement and Unifiability

14. The Unification Theorem




Syntactic unification is the problem of finding substitutions θ so as to make two or more termssyntactically identical. It may be thought of as a special form of equation solving where one attemptsto find solutions to the problem s ≡ t by finding suitable instances of the variables in the two termsin order to make the two terms look identical. The solution of such an equation on essentiallyuninterpreted terms is a substitution and the process of finding this solution is called unification. Asin normal equation solving the substitution is to be applied to all the terms that have to be unified.Moreover as in equation solving, it is possible that no solution exists. A set consisting of two ormore terms is said to be unifiable if such a substitution exists.

We will use words like “occurrence”, “sub-term”, “depth”, “size” quite liberally. In the light ofthe presence of several occurrences of operators, free variables and bound variables (including dif-ferent bound variable occurrences signifying different variables but possessing the same name e.g.(λx[(xx)] λx[(xx)])) in a term, it is useful to define a unique position for each symbol in a term t.For any term t we have a set of strings Pos(t) ⊆ N∗ which is the set of positions occurring in t.Further for each p ∈ Pos(t), there is a unique symbol occurring at that position denoted by pos(p, t).

Definition 28.1




t depth size ST pos

c 1 1 t εx 1 1 t εo(t1, . . . , tn) 1 + Maxni=1depth(ti) 1 +

∑ni=1 size(ti) t ∪

⋃ni=1 ST (ti) ε ∪

⋃ni=1 i.pos(ti)

• The functions given in the table above are defined by induction on the structure of term t. ε is theempty word on strings, . is the catenation operator on strings and i.pos(ti) = i.p | p ∈ pos(ti).

• s v t iff s ∈ ST (t) is the subterm relation on terms. s is a proper subterm of t (denoted s @ t)iff s v t and s 6≡ t.

• For any t, the subterm at position p ∈ pos(t) is denoted t|p and defined by induction on p asfollows: t|ε ≡ t, and for t ≡ o(t1, . . . , tn), t|i.p′ ≡ ti|p if p = i.p′ ∈ pos(t)

• For any term t and any position p ∈ pos(t), sym(p, t) yields the symbol at position p in the termt.

• The position ε is called the root position and the symbol at the root position is called theroot symbol. Hence rootsym(t) = sym(ε, t) and for any position p ∈ pos(t), sym(p, t) =

rootsym(t|p).




• The set of occurrences of a symbol σ ∈ Σ∪V in a term is defined as the setOcc(σ, t) of positionsin which that symbol occurs i.e. Occ(σ, t) = p ∈ pos(t) | sym(p, t) = σ.

Facts 28.2 For any term t and positions p, q ∈ pos(t), t|q @ t|p iff p ≺ q.




UnifiabilityDefinition 28.3 A nonempty finite set of terms ti | 1 ≤ i ≤ n,n > 1 is said to be unifiable if there exists a substitution θ suchthat

θt1 ≡ θt2 ≡ · · · ≡ θtn

θ is called a unifier of ti | 1 ≤ i ≤ n




Unification Examples:1Example 28.4 Let f and g be distinct binary operators andx, y, v, w ∈ V .1. The terms f (x, y) and f (v, w) may be unified by the substitu-

tion θ = x/v, y/w since θf (v, w) ≡ f (x, y) ≡ θf (x, y). Theymay also be unified by θ−1 = v/x, w/y.

2. Let r, s, t be any three terms. Then f (x, y)and f (v, w) may be unified by the substitutionτ = g(s, t)/v, f (r, r)/w, g(s, t)/x, f (r, r)/y.

3. The terms f (x, y) and f (y, x) may be unified by χ = x/ysince χf (x, y) ≡ f (x, x) ≡ χf (y, x).




Unification Examples:2Example 28.5 Let f and g be distinct binary operators.1. The terms f (x, y) and g(x, y) cannot be unified by any substi-

tution.2. The terms f (x, y) and f (y, x) cannot be unified by ρ =x/y, y/x since ρf (x, y) ≡ f (y, x) and ρf (y, x) ≡ f (x, y).Hence ρf (x, y) 6≡ ρf (y, x).




The following facts are easy to prove and may be used without any mention of them in the sequel.

Fact 28.6 Let s and t be any two terms and let p ∈ pos = pos(s) ∩ pos(t). Then

1. If s|p ≡ t|p for any position p ∈ pos then for all positions p, q ∈ pos, p q implies s|q ≡ t|q.

2. If rootsym(s|p) = o1 6≡ o2 = rootsym(t|p) then s and t are not unifiable under any substitution.

3. If s and t are unifiable then for every position p ∈ pos, rootsym(s|p) 6≡ rootsym(t|p) implies atleast one of the symbols is a variable i.e. rootsym(s|p), rootsym(t|p) ∩ V 6= ∅

Exercise 28.1

1. Generalize the fact 28.6 to nonempty finite sets of terms.

2. Construct an example to show that the converse of fact 28.6.3 does not hold.




Generality of UnifiersThere is a certain sense in which θ may be regarded as beingmore general than τ in example 28.4.

Definition 28.7• A substitution θ is at least as general as another substitutionτ (denoted θ & τ ) if there exists a substitution χ such thatτ = χ θ.• θ ∼ τ if θ & τ & θ.• θ is strictly more general than τ (denoted θ τ ) if θ & τ andτ 6& θ.




Generality: FactsFact 28.81.& is a preordering relation on SΩ(V ) i.e. it is a reflexive and

transitive relation.2. is an irreflexive and transitive relation on SΩ(V ).3.∼ is an equivalence relation on SΩ(V ).




Most General UnifiersDefinition 28.9 Let T = ti | 1 ≤ i ≤ n be a unifiable setof terms. A substitution θ is called a most general unifier (mgu)of T if for each unifier τ of T , there exists a substitution ρ suchthat τ = ρ θ.Fact 28.101. If a set of terms T is unifiable then it has a mgu.2. If θ and τ are both mgu’s of a set T then θ ∼ τ .3. If θ and τ are both mgu’s of a set T then there exist (pure-

variable) substitutions ρ, ρ−1 : V −→ V such that ρ θ = τand θ = ρ−1 τ




More on PositionsFor any nonempty set of terms T , we have

Pos(T ) =⋂t∈Tpos(t) 6= ∅

and for any position p ∈ Pos(T ),

T |p = t|p | t ∈ T

androotsym(T |p) = rootsym(t|p) | t ∈ T




Disagreement SetDefinition 28.11 Given a set T (|T | > 1) of terms (also viewedas a set of abstract syntax trees), the disagreement set of T isdefined as the set T |q of subterms rooted at some position qsuch that1. not all the terms in T |q have the same root symbol and2. for every p ≺ q, |rootsym(T |p)| = 1, where ≺ is the proper-

prefix relation on strings.




We have seen that pos(t) for any term t is partially ordered by the relation≺which inverts the propersub-term ordering on ST (t) (Fact 28.2). For the purpose of specifying the unification algorithm, itis useful to define a total order < on the positions of terms which is consistent with ≺. Intuitively ifu = o(t1, t2 . . . , tn) we would like to specify recursively that

• the root position u precedes the root positions of all the subterms t1, . . . , tn. (which is taken careof by the prefix ordering ≺ on positions) i.e. ε < i for all 1 ≤ i ≤ n

• for each i, j such that 1 ≤ i < j ≤ n, the position of the root of ti precedes that of tj in the totalordering.

• If i < j, then the position of the root of any proper subterm of ti precedes the position of anysubterm of tj (including the root).

Definition 28.12 For any positions p, q ∈ pos(t), p < q iff one of the following conditions holds.

• p = ε 6= q or

• (p = i.p′, q = i.q′ and p′ < q′) or

• (p = i.p′, q = j.q′ and i < j).




If all operators in Ω are always used in prefix form then each term may also be regarded as a stringin (Σ ∪ (, ))∗. The ordering < on pos(t) simply becomes the left-to-right ordering of symbols inthe well-formed terms of TΩ(V ) represented as strings.




Algorithm: Computing a Disagreement SetRequire: |T | > 1 At least two different terms

1: DISAGREEMENT(T )df= DISAGREE(T, ε, Pos(T )− ε) where

2: Pos(T ) =⋂t∈T Pos(t) At least ε ∈ Pos(T ) and

3: DISAGREE(T, p, P )df=

4: if |rootsym(T |p)| = 1 then5: if P 6= ∅ then6: let p′ = Min(P );P ′ = P − p′ in7: DISAGREE(T, p′, P ′)8: end let9: else There is no disagreement

10: return fail




11: end if12: else A disagreement has been found at position p13: return T |p14: end if

The function Min used in the algorithm above is the minimum position with respect to the totalordering < on positions (definition 28.12) in a term.




Example: Disagreement 1Example 28.13 Consider the set of terms

S1 = f (a, x, h(g(z))), f (z, h(y), h(y))

where a is a constant, f is a ternary operator and g and h areunary operators. In this case, reading the terms from left toright we get a disagreement set D1 = a, z. On the otherhand, reading from right to left we obtain the disagreement setD′1 = g(z), y which requires going down one level deeper.

The algorithm however will compute the leftmost disagreementD1 always.




Example: Occurs CheckExample 28.14 Consider the set

S2 = f (g(z), x, h(g(z))), f (z, h(y), h(y))

The disagreement set g(z), z is such that S2 is not unifiable,because for any substitution θ of θz can never be syntacticallyidentical with θg(z). This is an example of the notorious occurscheck problem. Hence S2 is not unifiable.




Example: Disagreement 3Example 28.15 Consider the set

S3 = f (a, x, h(g(z))), f (b, h(y), h(y))

where a and b are both constant symbols. Here a disagreementset is D3 = a, b. Again it is clear that S3 is not unifiable.




Example: Disagreement 4Example 28.16 Consider the set

S4 = f (h(z), x, h(g(z))), f (g(x), h(y), h(y))

Here we have a disagreement set D4 = h(z), g(x). Sinceh(z) cannot be unified with g(x) under any substitution of freevariables, S4 is not unifiable.




Disagreement and UnifiabilityFact 28.17 If S′ is the disagreement set of S then1. S is unifiable implies S′ is unifiable.2. If S is unifiable and θ′ is a mgu of S′ then there exists a mguθ of S such that θ′ & θ.




The above facts reduce the problem of finding a unifier if it exists, to that of systematically findingdisagreement sets and unifying them.

Finding a unifier for a disagreement set is a pre-requisite for finding a unifier for the original set ofterms. A disagreement set consists of subterms of the original set of terms at a particular positionsuch that at least two distinct (sub-)terms exist in the set. Further a disagreement set is unifiableonly if there is at most one non-variable term in it. By choosing a substitution t/x where both tand x are terms in the disagreement set satisfying the condition x 6∈ FV (t), there is a possibility ofunifying the disagreement set. The algorithm constructs a sequence of singleton substitutions whosecomposition yields a most general unifier if it exists.




Algorithm: UnificationRequire: S ⊆f TΩ(V ) and |S| > 1Ensure: If S is not unifiable then fail else ∃θ ∈ SΩ(V ) : |θS| = 1

and θ is a mgu of S

1: UNIFY(S)df= PARTIALUNIFY(1, S) where

2: PARTIALUNIFY(θ, S)df=

3: let T = θS in4: if |T | = 1 then5: return θ θ is a mgu of S6: else There is a position at which at least two terms are differ-

ent7: let D = DISAGREEMENT(T ) in




8: if ∃x ∈ D ∩ V : ∀t ∈ T [x 6∈ FV (t)] then9: Choose t ∈ T : x 6∈ FV (t)

10: PARTIALUNIFY(t/x θ, S)T = θS ∧ |t/xT | < |T ||(t/x θ)S| < |θS| ≤ |S|

11: else Occurs check fails so S is not unifiable12: return fail13: end if14: end let15: end if16: end let




Example 28.18 Consider the set S = S1 in example 28.13. Starting with θ0 = 1 we go through thefollowing steps to obtain a unifier of S.

i θi θiS Di

0 θ0 = 1 θ0S = f (a, x, h(g(z))), f (z, h(y), h(y) D0 = a, z1 θ1 = a/z θ0 θ1S = f (a, x, h(g(z))), f (a, h(y), h(y) D1 = x, h(y)2 θ2 = h(y)/x θ1 θ2S = f (a, h(y), h(g(z))), f (a, h(y), h(y) D2 = g(z), y3 θ3 = g(z)/y θ2 θ3S = f (a, h(g(z)), h(g(z)) D3 = ∅

Hence the required unifier is θ3 = g(z)/y h(y)/x a/z 1 = a/z, h(g(z))/x, g(z)/y.

Example 28.19 Let S = f (y, z, w), f (g(x, x), g(y, y), g(z, z)) where f is a ternary operator andg is a binary operator. An attempt to apply the algorithm yields the following sequence of substitu-tions: θ1 = g(x, x)/y from which we get θ1S = f (g(x, x), z, w), f (g(x, x), g(g(x, x), g(x, x)), g(z, z))and then θ2 = g(g(x, x), g(x, x))/z which yields

θ2S = f (g(x, x), g(g(x, x), g(x, x)), w),

f (g(x, x), g(g(x, x), g(x, x)), g(g(g(x, x), g(x, x)), g(g(x, x), g(x, x)))) and finallyθ3 = g(g(g(x, x), g(x, x)), g(g(x, x), g(x, x)))/w θ2




Hence in general there are pathological cases which make the algorithm very expensive to run,having a complexity that is exponential in the length of the input i.e. to unify the set

f (x1, . . . , xn), f (g(x0, x0), · · · , g(xn−1, xn−1))

would require a substitution that has 2k − 1 occurrences of the symbol g in the substitution of thevariable xk.




The Unification TheoremTheorem 28.20 (The Unification Theorem) The unification algo-rithm terminates satisfying its postcondition (If S is not unifiablethen fail else ∃θ ∈ SΩ(V ) : |θS| = 1 and θ is a mgu of S) forany set of terms satisfying its preconditions (S ⊆f TΩ(V ) and|S| > 1).

2





Proof:

Claim. Termination` Let V0 =

⋃s∈S FV (s) be the (finite) set of free variables occurring in S. With each execu-

tion of line 8 in the unification algorithm either it terminates because it fails or a substitutiont/x such that x 6∈ FV (t) is generated with θk+1 = t/x θk, we have θk+1S has onevariable less than θkS. Since V0 is finite the algorithm must terminate. a

Claim. If S is not unifiable then it terminates returning fail.` Trivial. a

Claim. If S is unifiable then it terminates returning a most general unifier θ` Let ρ be any unifier of S, and let 1 = θ0, θ1, · · · , θk = θ be the sequence of substitutionsgenerated by the algorithm. We prove by induction that for every θi, there exists a substitutionτi such that ρ = τi θi.Basis. For i = 0 clearly τ0 = ρ.





Assume for some j, 0 ≤ j < k, there exists τj such that ρ = τj θj.

Induction Step. Clearly since θjS is not a singleton, a disagreement set Dj will be found forθjS. Since ρ = τj θj is a unifier of S, clearly τj must unify Dj, which means there existsa variable x and a term t with x 6∈ FV (t) such that τj unifies Dj which in effect impliesτjx = τjt. Without loss of generality we may assume t/x is the chosen substitution sothat θj+1 = t/x θj. Now define τj+1 = τj − τjx/x.Case x ∈ dom(τj). Then τj = τjx/x ∪ τj+1 = τjt/x ∪ τj+1. Since x 6∈ FV (t), wehave τj = τj+1t/x∪ τj+1 = τj+1 t/x by the definition of composition. Finally fromρ = τj θj we get ρ = τj+1 t/x θj = τj+1 θj+1.Case x 6∈ dom(τj). Then τj = τj+1 and each element of Dj is a variable and τj = τj+1 t/x. Thus ρ = τj θj = τj+1 t/x θj = τj+1 θj+1 as required.

Since for any unifier ρ of S there exists τk such that ρ = τk θk, θk must be a mgu of S. a

QED




Exercise 28.2

1. Generalize the facts 28.6 to a set S of terms where |S| ≥ 2.

2. Identify the relationships among the different substitutions θ, θ−1, τ , χ, ρ and ρ−1 in examples28.4 and 28.5

3. Let D be the disagreement set of S.

(a) Can |D| be different from |S|? Justify your answer.

(b) If S = D then under what conditions is S unifiable?

(c) If S 6= D then what can you say about the depths of terms in D as compared to the depths ofterms in S?

4. Construct an example of a set S of terms with disagreement set D in which there exist a variablex and a term t such that x ∈ FV (t) and yet the set S is unifiable.

5. Prove that if S is unifiable then the mgu computed by the unification algorithm is idempotent.







29. Lecture 29: Resolution in FOL

Lecture 29: Resolution in FOLFriday 14 October 2011




1. Recapitulation

2. SCNFs and Models

3. SCNFs and Unsatisfiability

4. Representing SCNFs

5. Clauses: Terminology

6. Clauses: Ground Instances

7. Facts about Clauses

8. Clauses: Models

9. Clauses: Herbrand’s Theorem

10. Resolution in FOL




Recapitulation1. For any set Φ ∪ ψ (where Φ may or may not be empty) of

closed Σ-formulae Φ |= ψ iff Φ ∪ ¬ψ is unsatisfiable.2. A non-empty set Φ of closed Σ-formulae is unsatisfiable iff it

contains a non-empty finite unsatisfiable subset.3. A set Φ of closed Σ-formulae has a model iff it has a Herbrand

model4. A non-empty finite set Φ of closed Σ-formulae is unsatisfiable

iff the formula ψ ≡∧φ∈Φ φ is unsatisfiable.




SCNFs and Models1. A closed Σ-formula ψ has a model iff the universally closed

Σ-formula sko(ψ) has a model.2. Every closed Σ-formula ψ may be transformed into an “equi-

satisfiable” (universally closed) formula in SCNF.3. A universally closed Σ-formula ψ in SNF has model iff g(ψ)

the ground-instances (see definition 26.6) of ψ has a model.




SCNFs and Unsatisfiability1. A closed Σ-formula ψ and the (universally closed) Σ-formulasko(ψ) are “equi-unsatisfiable”.

2. A universally closed Σ-formula ψ in SNF is unsatisfiable iffg(ψ) is unsatisfiable.

3. Since g(ψ) consists of only closed formulae, g(ψ) is unsatisfi-able iff there is a finite subset of g(ψ) which is unsatisfiable.




Representing SCNFsDefinition 29.1 Let the SCNF sko(ψ) be represented by a set

sko(ψ) = Ci | 1 ≤ i ≤ m

such thatsko(ψ) ≡ ~∀[

∧1≤i≤m

Ci]

where each (quantifier-free) conjunct Ci, 1 ≤ i ≤ m,

Ci ≡∨

1≤j≤ni

λj

is called a clause and is represented by a set

Ci = λj | 1 ≤ j ≤ ni

of literals.




Clauses: TerminologyDefinition 29.21. A clause is a finite set of literals.2. The empty clause is the empty set of literals ().3. A ground clause is a clause with no occurrences of variables.4. For any substitution θ, and clause C = λj | 1 ≤ j ≤ n,θC = θλj | 1 ≤ j ≤ n.

Compare with clauses in propositional logic




Clauses: Ground InstancesDefinition 29.31. The set of ground instances of a clause C is the set

g(C) = θC | θ is a ground substitution

2. For any set S = Ci | 1 ≤ i ≤ m of clauses, the set ofground instances of S is the set

g(S) =⋃C∈Sg(C)




Facts about ClausesLemma 29.4 Let Ci | 1 ≤ i ≤ m be a set of clauses. Then

~∀[∧

1≤i≤mCi]⇔

∧1≤i≤m

~∀[Ci]

Proof: Follows from the semantics of ∀ and ∧ or alternativelyfrom corollary 25.2. QED Notice that even if there are free variables common betweentwo clauses, this lemma holds, mainly because of the fact thatthere are no existential quantifiers. For example

∀x[p(x) ∧ q(x)]⇔ ∀x[p(x)] ∧ ∀y[q(y)]⇔ ∀x, y[p(x) ∧ q(y)]




Clauses: ModelsDefinition 29.51. A structure A is a model of a• clause C = λj | 1 ≤ j ≤ n (denoted A C) iff n > 0

and A ~∀[∨

1≤j≤n λj].• a set S of clauses (denoted A S) if it is a model of every

clause in S.2. S |= C iff every model of S is also a model of C.

Note: An empty clause has no models.




Clauses: Herbrand’s TheoremProposition 29.61. A set S of clauses possesses a model iff every finite subset

of S possesses a model.2. A set S of clauses is unsatisfiable iff there is a finite subsetS′ ⊆f S of clauses which is unsatisfiable iff g(S′) does notpossess a model.




Resolution in FOLCompare with resolution in propositional logic

Let S be a set of clauses, Ci, Cj ∈ S with i 6= j, FV (Ci) ∩FV (Cj) = ∅ and p an atomic predicate symbol such that

•Ci = C ′i ∪p( ~si′) | 1 ≤ i′ ≤ mi and Cj = C ′j ∪¬p( ~tj′) | 1 ≤j′ ≤ mj• L = p( ~si′) | 1 ≤ i′ ≤ mi ∪ p( ~tj′) | 1 ≤ j′ ≤ mj is a set of

unifiable literals.• µ = UNIFY(L) is an mgu of L•C ′ij = µ(C ′i ∪ C

′j) = (µC ′i) ∪ (µC ′j) is called the resolvent of Ci

and Cj.

Res1S

(S − Ci, Cj) ∪ C ′ij







30. Lecture 30: More on Resolution in FOL

Lecture 30: More on Resolution inFOL

Tuesday 18 October 2011




1. Standardizing Variables Apart

2. Factoring

3. Example: 1

4. Example: 2

5. Refutation




Standardizing Variables Apart1. The free variables of distinct clauses must be disjoint and vari-

ables should be renamed if necessary.

Example 30.1 Let S = C1, C2 where

C1 = p(x)C2 = ¬p(f (x))

S represents the set of closed formulae

∀x[p(x)],∀x[¬p(f (x))]

which is clearly unsatisfiable. However the empty clausecannot be derived (because of the occurs check) unless x isrenamed in one of the clauses.




Factoring2. Unlike in the case of propositional resolution it should be pos-

sible to eliminate several literals at once

Example 30.2 Consider the set S = C1, C2 where

C1 = p(x), p(y)C2 = ¬p(u),¬p(v)

S is clearly unsatisfiable but by removing only one literal at atime with the substitution x/u yields the new clause

C ′12 = p(y),¬p(v)

from which the empty clause cannot be derived.




Example: 1Example 30.3 Consider the set S = C1, C2 where

C1 = ¬q(x, y),¬q(y, z), q(x, z)≡ ∀x, y, z[(q(x, y) ∧ q(y, z))→ q(x, z)]

which represents transitivity and

C2 = ¬q(u, v), q(v, u)≡ ∀u, v[q(u, v)→ q(v, u)]

which represents symmetry.A logical consequence of these two properties is the propertyderived below.




C1 = ¬q(x, y),¬q(y, z), q(x, z), ¬q(u, v), q(v, u) = C2µ = z/u, y/v¬q(x, y), q(x, z),¬q(z, y) = C ′12

C ′12 = ¬q(x, y),¬q(z, y), q(x, z)≡ ∀x, y, z[(q(x, y) ∧ q(z, y))→ q(x, z)]




Example: 2Example 30.4 Suppose we need to prove that if a binary rela-tion p is reflexive

φp−reflexivedf= ∀x[p(x, x)]

and euclidean

φp−euclideandf= ∀x, y, z[(p(x, y) ∧ p(x, z))→ p(y, z)]

then it is also symmetric

φp−symmetricdf= ∀x, y[p(x, y)→ p(y, x)]




After renaming bound variables and converting into SCNF to get theclauses

C1 = p(x, x)C2 = ¬p(u, v),¬p(u,w), p(v, w)

The mgu µ = x/u, x/w yields the required clause

C ′12 = ¬p(x, v), p(v, x)




RefutationRefutation in propositional logic

Example 30.5 We could also prove symmetry in example 30.4by a refutation as follows. Taking the negation of the conclusionwe get

¬φp−symmetry⇔ ∃u, v[p(u, v) ∧ ¬p(v, u)](Sko) p(a, b) ∧ ¬p(b, a)≡ p(a, b), ¬p(b, a)df= C3, C4

where a and b are skolem constants.




C1 = p(x, x), C2 = ¬p(u, v),¬p(u,w), p(v, w), C3 = p(a, b), C4 = ¬p(b, a)µ = x/u, x/wC ′12 = ¬p(x, v), p(v, x), C3 = p(a, b), C4 = ¬p(b, a)

µ′ = a/x, b/vC ′124 = ¬p(a, b), , C3 = p(a, b), µ′′ = 1

Alternatively a proof starting with the clauses C3 and C4 works too.

C1 = p(x, x), C2 = ¬p(u, v),¬p(u,w), p(v, w), C3 = p(a, b), C4 = ¬p(b, a)θ1 = b/v, a/wC1 = p(x, x), C3 = p(a, b), C ′24 = ¬p(u, b),¬p(u, a)

θ2 = a/uC1 = p(x, x), C ′234 = ¬p(a, a)θ3 = a/x

Exercise 30.1

1. Prove the conclusion of example 30.3 by a refutation.

2. Are there any other unifiers by which symmetry may be proved in example 30.4?




3. Try a refutation proof using ν = x/u, x/w, x/v as the first unifier in example 30.5. Whydoesn’t it work?

4. Try a refutation proof using ν = x/u, x/w, y/v as the first unifier in example 30.5. Why doesit work?







31. Lecture 31: Resolution: Soundness and Completeness

Lecture 31: Resolution: Soundnessand CompletenessWednesday 19 October 2011




1. Soundness of FOL Resolution

2. Ground Clauses

3. The Lifting Lemma

4. Lifting Lemma: Figure

5. Completeness of Resolution Refutation: 1






Soundness of FOL ResolutionLemma 31.1 The resolvent C ′ij obtained by resolving theclauses Ci and Cj in the resolution method is a logical con-sequence of the set Ci, Cj.

2The following theorem then follows.

Theorem 31.2 If S′ is the set of clauses obtained by a singleapplication of the resolution rule Res1, then S |= S′.

Corollary 31.3 If the empty clause is derivable from a set S ofclauses, then S is unsatisfiable.




Proof of lemma 31.1

Proof: Assume

• Ci = C ′i ∪ p(~si′) | 1 ≤ i′ ≤ mi,• Cj = C ′j ∪ ¬p(~tj′) | 1 ≤ j′ ≤ mj,• FV (Ci) ∩ FV (Cj) = ∅ and

• L = p(~si′) | 1 ≤ i′ ≤ mi ∪ p(~tj′) | 1 ≤ j′ ≤ mj is a set of unifiable literals.

Let A Ci, Cj. Therefore

A ~∀[∨λi∈Ci

λi] (21)

A ~∀[∨λi∈Cj

λj] (22)

and for any substitution θ we have

A θ∨Ci (23)

A θ∨Cj (24)




If θ is a unifier of L and θL = λ we get

A ∨

(λ ∪ θC ′i) (25)

A ∨

(λ ∪ θC ′j) (26)

Let θC ′i = κi′ | 1 ≤ i′ ≤ k and θC ′j = λj′ | 1 ≤ j′ ≤ l. Then we have the following tablewhich shows a case analysis for the various values of k and l.

λ ∪ θC ′i ⇔ λ ∪ θC ′j ⇔ θC ′i ∪ θC ′j ⇔k = 0 = l λ λ k = 0, l > 0 λ λ→ (λ1 ∨ · · · ∨ λl) λ1 ∨ · · · ∨ λlk > 0, l = 0 λ→ (κ1 ∨ · · · ∨ κk) λ κ1 ∨ · · · ∨ κkk, l > 0 ¬(κ1 ∨ · · · ∨ κk)→ λ λ→ (λ1 ∨ · · · ∨ λl) ¬(κ1 ∨ · · · ∨ κk)→ (λ1 ∨ · · · ∨ λl)

It is easy to see that in each case

λ ∪ θC ′i, λ ∪ θC ′j |= θC ′i ∪ θC ′j (27)

It follows also from (21), (22) and (27) that A ~∀[∨C ′ij] and hence C ′ij is a logical consequence of

the set Ci, Cj. QED




Ground ClausesTheorem 31.4 (Completeness of Resolution Refutation forground clauses). Let G be a set of ground clauses. If G doesnot possess a model, the empty clause () may be derived byRes0.

2Here Res0 is the propositional resolution rule given by

Res0S

(S − Ci, Cj) ∪ C ′ij

where C ′ij = C ′i ∪C′j. Note that there is no substitution involved

anywhere since all clauses are ground.




Proof of theorem 31.4.

Proof: Let G = Ci | 1 ≤ i ≤ n, n > 0 be a set of n clauses. If ∈ G there is nothing toprove. So assume 6∈ G. Consider the following measure

#G = (∑

1≤i≤n|Ci|)− n

Clearly, #G = 0 iff every clause is made up of a single literal. We proceed to prove the theorem byinduction on #G.

Basis. #G = 0. Then each Ci = λi and G ≡∧

1≤i≤k

λi and by theorem 25.10, G is unsatisfiable iff

it contains a complementary pair. Clearly by rule Res0 the resolvent of this complementary pairis the empty clause.


For all k, 0 ≤ #G = k < m for some m > 0, if G does not possess a model, then the emptyclause is derivable from G.




Induction Step. Assume #G = m > 0. There must be at least one clause Ci which contains morethan one literal. So let Ci = λi ∪ Di with λi 6∈ Di 6= ∅. Let Gi1 = (G − Ci) ∪ Di andGi2 = (G − Ci) ∪ λi. Clearly #Gi1 < #G and #Gi2 < #G. Further if G does not have amodel then neither Gi1 nor Gi2 has a model (if either of them had a model then so would G sinceCi ≡ λi ∨

∨Di). By the induction hypothesis,

1. there exists a resolution proof R1 from Gi1 which derives the empty clause and2. there is another resolution proof R2 from Gi2 which also derives the empty clause.

Notice that since we are dealing only with ground literals, all resolvents are obtained by applyingthe rule Res0.Consider the proof R ′1 obtained from R1 by adding the literal λi to Gi1 and performing exactlythe same sequence of resolutions.

• Case 1. If the proof R1 did not involve the use of any of the literals from Di and the emptyclause was derived, then clearly the same sequence with λi included would also derive theempty clause and that completes the proof.• Case 2 On the other hand if one or more steps in proof R1 involved literals from Di then

the resulting proof R ′1 may derive the clause λi in place of the empty clause. Howeverwe do know that the empty clause is derived from Gi2 in proof R2. This implies there exist




resolution steps in R2 involving the literal λi which derive the empty clause. Therefore thereexists at least one clause containing the literal λi in the set of final clauses obtained in R ′1. Byapplying the resolution steps of R2 which do not appear anywhere in R ′1, the empty clausewould again be derived.

QED




The Lifting LemmaLemma 31.5 (Lifting Lemma). (see figure) Let C1 and C2 beclauses and let θ1, θ2, σ be substitutions such that• FV (C1) ∩ FV (C2) = ∅,• FV (θ1C1) ∩ FV (θ2C2) = ∅ and•C ′12 is the resolvent of θ1C1 and θ2C2 via a substitution σ, by

a single application of resolution.Then there exists a resolvent C12 of C1 and C2 by a single ap-plication of resolution via a substitution ρ and a substitution τsuch that C ′12 ≡ τC12.

2




Lifting Lemma: FigureC1 C2 C1 C2

C ′12

θ1C1 θ2C2

σ

ρ

C12

τ

C ′12

lifted to




Proof of lemma 31.5

Proof: LetC1 = C ′1 ∪ L1 where L1 = λi | 1 ≤ i ≤ m,m > 0C2 = C ′2 ∪ L2 where L2 = λj | 1 ≤ j ≤ n, n > 0

such that σ is a mgu of (θ1L1) ∪ (θ2L2) and C ′12 = σ((θ1C′1) ∪ (θ2C

′2)).

Since FV (C1) ∩ FV (C2) = ∅, dom(θ1) ∩ dom(θ2) = ∅ and since FV (θ1C1) ∩ FV (θ2C2) = ∅ wehave FV (ran(θ1)) ∩ FV (ran(θ2)) = ∅ and hence θ1C

′1 = (θ1 ∪ θ2)C ′1 and θ2C

′2 = (θ1 ∪ θ2)C ′2.

Since σ is a mgu of (θ1L1)∪(θ2L2) we have σ(θ1∪θ2) is a unifier of L1∪L2. If L1∪L2 is unifiable,then it has a most general unifier ρ & σ (θ1 ∪ θ2) such that C12 = ρ(C ′1 ∪C ′2) is the resolvent of C1

and C2.

ρ & σ (θ1 ∪ θ2) implies there exists a substitution τ such that

τ ρ = σ (θ1 ∪ θ2)

andC ′12 = τ (ρ(C ′1 ∪ C ′2)) = (σ (θ1 ∪ θ2))(C ′1 ∪ C ′2)

QED




A superposed version of the figure is shown below.

C1 C2

C ′12

τ12

θ1C1 θ2C2C12

ρ12

σ12




Completeness of ResolutionRefutation: 1

1. The lifting lemma helps us to use the completeness of reso-lution refutation for ground clauses and “lift” it to clauses withvariables.

2. By standardizing variables apart we may guarantee that theconditions of disjointness of free variables between differentclauses (lemma 31.5) may be enforced.

3. Any set of clauses S = Ci | 1 ≤ i ≤ m represents theconjunction of the universal closure of each clause.





1. The lifting lemma 31.5 guarantees that if the substitutions θ1and θ2 are ground, then there exists a corresponding groundsubstitution τ which produces the same effect after resolution.

2. By Herbrand’s theorem 26.7 a set Φ is unsatisfiable iff a finitesubset of ground instances of Φ is unsatisfiable.

3. To prove the completeness of resolution refutation it is suffi-cient to consider only the finite set of ground clauses from whichthe empty clause may be derived.





Theorem 31.6 (Completeness of Resolution Refutation). If a setΦ of clauses is unsatisfiable then the empty clause is derivablefrom Φ.

2





Proof: Without loss of generality we may assume that the variables in every clause are disjointfrom the variables occurring in any other clause.

By Herbrand’s theorem 26.7 there exists a finite set of ground clauses G = gCi | 1 ≤ i ≤ m ⊆fg(Φ) such that G is unsatisfiable. Each gCi ∈ G is obtained by a substitution on some clause i.e.gCi = θiCi for some substitution θi and some clause Ci ∈ Φ. Further for i 6= j, dom(θi)∩dom(θj) =

∅ and since all the clauses in G are ground the disjointness conditions of the lifting lemma aretrivially satisfied.

Each application of rule Res0 in G may be lifted to finding a mgu of the appropriate clauses. Thisfact may be proved by induction on the height of the resolution proof tree and we leave it as anexercise to the interested reader. However the following diagram illustrates it for two steps of aresolution proof tree.




C12

C1 C2

C ′12

τ12

θ1C1 θ2C2

ρ12

σ12

C3 C4

ρ34

C ′34

σ34τ34

θ4C4θ3C3 C34

σ1234

C ′1234

C1234

τ1234

ρ1234

QED







32. Lecture 32: Resolution and Tableaux

Lecture 32: Resolution and TableauxTuesday 01 November 2011




1. FOL: Tableaux

2. FOL: Tableaux Rules

3. FOL Tableaux: Example 1

4. First-Order Tableaux

5. FOL Tableaux: Example 2




FOL: Tableaux1. The tableau method in principle is similar to• natural deduction in its use of syntactical decomposition,• resolution in using unsatisfiability to prove validity.

2. The tableau method for both propositional and predicatelogic has some advantages over resolution.

3. FOL resolution requires formulae to be converted into PCNFand then SCNF before resolution may be applied.

4. As in the case of Natural Deduction the tableau method usesthe rules ∃E and ∀E to decompose quantified formulae alongwith the same restrictions.




FOL: Tableaux RulesBesides the usual tableau rules for the propositional connec-tives we have the following.

∀. ∀x[φ]

t/xφ¬∀. ¬∀x[φ]

¬a/xφ

∃. ∃x[φ]

a/xφ¬∃. ¬∃x[φ]

¬t/xφ1. The restrictions on the use of a constant symbol a in both

rules ¬∀. and ∃. are the same as those for ∃E in both H1 andG1.

2. Since ∀E holds for all terms t, the rules ∀. and ¬∃. mayhave to be applied several times before unsatisfiability canbe proven.




FOL Tableaux: Example 1Example 32.1 Let c be a constant symbol and f a unary func-tion symbol. Then Φ = ¬p(c), p(f (f (c))),∀x[p(x) ∨ ¬p(f (x))]is unsatisfiable.

¬p(c)p(f (f (c)))

∀x[p(x) ∨ ¬p(f (x))]p(c) ∨ ¬p(f (c))

p(c) ¬p(f (c)) p(f (c)) ∨ ¬p(f (f (c)))

p(f (c)) ¬p(f (f (c)))

Notice the two applications of rule ∀E. indicates a closedpath in the tableau.




First-Order Tableaux1. Unlike propositional tableaux, any satisfiable set Φ of quanti-

fied formulae can potentially yield an infinite tableau, since aformula of the form ∀x[φ] ∈ Φ can have an infinite number ofinstances.

2. For unsatisfiable sets, closed finite tableaux may be con-structed by applying the following heuristics•Whenever possible apply propositional rules before apply-

ing quantifier rules• Apply rules ∃. and ¬∀. before applying ∀. and ¬∃. in order

to direct the proof towards a propositional contradiction.




FOL Tableaux: Example 2We prove that ∀x[p(x)→ q(x)] |= ∀x[p(x)]→ ∀x[q(x)]

1. ∀x[p(x)→ q(x)]2. ¬(∀x[p(x)]→ ∀x[q(x)])3. ∀x[p(x)] (¬ → on 2.)4. ¬∀x[q(x)] (¬ → on 2.)5. ¬q(a) (¬∀ on 4.)6. p(a) (∀ on 3.)7. p(a)→ q(a) (∀ on 1.)

8. ¬p(a) (¬ → on 7.) 9. q(a) (¬ → on 7.)10. (6., 8.) 11. (5., 9.)










33. Lecture 33: Completeness of Tableaux Method

Lecture 33: Completeness of TableauxMethod

Wednesday 02 November 2011




1. First-order Hintikka Sets

2. Hintikka’s Lemma for FOL

3. First-order tableaux and Hintikka sets

4. Soundness of First-order Tableaux

5. Completeness of First-order Tableaux




First-order Hintikka SetsDefinition 33.1 A finite or infinite set Γ is a first-order Hintikkaset with respect to P1(Σ) if• 1-3. Γ is a (propositional) Hintikka set (definition 9.7) such

that all the atomic propositions are ground.• 4.

– ∀x[φ] ∈ Γ implies t/xφ ∈ Γ,– ¬∃x[φ] ∈ Γ implies ¬t/xφ ∈ Γ

for every t ∈ T0(Σ) (ground term).• 5.

– ∃x[φ] ∈ Γ implies t/xφ ∈ Γ,– ¬∀x[φ] ∈ Γ implies ¬t/xφ ∈ Γ

for at least one t ∈ T0(Σ) (ground term).




Hintikka’s Lemma for FOLLemma 33.2 If Σ contains at least one constant symbol, thenevery first-order Hintikka set with respect to P1(Σ) is satisfiablein a Herbrand model.

Proof: We define a Herbrand interpretation of the formu-lae as follows. For each n-ary atomic predicate symbol p,p(t1, . . . , tn) for ground terms t1, . . . , tn is true if and only ifp(t1, . . . , tn) ∈ Γ. By the definition of a Hintikka set we knowp(t1, . . . , tn),¬p(t1, . . . , tn) 6⊆ Γ. Hence all the atomic sen-tences in Γ are satisfiable under any valuation vH. We maythen proceed to show by structural induction on each φ ∈ P1(Σ)that φ ∈ Γ implies H φ. QED




First-order tableaux and Hintikka setsLemma 33.3 If a tableau rooted at a closed formula φ has anopen path then the set of formulae on the path form a first-orderHintikka set.

Proof: We may prove that each rule in Tableaux Rules andFOL: Tableaux Rules creates a path for the construction of Hin-tikka sets. QED




Soundness of First-order TableauxTheorem 33.4 (Soundness of First-order Tableau Rules). Ifthere is a closed tableau rooted at a closed formula ¬φ then |= φ.

Proof: Suppose there is a closed tableau rooted at ¬φ. Thenevery path in the tableau is closed because of the occurrenceof a complementary pair in the path. On the other hand if 6|= φ,i.e. φ is not valid, then ¬φ is satisfiable, which implies thatthere is an open path in the tableau rooted at ¬φ, clearly acontradiction. QED




Completeness of First-order TableauxTheorem 33.5 (Completeness of First-order Tableaux). If aclosed formula φ is valid, then there exists a closed tableaurooted at ¬φ.

Proof: If there is no closed tableau rooted at ¬φ then thereexists at least one open path in each such tableau. The set offormulae on this path form a Hintikka set and hence they areall simultaneously satisfiable, which implies there is a Herbrandmodel satisfying ¬φ, in which case 6|= φ. QED










34. Lecture 34: Completeness of the Hilbert System

Lecture 34: Completeness of theHilbert System





1. Deductive Consistency

2. Models of Deductively Consistent Sets

3. Deductive Completeness

4. The Completeness Theorem




Deductive ConsistencyDefinition 34.1 A set Φ ⊆ L1(Σ) is deductively consistent iffthere does not exist a formula φ such that Φ `H1

φ andΦ `H1

¬φ.

This definition is equivalent to other possible definitions suchas those given below which may all be derived from rule ⊥.Lemma 34.2 The following statements are equivalent.1. Φ ⊆ L1(Σ) is deductively consistent.2. There does not exist a formula ψ such that Φ `H1

¬(ψ → ψ)

3. There exists a formula which is not provable.




34.1. Model-theoretic and Proof-theoretic Consistency

We have earlier defined the notion of consistency of a set of formulae in propositional logic (seealso lemma 11.1) leading to the notions of maximal consistency and Lindenbaum’s theorem whichenabled us to extend a consistent set of propositions to a maximally consistent one. Later we havealso defined the notion of consistency of sets of predicate logic formulae in definition 20.2. Noticethat definition 9.1 though worded differently, also states that a set of propositions is consistent onlyif it has a model. The notion of a model in sentential logic however, refers to the existence of atruth assignment under which all the sentences are true (simultaneously). Hence both in sententialand predicate logic the notion of consistency refers to the existence of a model. These notions ofconsistency are model-theoretic since they are intimately associated with the existence of a model.

We have reserved the term “deductive consistency” (definition 34.1) to a proof-theoretic notion ob-tained from deductions rather than models. A priori there is no reason to believe that the two notionsare equivalent unless we can prove that our deductive system is sound and complete. While sound-ness has been proven we need to prove completeness before claiming that the model-theoretic notionof consistency and the proof-theoretic one are equivalent.

We need to carry our analogies between model-theory and proof theory a little further to the domain




maximal consistent sets (indeed some of the proof ideas would be analogous too!) in order to be ableto prove the completeness of the system H1. We refer to such maximally consistent sets obtainedthrough deductions as being deductively complete. The main difference however, is that we restrictourselves to only closed formulae as will be evident soon.




Models of Deductively Consistent SetsLemma 34.3 If Φ ⊆ L1(Σ) has a model then it is deductivelyconsistent.

Proof: Let A ψ for each ψ ∈ Φ. If Φ is not deductivelyconsistent, then there exists φ such that Φ `H1

φ and Φ `H1¬φ.

However since H1 is sound it follows that A φ and A ¬φwhich is a contradiction. QED




Deductive CompletenessLemma 34.4 For any Φ ⊆ L1(Σ), Φ `H1

φ iff Φ `H1∀[φ] iff

Φ ∪ ¬∀[φ] is not deductively consistent.

2We restrict our attention to only deductively consistent andcomplete sets.Definition 34.5 A (deductively consistent) set Φ ⊆ L1(Σ) is de-ductively complete iff for every closed formula φ, Φ `H1

φ orΦ `H1

¬φ.




Proof of lemma 34.4

Proof:

• Φ `H1 φ iff Φ `H1 ∀[φ] is obvious from rules ∀I and ∀E.

• (⇒) Suppose Φ `H1 φ. Then by monotonicity (theorem 13.1) Φ ∪ ¬∀[φ] `H1 φ and henceΦ ∪ ¬∀[φ] `H1 ∀[φ]. Further since Φ ∪ ¬∀[φ] `H1 ¬∀[φ], it follows that Φ ∪ ¬∀[φ] is notdeductively consistent.(⇐) Suppose Φ ∪ ¬∀[φ] is not deductively consistent. Then there exists a formula (by lemma34.2) ψ such that Φ ∪ ¬∀[φ] `H1 ¬(ψ → ψ). From `H1 ψ → ψ and ⊥ we obtain Φ ∪¬∀[φ] `H1 ∀[φ]. By corollary 23.4 (Deduction theorem for closed formulae) we obtainΦ `H1 ¬∀[φ]→ ∀[φ]. It follows from the derived axiom C2 that Φ `H1 (¬∀[φ]→ ∀[φ])→ ∀[φ]

from which we obtain Φ `H1 ∀[φ] by a single application of MP.

QED

Notes on proof of lemma 34.4.

1. The univeral closure is required in the lemma, because in general the inconsistency of Φ∪ ¬φdoes not imply Φ `H1 φ.




Example 34.6 Let Φ = ¬∀x[¬p(x)] and φ ≡ p(x). Then Φ ∪ ¬p(x) is inconsistent. How-ever, it is not possible to prove Φ `H1 p(x).

2. Hence the maximally consistent sets of propositional logic translate into deductively completesets in FOL. And this maximal completeness can only be shown for closed formulae and not forarbitrary formulae with free variables.

3. Clearly deductive completeness therefore is restricted to closed formulae.

The following is the proof-theoretic analogue of Lindenbaums theorem. Even the proof of thetheorem mirrors the alternative proof of Lindenbaum’s theorem.

Theorem 34.7 (The Extension Theorem) Every deductively consistent set may be extended to adeductively complete set.

Proof: Let Φ be a nonempty deductively consistent set of Σ-formulae. For any enumeration ofclosed Σ-formulae

ψ1, ψ2, ψ3, . . . (28)




define the chain of sets Φ0 ⊆ Φ1 ⊆ Φ2 ⊆ · · · starting with Φ0 = Φ as follows:

Φi+1 =

Φi if Φi `H1 ψiΦi ∪ ¬ψi otherwise

Claim. Each Φi is deductively consistent.` By induction on i. For i = 0, Φ0 = Φ is given to be deductively consistent. AssumingΦi is deductively consistent, we have that if Φi+1 = Φ it is obviously deductively consistent.Otherwise Φi+1 = Φ ∪ ¬ψi and by lemma 34.4 since Φi 6`H1 ψi and ψ is a closed formula,Φi+1 must be deductively consistent. a

Then Φ∞ =⋃i≥0

Φi is the desired set.

Claim. Φ∞ is deductively consistent.` Suppose not. Then by lemma 34.2, for some formula φ, we have Φ∞ `H1 ¬(φ→ φ).However since such a proof is finite there exists a finite subset Ψ ⊆f Φ, such that Ψ `H1

¬(φ→ φ). Since Ψ is finite there exists a k ≥ 0 such that Ψ ⊆ Φk which implies Φk `H1

¬(φ→ φ) contradicting the previous claim that Φk is deductively consistent. a

Claim. Φ∞ is deductively complete.




` For any arbitrary closed formula ψ, ψ occurs in the enumeration 28 at some position, sayψ ≡ ψm for some m ≥ 0. If Φm `H1 ψm then Φ∞ `H1 ψm. Otherwise Φm+1 = Φm ∪ ¬ψmand Φ∞ `H1 ¬ψm. By definition 34.5 Φ∞ is deductively complete. a

QED




The Completeness TheoremTheorem 34.8 (Godel’s Completeness Theorem). Φ |= φ impliesΦ `H1

φ. When Φ = ∅ we have that all valid formulae of L1(Σ)are theorems of H1.

2Proof: Assume Φ |= φ and suppose Φ 6`H1

φ. Then by lemma34.4 Φ∪¬∀[φ] is deductively consistent and hence possessesa model A. But that implies A Φ but A 6 φ, which by defini-tion means Φ 6|= φ, a contradiction. QED










35. Lecture 35: First-Order Theories

Lecture 35: First-Order TheoriesFriday 11 November 2011




1. (Simple) Directed Graphs

2. (Simple) Undirected Graphs

3. Irreflexive Partial Orderings

4. Irreflexive Linear Orderings

5. (Reflexive) Preorders

6. (Reflexive) Partial Orderings

7. (Reflexive) Linear Orderings

8. Equivalence Relations

9. Peano’s Postulates

10. The Theory of The Naturals

11. Notes and Explanations

12. Finite Models of Arithmetic

13. A Non-standard Model of Arithmetic

14. Z-Chains




(Simple) Directed Graphs

Σ = ; e : s2φe−irreflexivity

df= ∀x[¬e(x, x)]

ΦDGdf= φe−irreflexivity




(Simple) Undirected Graphs

Σ = ; e : s2φe−irreflexivity

df= ∀x[¬e(x, x)]

φe−symmetrydf= ∀x, y[e(x, y)→ e(y, x)]

ΦUGdf= φe−irreflexivity, φe−symmetry

Equivalently φ′e−symmetrydf= ∀x, y[e(x, y)↔ e(y, x)] (see exer-

cise 20.1.12) may be used in place of φe−symmetry.




Irreflexive Partial Orderings

Σ = ;<φ<−irreflexivity

df= ∀x[¬(x < x)]

φ<−transitivitydf= ∀x, y, z[((x < y) ∧ (y < z))→ (x < z)]

ΦIPOdf= φ<−irreflexivity, φ<−transitivity




Irreflexive Linear Orderings

Σ = ;<φ<−irreflexivity

df= ∀x[¬(x < x)]

φ<−transitivitydf= ∀x, y, z[((x < y) ∧ (y < z))→ (x < z)]

φ<−trichotomydf= ∀x, y[(x < y) ∨ (x = y) ∨ (y < x)]

ΦILOdf= φ<−irreflexivity, φ<−transitivity,φ<−trichotomy




(Reflexive) Preorders

Σ = ;≤ : s2φ≤−reflexivity

df= ∀x[x ≤ x]

φ≤−transitivitydf= ∀x, y, z[((x ≤ y) ∧ (y ≤ z))→ (x ≤ z)]

ΦPredf= φ≤−reflexivity, φ≤−transitivity




(Reflexive) Partial Orderings


df= ∀x[x ≤ x]


φ≤−antisymmetrydf= ∀x, y[((x ≤ y) ∧ (y ≤ x))→ x = y]

ΦPOdf= φ≤−reflexivity, φ≤−transitivity,φ≤−antisymmetry




(Reflexive) Linear Orderings


df= ∀x[x ≤ x]


φ≤−dichotomydf= ∀x, y[(x ≤ y) ∨ (y ≤ x)]

ΦLOdf= φ≤−reflexivity, φ≤−transitivity, φ≤−dichotomy




Equivalence Relations

Σ = ;∼ : s2φ∼−reflexivity

df= ∀x[x ∼ x]

φ∼−symmetrydf= ∀x, y[(x ∼ y)→ (y ∼ x)]

φ∼−transitivitydf= ∀x, y, z[((x ∼ y) ∧ (y ∼ z))→ (x ∼ z)]

ΦEquivdf= φ∼−reflexivity, φ∼−symmetry, φ∼−transitivity




Peano’s PostulatesP1. 0 is a natural number.P2. If x is a natural number then x+1 (called the successor of x)

is a natural number.P3. 0 6= x+1 for any natural number x.P4. x+1 = y+1 implies x = y

P5. Let P be a property that may or may not hold of every nat-ural number. IfBasis. 0 has the property P andInduction Step. whenever a natural number x has the prop-

erty P , x+1 also has the property Pthen all natural numbers have the property P .




The Theory of The Naturals

ΣS = 0 :−→ s, +1 : s −→ s; = : s2φ0−notsuccessor

df= ∀x[¬(x+1 = 0)]

φ+1−injectivedf= ∀x∀y[x+1 = y+1→ x = y]

φ6=0→successordf= ∀y[¬(y = 0)→ ∃x[y = x+1]]

φ(+1)n−distinctdf= ∀x[¬(x(+1)n = x)]

Φ(+1)∞−distinctdf= φ(+1)n−distinct | n > 0

ΦSdf= φ0−notsuccessor, φ+1−injective, φ6=0→successor∪Φ(+1)∞−distinct




Notes and Explanations1. x(+1)n denotes the n-fold application of +1 to x.2. φ6=0→successor says that every “non-zero” element must have

a “predecessor”.3. NS = 〈N,ΣS〉 is a model of the axioms ΦS.4. The infinite collection of axioms Φ(+1)n−distinct is necessary

to obtain models that are countable.5. The axioms Φ(+1)∞−distinct ensure that there are no finite

models of ΦS.




Finite Models of Arithmetic1. If the infinite collection Φ(+1)∞−distinct is replaced by a finite

collection for some m > 0 i.e.

Φ(+1)m>n>0−distinct = φ(+1)n−distinct | m > n > 0

then both finite and countable models are possible.2. If in addition to Φ(+1)m>n>0−distinct we also include the axiom

ψmodulo mdf= ∀x[x(+1)m = x]

we get models ZSm = 〈Zm,ΣS〉 for the integers modulo mand there are no infinite models.




A Non-standard Model of ArithmeticConsider the model NS = 〈N,ΣS〉 of the axioms of numbertheory.•We add a new element 0′ 6= 0.• This implies adding an infinite number of new elements 0(+1)n

one for each n > 0. For simplicity let us call these elements1′, 2′, 3′, . . .. Each of these new elements is different from ev-ery element in N.• Since 0′ 6= 0, it must have a “predecessor” say −1′

which again leads to the addition of all the elements−2′,−3′,−3′, . . . each of which is distinct and different fromall other elements. Let us call this set of elements Z′.

N′S = 〈N ∪ Z′,ΣS〉 is a model of ΦS and is said to be non-standard.




Z-Chains• Z′ is called a Z-Chain.•N′S = 〈N ∪ Z′,ΣS〉 is also a countable model of the axioms

ΦS.• Further NS and N′S are not isomorphic•We could add a countable number of distinct Z-chains, Z′′,Z′′′, Z′′′′, etc. to obtain other distinct and mutually non-isomorphic models.• Each of the models obtained above is a also a countable

model of ΦS.• Each of these models is also non-standard.










36. Lecture 36: Towards Logic Programming

Lecture 36: Towards LogicProgramming

Tuesday 15 November 2011




1. Reversing the Arrow

2. Horn Clauses

3. Goal clauses

4. Logic Programs

5. Sorting in Logic

6. Prolog: Selection Sort

7. Prolog: Merge Sort

8. Prolog: Quick Sort

9. Prolog: SEND+MORE=MONEY

10. Prolog: Naturals




Reversing the ArrowLet

φ← ψdf= ψ → φ

Consider any clause C = π1, . . . , πp ∪ ¬ν1, . . . ,¬νn whereπi, 1 ≤ i ≤ p are positive literals and ¬νj, 1 ≤ j ≤ n are thenegative literals. Then we have

C ⇔ ~∀[(∨

1≤i≤p πi) ∨ (∨

1≤j≤n¬νj)]⇔ ~∀[(

∨1≤i≤p πi) ∨ ¬(

∧1≤j≤n νj)]

⇔ ~∀[(∧

1≤j≤n νj)→ (∨

1≤i≤p πi)]

≡ ~∀[(∨

1≤i≤p πi)← (∧

1≤j≤n νj)]df= π1, . . . , πp← ν1, . . . , νn




Horn ClausesDefinition 36.1 Given a clause

Cdf= π1, . . . , πp← ν1, . . . , νn

• Then C is a Horn clause if 0 ≤ p ≤ 1.•C is called a

– program clause or rule clause if p = 1,– fact or unit clause if p = 1 and n = 0,– goal clause or query if p = 0,• Each νj is called a sub-goal of the goal clause.




Goal clausesGiven a goal clause

Gdf= ← ν1, . . . , νn⇔ ~∀[¬ν1 ∨ . . . ∨ ¬νn]

⇔ ¬~∃[ν1 ∧ . . . ∧ νn]

If ~y = FV (ν1 ∧ . . . ∧ νn) then the goal is to prove that thereexists an assignment to ~y which makes ν1 ∧ . . . ∧ νn true.




Logic ProgramsDefinition 36.2 A logic program is a finite set of Horn clauses,i.e. it is a set of rules P = h1, . . . , hk, k ≥ 0 withhl ≡ πl ← νl1, . . . , ν

lnl

, for 0 ≤ l ≤ k. πl is called the head ofthe rule and νl1, . . . , ν

lnl

is the body of the rule.

Given a logic program P and a goal clause G = ν1, . . . , νn thebasic idea is to show that

P ∪ G is unsatisfiable⇔ ~∀[¬ν1 ∨ · · · ∨ ¬νn] is a logical consequence of P⇔ ~∃[ν1 ∧ · · · ∧ νn] is a logical consequence of P




Sorting in Logicsort(x, y) ← perm(x, y), sorted(y)sorted(nil) ←sorted(x.nil) ←sorted(x.y.z) ← lesseq(x, y), sorted(y.z)lesseq(x, x) ←lesseq(x, y) ← x < yperm(nil, nil) ←perm(x.y, u.v) ← delete(u, x.y, z), perm(z, v)delete(x, x.y, y) ←delete(x, y.z, y.w) ← delete(x, z, w)

← sort([2, 8,−1, 10, 4, 2], x)




Prolog: Selection Sorts e l e c t S o r t (X, Y) :− permutat ion (X, Y) ,

sor ted (Y ) .sor ted ( [ ] ) .sor ted (H . [ ] ) .sor ted (F .S . T ) :− lesseq (F , S) ,

sor ted (S . T ) .

lesseq (F , S) :− F=S.lesseq (F , S) :− F<S.

permutat ion ( [ ] , [ ] ) .permutat ion (H. T , F .R) :− de le te (F , H. T , Z ) ,

permutat ion (Z , R ) .

de le te (H, H. T , T ) .de le te (X, H. T , H.U):− de le te (X, T , U ) .




Prolog: Merge SortmergeSort ( [ ] , [ ] ) .mergeSort (H . [ ] , H . [ ] ) .mergeSort (F .S . T , S a l l ) :− s p l i t (F .S . T , Le f t , R ight ) ,

mergeSort ( Le f t , S l e f t ) ,mergeSort ( Right , S r i g h t ) ,merge ( S l e f t , S r igh t , S a l l ) .

s p l i t ( [ ] , [ ] , [ ] ) .s p l i t (H . [ ] , H . [ ] , [ ] ) .s p l i t (F .S . T , F .U, S .V) :− s p l i t (T , U, V ) .merge ( [ ] , L , L ) .merge ( L , [ ] , L ) .merge (F .B, H. T , F .U) :− F=<H, merge (B, H. T , U ) .merge (F .B, H. T , H.V) :− H<F , merge (F .B, T , V ) .




Prolog: Quick Sortq u i c k so r t ( [ ] , [ ] ) .q u i c k so r t (H . [ ] , H . [ ] ) .q u i c k so r t (H. T , S) :− p a r t i t i o n (H, T , L , G) ,

q u i c k so r t ( L , Ls ) ,q u i c k so r t (G, Gs) ,append ( Ls , H . [ ] , Lsh ) ,append ( Lsh , Gs, S ) .

p a r t i t i o n (M, [ ] , [ ] , [ ] ) .p a r t i t i o n (M, H. T , H. Lesser , Greater ) :− H=<M,

p a r t i t i o n (M, T , Lesser , Greater ) .p a r t i t i o n (M, H. T , Lesser , H. Greater ) :− M<H,

p a r t i t i o n (M, T , Lesser , Greater ) .append ( [ ] , L , L ) .append (H. T , L , H.A) :− append (T , L , A ) .




Prolog: SEND+MORE=MONEYsmm :− L = [S,E,N,D,M,O,R,Y ] ,

D i g i t s = [0 ,1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ] ,a s s i g n d i g i t s ( L , D i g i t s ) ,M > 0 , S > 0 ,

1000∗S + 100∗E + 10∗N + D +1000∗M + 100∗O + 10∗R + E =:=

10000∗M + 1000∗O + 100∗N + 10∗E + Y,w r i t e (’ ’ ) , w r i t e (S) , w r i t e (E) , w r i t e (N) , w r i t e (D) , n l ,w r i t e (’ + ’ ) , w r i t e (M) , w r i t e (O) , w r i t e (R) , w r i t e (E) , n l ,w r i t e (’ -----’ ) , n l ,w r i t e (’= ’ ) , w r i t e (M) , w r i t e (O) , w r i t e (N) , w r i t e (E) , w r i t e (Y) , n l .

s e l e c t (Z , [ Z |R] , R ) .s e l e c t (Z , [Y | Zs ] , [Y |Ys ] ) : − s e l e c t (Z , Zs , Ys ) .

a s s i g n d i g i t s ( [ ] , L i s t ) .a s s i g n d i g i t s ( [D |Ds ] , L i s t ):− s e l e c t (D, L i s t , NewList ) ,

a s s i g n d i g i t s (Ds , NewList ) .




Prolog: Naturalsi s n f ( z ) .i s n f ( s (X ) ) :− i s n f (X ) .r e w r i t e (X, X) :− i s n f (X ) .r e w r i t e ( s (X) , s (Y ) ) :− r e w r i t e (X, Y) , i s n f (Y ) .r e w r i t e ( a ( z , Y) , Y) :− i s n f (Y ) .r e w r i t e ( a (Y, z ) , Y) :− i s n f (Y ) .r e w r i t e ( a ( s (X) , Y) , s (Z ) ) :− r e w r i t e ( a (X,Y) , Z ) .r e w r i t e ( a (X, s (Y ) ) , s (Z ) ) :− r e w r i t e ( a (X,Y) , Z ) .r e w r i t e ( a (X,Y) , Z ) :− r e w r i t e (X,U) , r e w r i t e (Y,V) , r e w r i t e ( a (U,V) , Z ) .even ( z ) .even ( s ( s (X ) ) ) :− r e w r i t e (X,Y) , even (Y ) .odd (X) :− not even (X ) . % negat ion as f a i l u r e/∗ r e w r i t e ( a ( a ( s ( z ) , s ( s ( z ) ) ) , a ( s ( z ) , s ( s ( z ) ) ) ) , X ) .X = s ( s ( s ( s ( s ( s ( z ) ) ) ) ) ) ∗ /




/∗ Imp lemen t ing double−ended queues t h r o u g h c o n s t r u c t o r s ∗ /deq ( n u l l q ) .deq ( fnq (A, D) ) :− i n t e g e r (A) , deq (D ) .deq ( rnq (B , D) ) :− i n t e g e r (B) , deq (D ) .

n o n n u l l ( fnq (A, D) ) :− i n t e g e r (A) , deq (D ) .n o n n u l l ( rnq (B , D) ) :− i n t e g e r (B) , deq (D ) .

deq ( fdq (D) ) :− n o n n u l l (D ) .deq ( rdq (D) ) :− n o n n u l l (D ) .

n f ( n u l l q ) .n f ( fnq (A, D) ) :− i n t e g e r (A) , n f (D ) .

r e w r i t e (D, D) :− nf (D ) .%i n d u c t i o n s t e p f o r normal formsr e w r i t e ( fnq (A, D) , fnq (A, E ) ) :− i n t e g e r (A) , r e w r i t e (D, E ) .

% f o r a l l forms o t h e r t h a n normal formsr e w r i t e ( rnq (B , n u l l q ) , fnq (B , n u l l q ) ) :− i n t e g e r (B ) . % b a s i s o f i n d u c t i o nr e w r i t e ( rdq ( fnq (A, n u l l q ) ) , n u l l q ) . % b a s i s o f i n d u c t i o n

% r e w r i t e ( fdq ( fnq (A, n u l l q ) ) , n u l l q ) f o l l o w s from t h e more g e n e r a l r e w r i t er e w r i t e ( fdq ( fnq (A, D) ) , E):− i n t e g e r (A) , r e w r i t e (D, E ) . % fdq f o r a l l n o n n u l l

r e w r i t e ( rnq (B , fnq (A, D) ) , fnq (A, E ) ) :− % f o r rnq on normal formsi n t e g e r (A) , i n t e g e r (B) ,r e w r i t e (D, F ) ,




r e w r i t e ( rnq (B , F ) , E ) , n f ( E ) .

r e w r i t e ( rnq (B , D) , E ) :− % f o r rnq on o t h e r formsi n t e g e r (B) ,r e w r i t e (D, F ) ,r e w r i t e ( rnq (B , F ) , E ) .

r e w r i t e ( rdq ( fnq (A, D) ) , fnq (A, E ) ) :− % f o r rdq on normal formsi n t e g e r (A) ,r e w r i t e (D, F ) , n o n n u l l ( F ) ,r e w r i t e ( rdq ( F ) , E ) .

r e w r i t e ( rdq (D) , E ) :− % f o r rdq on o t h e r formsr e w r i t e (D, F ) , r e w r i t e ( rdq ( F ) , E ) .

% r e w r i t e ( rdq ( rnq (B , D) ) , D) f o l l o w s by i n d u c t i o n from t h e v a r i o u s r e w r i t e s abovefv ( fnq (A, D) , D):− i n t e g e r (A) , deq (D ) . % v a l u e a t t h e f r o n t o f t h e dequeuerv ( fnq (A, n u l l q ) , B) :− i n t e g e r (A) , A=B .rv ( fnq (A, D) , B) :− i n t e g e r (A) , r e w r i t e (D, E ) , rv ( E , B ) .rv (D, B) :− r e w r i t e (D, E ) , rv ( E , B ) .

/∗ T e s t i n g% R e s t o r i n g f i l e / u s r / l o c a l / l i b / Yap / s t a r t u pYAP v e r s i o n Yap−5 .1 .1

?− % r e c o n s u l t i n g / home / sak / p r o l o g / deques . P . . .% r e c o n s u l t e d / home / sak / p r o l o g / deques . P i n module use r , 0 msec 4096 b y t e s

yes?− r e w r i t e ( fnq ( 2 , fnq ( 1 , n u l l q ) ) , X ) .




X = fnq ( 2 , fnq ( 1 , n u l l q ) ) ?yes

?− rv ( fnq ( 1 , fnq ( 2 , n u l l q ) ) , B ) .B = 2 ?yes

?− rv ( rnq ( 3 , fnq ( 1 , fnq ( 2 , n u l l q ) ) ) , B ) .B = 3 ?yes

?− r e w r i t e ( rnq ( 4 , fdq ( fnq ( 1 , fnq ( 2 , rnq ( 3 , n u l l q ) ) ) ) ) , X ) .X = fnq ( 2 , fnq ( 3 , fnq ( 4 , n u l l q ) ) ) ?yes

?− rv ( rnq ( 4 , fdq ( fnq ( 1 , fnq ( 2 , rnq ( 3 , n u l l q ) ) ) ) ) , B ) .B = 4 ?yes

?− rv ( rdq ( rnq ( 4 , fdq ( fnq ( 1 , fnq ( 2 , rnq ( 3 , n u l l q ) ) ) ) ) ) , B ) .B = 3 ?yes∗ /







37. Lecture 37: Verification of Imperative Programs

Lecture 37: Verification of ImperativePrograms





1. The WHILE Programming Language

2. Programs As State Transformers

3. The Semantics of WHILE

4. Programs As Predicate Transformers

5. Correctness Assertions

6. Total Correctness of Programs

7. Examples: Factorial 1

8. Examples: Factorial 2




The WHILE Programming LanguageLet Σ be any signature. Then the programming languageWH(Σ) is defined by the following BNF.

P ,Q ::= ε | x := t | P ;Q | [P ] |χ?P : Q |χ?P

where χ ∈ QF(Σ).




Programs As State TransformersA program is merely a state transformer which takes a valuationvA of variables and yields another valuation v′A.

P

vA v′A




The Semantics of WHILELet A be a Σ-algebra. Let VA = vA | vA : V −→ |A| bethe set of all valuations (also called states). The meaning of aprogram P is given byMAJP K : VA −→ VA

MAJεKvAdf= vA

MAJx := tKvAdf= vA[x := VAJtKvA]

MAJ[P ]KvAdf= MAJP KvA

MAJP ;QKvAdf= (MAJQK MAJP K)vA

MAJχ?P : QKvAdf=

MAJP KvA if TAJχKvA = 1MAJQKvA if TAJχKvA = 0

MAJχ?PKvAdf=

MAJP ; χ?PKvA if TAJχKvA = 1vA if TAJχKvA = 0




Programs As Predicate TransformersWe may also view a program as a transformer of properties.Consider a predicate φ to represent a set of possible valuationsi.e. VA(φ) = vA | (A, vA) φ. Then a program transformsa state satisfying φ into a state satisfying some other predicateψ.

v v′

φψ

P




Correctness AssertionsDefinition 37.1 A partial correctness assertion (also called aHoare-triple) is a triple of the form φ P ψ where φ is aformula called the precondition, P is a program and ψ is thepostcondition.

Definition 37.2• φ P ψ holds in a state vA (denoted (A, vA) φ P ψ),

if (A, vA) φ andMAJP KvA = v′A implies (A, v′A) ψ

• φ P ψ is valid in A, (denoted A |= φ P ψ) if(A, vA) φ P ψ for every state vA.




Total Correctness of ProgramsDefinition 37.3 A total correctness assertion is a triple of theform [φ] P [ψ] where φ is a formula called the precondition, P isa program and ψ is the postcondition.

Definition 37.4• [φ] P [ψ] holds in a state vA (denoted (A, vA) [φ] P [ψ]),

if (A, vA) φ implies for some v′A MAJP KvA = v′A and(A, v′A) ψ.• [φ] P [ψ] is valid in A, (denoted A |= φ P ψ) if

(A, vA) [φ] P [ψ] for every state vA.




Examples: Factorial 1Let Σ ⊇ Z ∪ ! : s,+,−, ∗ : s2 −→ s; =, > : s2.

Example 37.5 Let P1df= p := 1; ¬(x = 0)?[p := p ∗ x;x := x− 1].

Let Z = 〈Z,Σ〉. Then1. Z |= x = x0 P1 p = x0!2. However Z 6|= [x = x0] P1 [p = x0!] since P1 will not terminate

for negative values of x.




Examples: Factorial 2Example 37.6 Let P2

df= p := 1; x > 0?[p := p ∗ x;x := x− 1].

Let Z = 〈Z,Σ〉. Then1. Z |= x = x0 ≥ 0 P p = x0!2. Z |= [x = x0 ≥ 0] P [p = x0!]

3. A more “technically complete” specification isZ |= [x = x0] P [(x0 ≥ 0→ p = x0!) ∧ (x0 < 0→ p = 1)]


http://www.cse.iitd.ernet.in/~sak/courses/icp/new-slides.pdf



Exercise 37.1

1. For any program P defined on a signature Σ what do the following correctness formulae mean?

(a) > P >(b) > P ⊥(c) ⊥ P >(d) ⊥ P ⊥(e) [>] P [>]

(f) [>] P [⊥]

(g) [⊥] P [>]

(h) [⊥] P [⊥]

2. Which of the correctness formulae given in problem 1 are

(a) always valid?

(b) always unsatisfiable?







38. Lecture 38: Verification of WHILE Programs

Lecture 38: Verification of WHILEPrograms

Friday 18 November 2011




1. Proof Rule: Epsilon

2. Proof Rule: Assignment

3. Proof Rule: Composition

4. Proof Rule: The Conditional

5. Proof Rule: The While Loop

6. The Consequence Rule

7. Proof Rules for Partial Correctness

8. Example: Factorial 1

9. Towards Total Correctness

10. Termination and Total Correctness

11. Example: Factorial 2

12. Notes on Example: Factorial

13. Example: Factorial 2 Made Complete

14. An Open Problem: Collatz




Proof Rule: Epsilonφ φ

ε

ε. φ ε φ




Proof Rule: Assignment

v v′

φψ

x := t

:= . φ x := t ψ (φ ≡ t/xψ)




Proof Rule: Composition

φ ψ

v′v v′′

ξ

P Q

; .φ P ξξ Q ψφ P ;Q ψ




Proof Rule: The Conditional

Q

φ ∧ χ

φ ∧ ¬χ

ψ

P

? : .φ ∧ χ P ψφ ∧ ¬χ Q ψφ χ?P : Q ψ




Proof Rule: The While Loop

P

ι ∧ χ

ι ∧ ¬χ

. ι ∧ χ P ιι χ?P ι ∧ ¬χ

ι in the while rule is called the loop invariant.




The Consequence Rule

P

φ′

φ ψ′

ψ

⇒ .

φ⇒ φ′

φ′ P ψ′ψ′⇒ ψφ P ψ




Proof Rules for Partial Correctnessε. φ ε φ := . t/xψ x := t ψ [].

φ P ψφ [P ] ψ

. ι ∧ χ P ιι χ?P ι ∧ ¬χ

; .φ P ξξ Q ψφ P ;Q ψ

? : .φ ∧ χ P ψφ ∧ ¬χ Q ψφ χ?P : Q ψ

⇒ .

φ⇒ φ′

φ′ P ψ′ψ′⇒ ψφ P ψ

ι in the while rule is called the loop invariant.




Example: Factorial 1x = x0

p := 1; x = x0 ∧ p = 1⇒ (x0 = x) ∧ (x ≥ 0→ p ∗ x! = x0!)⇒ φ0 ∧ (x ≥ 0→ p ∗ x! = x0!) ≡ ι

¬(x = 0)? φ0 ∧ (x ≥ 0) ∧ (p ∗ x! = x0!) ∧ ¬(x = 0)⇒ φ0 ∧ (x > 0) ∧ (p ∗ x! = x0!)

[p := p ∗ x; φ0 ∧ (x > 0) ∧ (p ∗ (x− 1)! = x0!)∧x := x− 1] φ0 ∧ (x ≥ 0) ∧ (p ∗ x! = x0!)

⇒ ι ι ∧ (x = 0)

⇒ φ0 ∧ (x = 0) ∧ (p ∗ x! = x0!)⇒ (x0 ≥ 0→ p = x0!)

where φ0 ≡ (x0 ≥ 0)




Towards Total Correctness1. The only construct which may not terminate is the while loop.2. Termination of the while loop: Define a “measure” called the

bound function, β : Dom(β) −→ W where• 〈W,<〉 is a well-ordered set (a set with no infinite descend-

ing sequences w > w′ > w′′ > · · ·) with a least element 0such that w < 0 does not hold true for any w ∈ W .•Dom(β) is the tuple of possible values of the program vari-

ables (~v).•Often 〈W,<〉 = 〈N, <〉• ι ∧ χ⇒ β(~v) > 0 and β(~v) = 0⇒ ι ∧ ¬χ• Each execution of the body of the loop decreases the value

of the bound function.




Termination and Total Correctnessε!

[φ] ε [φ]:=!

[t/xψ] x := t [ψ][]!

[φ] P [φ][φ] [P ] [φ]

; ![φ] P [ξ][ξ] Q [ψ]

[φ] P ;Q [ψ]? :!

[φ ∧ χ] P [ψ][φ ∧ ¬χ] Q [ψ][φ] χ?P : Q [ψ]

⇒!

φ⇒ φ′

[φ′] P [ψ′]ψ′⇒ ψ[φ] P [ψ]

! [ι ∧ χ ∧ β(~v) = b0 > 0] P [ι ∧ β(~v) < b0][ι] χ?P [ι ∧ ¬χ]

where• β is a bound function and• ι is the loop invariant.

c.f. Proof Rules for Partial Correctness




Example: Factorial 2[x = x0]

p := 1; [x = x0 ∧ p = 1]⇒ [(x0 = x) ∧ (x ≥ 0→ p ∗ x! = x0!) ∧ (x0 < 0→ p = 1)]⇒ [φ0 ∧ (x ≥ 0→ p ∗ x! = x0!)] ≡ [ι]

x > 0? [φ0 ∧ (x > 0) ∧ (p ∗ x! = x0!) ∧ β(x, p) = x = β0 > 0][p := p ∗ x; [φ0 ∧ (x > 0) ∧ (p ∗ (x− 1)! = x0!)∧

β(x, p) = x = β0 > 0]x := x− 1] [φ0 ∧ (x ≥ 0) ∧ (p ∗ x! = x0!) ∧ β(x, p) = x < β0 > 0] ]

⇒ [ι] [φ0 ∧ (x = 0) ∧ (p ∗ x! = x0!) ∧ β(x, p) = x = 0]

⇒ [(x0 ≥ 0→ p = x0!) ∧ (x0 < 0→ p = 1)] ≡ [ι]where φ0 ≡ (x0 ≥ 0) ∧ (x0 < 0→ p = 1)

c.f. Partial correctness proof




Notes on Example: Factorial• The loop invariant ι is a conjunction of

– φ0 whose truth is trivially unaffected by the changes in stateproduced by the loop body, and

– the formula (x ≥ 0→ p ∗ x! = x0!) which∗ holds initially before control enters the loop,∗ holds after the condition has been checked,∗ fails to hold after the first command of the body has been

executed,∗ is restored at the end of the loop body, and∗ holds after exiting the loop

•Notice the progress of the bound function which is completelyinternal to the working of the loop and is never part of thespecification of the program.




Example: Factorial 2 Made CompleteA more complete proof including the use of the assignment clearly delineated is givenbelow.

[x = x0] ≡ [φ0]⇒ [x = x0 ∧ 1 = 1] ≡ [1/pφ1]

p := 1; [x = x0 ∧ p = 1] ≡ [φ1]⇒ [(x0 = x) ∧ (x ≥ 0→ p ∗ x! = x0!) ∧ (x0 < 0→ p = 1)]⇒ [φ0 ∧ (x ≥ 0→ p ∗ x! = x0!)] ≡ [ι]

x > 0? [φ0 ∧ (x > 0) ∧ (p ∗ x! = x0!) ∧ β(x, p) = x = β0 > 0] ≡ [φ′1]⇒ [p ∗ x/pψ1]

[p := p ∗ x; [φ0 ∧ (x > 0) ∧ (p ∗ (x− 1)! = x0!) ∧ β(x, p) = x = β0 > 0] ≡ [ψ1]⇒ [x− 1/xψ2]

x := x− 1 [φ0 ∧ (x ≥ 0) ∧ (p ∗ x! = x0!) ∧ β(x, p) = x < β0 > 0] ≡ [ψ2]] [ψ2]

⇒ [ι] [ι ∧ ¬(x > 0) ∧ β(x, p) = x = 0]

⇒ [φ0 ∧ (x = 0) ∧ (p ∗ x! = x0!) ∧ β(x, p) = x = 0]⇒ [(x0 ≥ 0→ p = x0!) ∧ (x0 < 0→ p = 1)] ≡ [ι]




An Open Problem: CollatzConsider the following specification where Σ includes the divand mod functions on positive integers.

Z |= [x > 0] Collatz [x = 1]

where

Collatzdf= x > 1?[x mod 2 = 0?x := x div 2 : x := 3 ∗ x + 1]










39. Lecture 39: Concluding Remarks

Lecture 39: Concluding RemarksWednesday 07 December 2011




1. Summary

2. The Limitations of Predicate Logic

3. Sortedness

4. Many-Sorted Logic: Symbols

5. Many-Sorted Signatures

6. Many-Sorted Signature: Terms

7. Many-Sorted Predicate Logic

8. Reductions




Summary1. A mathematical treatment of the essentials of reasoning2. A rigorous treatment of First-order logic3. Applications in logic and computer science(a) some elementary theorem proving(b) logic programming(c) program verification

4. Some illustrations of the power of first-order logic5. Some illustrations of the lack of distinguishability.




The Limitations of Predicate Logic1. The property of well-orderings viz.

Every subset of a well-ordered set has a least elementcannot even be expressed in FOL. It requires the power ofsecond-order logic

2. Transitive closures are not expressible3. Isomorphic models cannot be distinguished4. The existence of non-standard models implies that not all

non-isomorphic models may be distinguished either.5. The validity problem is undecidable.




Sortedness• A treatment that works mainly with a 1-sorted term algebra

often does not address the interesting problems of a mathe-matical theory e.g. the first-order theory of directed or undi-rected graphs.• To begin to address even the simplest problems of graph the-

ory requires counting and the power of the first order theoryof numbers.• The problems of second-order logic (quantification over first

order properties) may be expressed in many-sorted first or-der logic by allowing the power set of a set to be included inthe universe of discourse of a many-sorted first order logic.




Many-Sorted Logic: SymbolsWe have considered only a first-order logic of 1-sorted signa-tures. It is possible to extend it to a logic of a many-sorted sig-nature (which is what most programming languages are basedon).Definition 39.1 Given a finite nonempty set S of sorts with S =si | 1 ≤ i ≤ k, a k-sorted logic consists of• A countable set Vi of variables of sort si for each si ∈ S.• F: a countably infinite collection of function symbols;f, g, h, . . . ∈ F.• A: a countably infinite collection of atomic predicate symbols;p, q, r, . . . ∈ A and•Quantifier symbols ∀i and ∃i for each sort si ∈ S.




Many-Sorted SignaturesDefinition 39.2 Given a finite nonempty set S of sorts with S =si | 1 ≤ i ≤ k, a S-sorted signature Σ consists of a set ofstrings of the form•

f : si1, si2, . . . , sim −→ si0,m ≥ 0

for each f ∈ F,•

p : si1, si2, . . . , sin, n ≥ 0

such that there is at most one string for each f ∈ F and eachp ∈ A.• A binary equality relation =i : si

2 for some of the sorts si ∈ S.




Many-Sorted Signature: TermsDefinition 39.3 Given a S-sorted signature Σ, the set T(Σ) =⊎

1≤i≤k Ti(Σ) of Σ-terms is the disjoint union of the sets of termsTi(Σ) defined inductively such that every term is assigned a sortfrom S.

s, t, u ::= xi ∈ Vi | f (t1, . . . , tm)

where f : si1, si2, . . . , sim −→ si0 ∈ Σ,• each variable xi ∈ Vi is a term in Ti(Σ), a fact usually denotedxi : si,• f (t1, . . . , tm) ∈ Ti0(Σ), a fact usually denoted f (t1, . . . , tm) :si0.




Many-Sorted Predicate Logic• The syntax of the logic is the obvious extension to the one

defined as before.• The semantics requires a S-sorted structure with a non-

empty domain Ai for each sort.• The semantics is then a minor extension of the one for the

1-sorted case




Reductions• Second-Order logic may simply be considered a 2-sorted

logic with predicates parameterised over both individuals andsets of individuals.• An S-sorted Predicate Logic may be reduced to a 1-sorted

Predicate logic by introducing a fresh set of unary-predicatesis si (one for each sort si) to denote membership in a sortand• replace every quantified formula of the form ∀ixi[φ] by the

1-sorted formula ∀x[is si(x)→ φ] and recursively for eachquantifier,• replace every quantified formula of the form ∃ixi[φ] by the 1-

sorted formula ∃x[is si(x) ∧ φ] and recursively for each quan-tifier,










40. References

References[1] I. M. Copi. Symbolic Logic. Macmillan, London, UK, 1979.

[2] H. D. Ebbinghaus, J. Flum, and W. Thomas. Mathematical Logic. Springer-Verlag, New York, USA, 1994.

[3] H. B. Enderton. A Mathematical Introduction to Logic. Elsevier India, New Delhi, India, 2001.

[4] M. Fitting. First-Order Logic and Automated Theorem Proving. Springer-Verlag, New York, USA, 1990.

[5] M. Huth and M. Ryan. Logic in Computer Science: Modelling and Reasoning about Systems. Cambridge University Press,Cambridge, UK, 2000.

[6] John Kelly. The Essence of Logic. Prentice-Hall India, New Delhi, India, 1997.

[7] S. C. Kleene. Mathematical Logic. Dover Publications Inc., New York, USA, 1967.

[8] E. Mendelson. Introduction to Mathematical Logic. D. Van Nostrand Co. Inc., Princeton, New Jersey, USA, 1963.

[9] Anil Nerode and R. Shore. Logic for Applications. Springer-Verlag, New York, USA, 1993.

[10] R. M. Smullyan. First-Order Logic. Springer-Verlag, Berlin, Germany, 1968.

[11] V. Sperschneider and G. Antoniou. Logic: A Foundation for Computer Science. Addison-Wesley Publishing Company, Reading,UK, 1991.




Thank You!Any Questions?


Logic for Computer Sciencesak/courses/ilcs/ilcs.pp4.pdf · 2015-07-28 · 4.Objectivity in Logic 5.Formal Logic 6.Formal Logic: Applications 7.Form and Content 8.Facets of Mathematical

Documents