Collected Lecture Slides for COMP 202
Formal Methods of Computer Science

Neil Leslie and Ray Nickson
11 October, 2005

This document was created by concatenating the lecture slides into a single sequence: one chapter per topic, and one section per titled slide. There is no additional material apart from what was distributed in lectures.
Neil Leslie and Ray Nickson assert their moral right to be identified as the authors of this work.
COMP202 introduces a selection of topics, focusing on the use of formal notations and formal models in the specification, design and analysis of programs, languages, and machines.
The focus is on language: syntax, semantics, and translation.
Covers fundamental aspects of Computer Science which have many important applications and are essential to advanced study in computer science.
Lectures
Monday, Wednesday and Thursday, 4.10-5pm in Hunter 323.
Tutorials
One hour per week.
Text book
“Introduction to Computer Theory” (2nd Edition), by Daniel Cohen, published by Wiley in 1997 (available from the University bookshop, approx. cost $135).
1.3 Assessment
Problem sets (10%)
The basis for tutorial discussion, and containing questions to write up for marking.
Programming projects (20%)
Due in weeks 6 and 11.
Test (15%)
Two hours, on Thursday 1 September. Exact time to be confirmed. Covers material from the first half of the course.
Exam (55%)
Three hours.
To pass COMP 202 you must achieve at least 40% in the final exam, and gain a total mark of at least 50%.
1.4 Tutorial and Marking Groups
Tutorials start NEXT WEEK (on Tuesday 12 July). There will be five groups for tutorials and marking; we expect all tutorials to be held in Cotton 245 (subject to confirmation).
Group  Time
1      Tuesday 12-12.50pm
2      Tuesday 2.10-3pm
3      Thursday 2.10-3pm
4      Friday 11-11.50am
5      Friday 12-12.50pm  DELETED
Please sign up for a group on the sheets posted outside Cotton 245. You need to sign up for a group even if you don’t intend to attend tutorials, as these will also be your marking groups.
1.5 What is Computer Science?
Computer Science involves (amongst other things):
• Describing complex systems, structures and processes.
• Reasoning about such descriptions, to establish desired properties.
• Animating/executing descriptions to obtain resulting behaviour.
• Transforming one description into another.
These are common threads in much of Computer Science.
Understanding them will form the main aim of this course.
1.6 Some Powerful Ideas
Computer Science has produced several powerful ideas to address these kinds of problems.
Languages and notations: programming languages, command languages, data definition languages, class diagrams, . . .
Mathematical models: graphs to model networks, trees to model program structures, . . .
COMP202 concentrates on the idea of language.
We will study techniques for describing and reasoning about different languages, from simple to complicated.
We will keep in mind the idea that understanding language and performing computation are closely related activities.
1.7 Related Areas
The course material is drawn mainly from, and used in:
• Programming language design, definition and implementation.
• Software specification, construction and verification.
• Formal languages and automata theory.
It also draws on mathematics (especially algebra and discrete maths), logic, and linguistics.
It has applications in areas such as Problem solving, User interface design, Networking protocols, Databases, Programming language design, and indeed in most computer applications.
1.8 Lecture schedule
0. Problems and programs [Weeks 1–2]
Describing problems, Describing algorithms and programs, Proving properties of algorithms and programs.
1. Regular Languages [Weeks 3–6]
Defining and recognising finite languages, Properties of strings and languages, Regular expressions, Finite automata, Properties of regular languages.
Text book: Part I.
2. Context Free Languages [Weeks 7–10]
Context free grammars, Push down automata, Parsing.
Text book: Part II.
3. Computability Theory [Weeks 11–12]
Recursively enumerable languages, Turing Machines, Computable functions.
Text book: Part III.
1.9 Conclusion
• COMP 202 will look at various techniques for defining, understanding,and processing languages.
• It will be a mixture of theory and practice.
• Mastery of concepts and techniques will be assessed by tests and exams; problem sets (tutorials and assignments) will test your ability to explore more deeply what you have learned; and projects will give you the opportunity to apply knowledge in practice.
• The Course Requirements document provides definitive information, and has links to various rules and policies about which you should be aware. READ IT!
Chapter 2
Problems and Algorithms
2.1 Problems and Algorithms
Recall that our focus in this course will be on languages.
Before we start studying how to precisely define languages, let us agree on some language and notation that we will use to talk about those definitions (a metalanguage).
In particular, let us describe the (semiformal) languages that we will use to:
• define problems
• express algorithms
• prove properties.
2.2 Problems, Programs and Proofs
• How can we specify a problem (independently of its solution)?
• How shall we describe a solution to a problem?
  – program
  – machine
• What does it mean to say that a given program/machine “solves” a givenproblem?
• How can we convince ourselves (and others) that a given program/machine“solves” a given problem?
2.3 Describing problems
How can we describe a “problem” for which we wish to write a computer program?
E.g. P1: add two natural numbers.
P2: sort a list of integers into ascending order.
P3: find the position of an integer x in a list l.
P4: what is the shortest route from Wellington to Auckland?
P5: is the text in file f a valid Java program?
P6: translate a C++ program into machine code.
In each case, we describe a mapping from inputs to outputs.
+----------+
| |
Input ---->| P |----> Output
| |
+----------+
To be more precise, we need to specify:
1. How many inputs, and what kinds of values they can have.
2. How many outputs, and what kinds of values they can have.
3. Which of the possible inputs are actually allowed.
4. What output is required/acceptable for each allowable input.
1,2. To define number and kinds of inputs and outputs, we need to define types.
Need “basic” types, e.g. integer, character, Boolean.
Combine these to give tuples, sequences/lists, sets, ...
Use mathematical types, rather than arrays & records.
Input and output domains = signature.
3. To define what inputs are allowed, specify a subset of the input domain by placing a precondition on the input values.
4. To define the required/acceptable output, specify a mapping from inputs to outputs.
Maybe a function; in general, a relation.
It is formalized by giving a postcondition linking inputs and outputs.
2.4 Example: Comparing Strings
Consider the following simple problem:
Determine whether two given character strings are identical.
If you need such an operation in a program you’re writing, you might:
• Use a built-in operation (e.g. s.equal).
• Look for existing code, or a published algorithm.
• Design and code an algorithm from scratch.
Before doing any of these, you should make sure you know exactly what problem you’re trying to solve, by defining the signature, precondition, and postcondition.

2.5 Understanding the Problem
Define the interface via a signature:
input: Strings s and t
output: Boolean r
or
Equal : String × String → Bool
Formalise the output constraint:
r ≡ s = t
where s = t is defined by |s| = |t| and (∀i ∈ 0 .. |s| − 1) s[i] = t[i]
Notation:
• s[i] is the ith element of s (starting from 0)
• |s| is the length of s
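The signature and postcondition can be read directly as executable code. A minimal Python sketch (Python and the name equal_spec are ours, not part of the course notation):

```python
def equal_spec(s, t):
    """Direct transcription of the postcondition:
    r holds iff |s| = |t| and s[i] = t[i] for every valid index i."""
    return len(s) == len(t) and all(s[i] == t[i] for i in range(len(s)))

print(equal_spec("abc", "abc"))  # True
print(equal_spec("abc", "abd"))  # False
```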
2.6 Designing an Algorithm
In designing an algorithm to determine whether two strings are identical, we need to consider:
• What style of algorithm to write:
– Iterative:
  while ... do
    ...
– or recursive:
  Equal(s, t) ≜ ... Equal(..., ...) ...
• Which condition to test first:
if |s| = |t| then
  while ... do
    ...
else
  ...

for each index position i in s do
  if s[i] = t[i] then
    ...
  else
    ...
...
• What operations to use in accessing the strings.
– Indexing/length
s[i]  Return ith element of s (starting from 0).
|s|   Return length of s.
– Head/tail
head(s)   Return first element of s.
tail(s)   Return all but first element of s.
empty(s)  Return true iff s is an empty string.
2.7 Defining the Algorithm Iteratively
Algorithm Equal;
input String s, String t;
output Boolean r where r ≡ s = t

r := true;
if |s| = |t| then
  for k from 0 to |s| − 1 do
    if s[k] ≠ t[k] then
      r := false
else
  r := false
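The iterative pseudocode can be transcribed into Python for experimentation (the name equal_iterative is ours):

```python
def equal_iterative(s, t):
    """Transcription of Algorithm Equal: set r to true, then scan
    for a mismatch, but only when the lengths agree."""
    r = True
    if len(s) == len(t):
        # for k from 0 to |s| - 1 do
        for k in range(len(s)):
            if s[k] != t[k]:
                r = False
    else:
        r = False
    return r
```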
2.8 Defining the Algorithm Recursively
Equal(s, t) ≜
  if empty(s) and empty(t) then true
  elsif empty(s) or empty(t) then false
  elsif head(s) ≠ head(t) then false
  else Equal(tail(s), tail(t))
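The recursive definition runs as an ordinary function once head, tail and empty are supplied. A Python sketch (all names are ours):

```python
def head(s):
    return s[0]          # first element of s

def tail(s):
    return s[1:]         # all but the first element of s

def empty(s):
    return len(s) == 0   # true iff s is an empty string

def equal_recursive(s, t):
    if empty(s) and empty(t):
        return True
    elif empty(s) or empty(t):
        return False
    elif head(s) != head(t):
        return False
    else:
        return equal_recursive(tail(s), tail(t))
```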
Chapter 3
Two Simple Programming Languages
3.1 Imperative and Applicative Languages
We will define two simple programming languages: one imperative, the other applicative.
The language of while-programs
• imperative: has assignments and explicit flow of control
• similar in style to Pascal, Ada, C++, Java
• corresponds to the iterative algorithm style
The language of applicative programs
• applicative: concerned with applying functions to arguments and evaluating expressions
• similar in style to Lisp, Scheme, and to functional languages such as Haskell (COMP304).
• corresponds to the recursive algorithm style
3.2 The Language of While-Programs
assignment statements x := e
x is a variable, e is an expression
We (usually) don’t worry about declarations
no-operation statement skip
Does nothing at all: just like ; by itself in C++
sequence S1;S2
Note that ; is a separator (like Pascal, unlike C++)
selection if cond then S1 else S2
Usually no need for {}
Will omit “else skip”
Can chain conditions:
if C1 then S1 elsif C2 then S2 · · · else Sn
iteration while cond do S
procedures: (a bit like static methods in Java)
procedure name(parameters);
begin
  S
end

The heading names the program, and lists its inputs and outputs. Can declare local variables (with types) after the heading: usually we won’t bother. The program can be invoked from other programs simply by naming it: if we have defined
procedure A(in x; out y); begin y := x end
we can then call
A(2, z)
with the same effect as z := 2.
3.3 Comparing strings again
procedure Equal(in s, t; out r);
begin
  if |s| = |t| then
    k := 0; r := true;
    while k < |s| do
      if s[k] ≠ t[k] then
        r := false;
      k := k + 1
  else
    r := false
end
3.4 The Applicative Language
Purely applicative (or functional) languages have no assignable variables.
They are more like mathematical functions.
Programming in this style may seem unfamiliar, but it is much easier to get right than imperative programming.
The basic constructs are function definition and function call. For example:
add(x, y) ≜ x + y

double(x) ≜ add(x, x)
Instead of a selection statement, we have a conditional expression:
if cond then E1 else E2
is an expression whose value is E1 or E2 according to the value of cond. For example:

abs(x) ≜
  if x ≥ 0 then
    x
  else
    −x
Instead of a looping construct, we use recursion.
The definition of a function f can include calls on f itself (mathematically a no-no, but familiar from recursive procedures/methods in imperative programming).
For example:
mul(x, y) ≜
  if x < 0 then
    −mul(abs(x), y)
  elsif x = 0 then
    0
  else
    add(y, mul(x − 1, y))
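These definitions run directly as Python functions. A sketch (absolute stands in for abs, to avoid shadowing Python’s built-in of the same name):

```python
def add(x, y):
    return x + y

def double(x):
    return add(x, x)

def absolute(x):
    # the applicative abs: conditional expression, no assignment
    return x if x >= 0 else -x

def mul(x, y):
    """Multiplication by repeated addition, exactly as defined above."""
    if x < 0:
        return -mul(absolute(x), y)
    elif x == 0:
        return 0
    else:
        return add(y, mul(x - 1, y))
```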
3.5 Those strings again ...
Equal(s, t) ≜
  if |s| = |t| then Equal′(s, t, 0) else false

Equal′(s, t, k) ≜
  if k ≥ |s| then true
  elsif s[k] ≠ t[k] then false
  else Equal′(s, t, k + 1)
3.6 Using head and tail
Equal(s, t) ≜
  if isempty(s) and isempty(t) then true
  elsif isempty(s) or isempty(t) then false
  elsif head(s) ≠ head(t) then false
  else Equal(tail(s), tail(t))
Chapter 4
Reasoning about Programs
4.1 Some Questions about Languages
1. Is program P a valid program in language L?
2. How many valid sentences are there in language L?
3. What sentences are in the intersection of languages L1 and L2?
4. How hard is the decision question for language L?
5. What does program P mean?
6. Does program P terminate when given input x?
7. What output does it produce?
8. Do programs P and Q do the same thing?
9. Is there a program in language L1 that does the same thing as program P in language L2?
10. Are languages L1 and L2 equally expressive?
4.2 Syntax
The first four questions are about syntax.
They concern the form of sentences in a language, not the meanings.
In Parts I and II (weeks 3–10) of the course, we will look at issues of syntax:
• What is a language?
• How can we define the set of all valid sentences?
• How can we (mechanically) decide whether a sentence is valid in a language?
• What relationships are there between different models of languages?
4.3 Semantics
Semantics tells us what a (syntactically valid) program means: what will happen if it is run with any given input.
Defining semantics is harder than defining syntax: we need a suitable (mathematical) model of program execution.
Such a model may be operational, denotational, or axiomatic:
• an operational model defines some machine and a procedure for translating programs into instructions for that machine (Part III of the course)
• a denotational model defines mathematical functions that directly capture the behaviour of the program (COMP 304)
• an axiomatic model provides rules for reasoning about the specifications that a program satisfies.
We will develop a simple axiomatic model for while-programs.
4.4 Comparing Strings One More Time
procedure Equal(in s, t; out r);
begin
0   if |s| = |t| then
1     k := 0;
2     r := true;
3     while k < |s| do
4       if s[k] ≠ t[k] then
5         r := false;
7       k := k + 1
8   else
9     r := false
10  end
How can we convince ourselves that it is correct?
• Testing.
• Mathematical reasoning.
We can reason about different input cases (case analysis) and the corresponding execution sequences:
• If |s| ≠ |t|, the test at line 0 fails, and the program sets r to false (line 9), which is the correct output in this case.
• If |s| = |t|, the test succeeds, and the program sets r to true (line 2), then goes into the loop (lines 3–7).
Now this kind of reasoning breaks down, because we don’t know how many times the program will go round the loop. We have to reason in a way that is independent of the number of iterations performed.
This means we need induction. We need to identify some property that:
• holds every time execution reaches the top of the loop (line 3)
• guarantees the required result when the loop exits (after line 7).
When execution reaches the top of the loop, we know:
• |s| = |t|
• 0 ≤ k ≤ |s|
• r = true iff s[0 .. k − 1] = t[0 .. k − 1]
i.e. r ≡ (∀i ∈ 0 .. k − 1)s[i] = t[i]
This is called a loop invariant. We need to show that:
1. The loop invariant holds on entry to the loop.
2. If the loop invariant holds at the top of the loop, and the loop body is executed, it will hold again at the top of the loop.
3. If the loop invariant holds at the top of the loop, and the loop exits, the required property (the postcondition) holds afterwards.
Together, these constitute a proof by induction that the loop does what is required.
1. Invariant holds on entry:
• |s| = |t| by the if condition.
• 0 ≤ k ≤ |s|, since k = 0 (line 1) and 0 ≤ |s| by definition.
• k = 0 means that (∀i ∈ 0 .. k − 1) s[i] = t[i] is trivially true; and r = true.
2. Invariant is maintained by the loop:
• |s| = |t| is unchanged, because s and t don’t change.
• 0 ≤ k ≤ |s| is true initially (induction hypothesis), and k < |s| (loop condition), so 0 ≤ k ≤ |s| − 1.
• r ≡ s[0 .. k − 1] = t[0 .. k − 1] initially (induction hypothesis).
Now, if s[k] = t[k] (line 4), r remains unchanged, and s[0..k] = t[0..k] iff s[0..k − 1] = t[0..k − 1].
On the other hand, if s[k] ≠ t[k], r becomes false (line 5), and so does s[0..k] = t[0..k].
In either case, 0 ≤ k ≤ |s| − 1 ∧ r ≡ s[0..k] = t[0..k].
Now, the next iteration of the loop will increment k, so whatever is true of k now will be true of k − 1 at the start of the next iteration.
So the loop invariant holds again, as required.
3. Postcondition holds at loop exit:
When
  |s| = |t|,
  0 ≤ k ≤ |s|,
  r ≡ (s[0 .. k − 1] = t[0 .. k − 1]), and
  ¬(k < |s|)
all hold, k = |s|, so r ≡ (s[0 .. |s| − 1] = t[0 .. |s| − 1]).
But s[0 .. |s| − 1] = s and t[0 .. |s| − 1] = t (remember |s| = |t|), hence r ≡ s = t.
Thus, the program will set r to true if s = t, and to false if s ≠ t.
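The three proof obligations can also be spot-checked mechanically, by asserting the invariant at the top of every iteration. A hypothetical Python harness (assertions test the invariant on particular runs; they do not replace the proof):

```python
def equal_checked(s, t):
    """The while-program, with the loop invariant asserted
    at the top of each iteration and the postcondition at exit."""
    if len(s) == len(t):
        k = 0
        r = True
        while k < len(s):
            # loop invariant: |s| = |t|, 0 <= k <= |s|,
            # and r iff s[0..k-1] = t[0..k-1]
            assert len(s) == len(t)
            assert 0 <= k <= len(s)
            assert r == (s[:k] == t[:k])
            if s[k] != t[k]:
                r = False
            k = k + 1
        # at exit k = |s|, so r iff s = t
        assert r == (s == t)
    else:
        r = False
    return r
```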
4.5 Laws for Program Verification
In verifying the above program, we used several kinds of knowledge and reasoning:
• Standard laws of mathematics/logic.
• Properties of operations used in the specification and program.
E.g. indexing, string length, string equality.
• Laws for reasoning about program execution: assignments and control structures.
Let us now make our use of the laws about execution a little more precise.
4.6 Reasoning with Assertions
Our reasoning was based on intuition about what must be true at particular points in the execution.
For example, before line 3 we knew that |s| = |t|, because we were inside the then part of the if statement at line 0. We also knew that k = 0 and r = true because of lines 1 and 2.
We can formalize that intuition by annotating the lines of the program with assertions: logical statements that are known to be true immediately before the line is executed.
Verification then proceeds by showing that assertions are indeed satisfied.
procedure Equal(in s, t; out r);
begin
{I0}  if |s| = |t| then
{I1}    k := 0;
{I2}    r := true;
{I3}    while k < |s| do
{I4}      if s[k] ≠ t[k] then
{I5}        r := false
{I6}      else skip;
{I7}      k := k + 1
{I8}  else
{I9}    r := false
{I10} end
I0 is just the precondition: in this case, I0 ≜ true.
I1 is true when we are inside the then part of the if. A law for if tells us I1 = I0 ∧ |s| = |t|; that is, I1 ≜ |s| = |t|.

Law for if:  {P} if C then {P ∧ C} S1 else {P ∧ ¬C} S2
I3 is the loop invariant. There is no easy way to find a loop invariant: the only way is to understand how the loop works. In this case, we know that 0 ≤ k ≤ |s|, and that the purpose of the loop is to decide whether s = t “up to but not including” element k. We also know that |s| = |t|, from I1. Hence:
I3 ≜ 0 ≤ k ≤ |s| ∧ (r ≡ s[0 .. k − 1] = t[0 .. k − 1]) ∧ |s| = |t|.
We must show that the loop invariant holds initially; that is, {I1} k := 0; {I2} r := true {I3} is correctly annotated.
This requires reasoning about a sequence (;) of assignment statements (:=).

Law for ;:  from {P} S1 {Q} and {Q} S2 {R}, infer {P} S1; S2 {R}

Our S1 is k := 0 and our S2 is r := true. Our P, Q, and R are respectively I1, I2, and I3. We must find I2 such that {I1} k := 0 {I2} and {I2} r := true {I3}.

Law for :=  {Q[e/x]} x := e {Q}
For {I2} r := true {I3}, we need I2 to be I3[true/r]: that is,
I2 ≜ 0 ≤ k ≤ |s| ∧ (true ≡ s[0 .. k − 1] = t[0 .. k − 1]) ∧ |s| = |t|.
Now, {I1} k := 0 {I2} follows as long as I1 is I2[0/k]: indeed, 0 ≤ 0 ≤ |s| ∧ (true ≡ s[0 .. 0 − 1] = t[0 .. 0 − 1]) ∧ |s| = |t| is the same as I1 by simplification.
Immediately inside the loop, the loop invariant (I3) still holds, and k ≤ |s| − 1, by the loop condition. (We are using a Law for while here, but let’s skip the formality.) Hence:
I4 ≜ (r ≡ s[0 .. k − 1] = t[0 .. k − 1]) ∧ |s| = |t| ∧ k ≤ |s| − 1.
The if law tells us that s[k] ≠ t[k] at I5, so:
I5 ≜ I4 ∧ s[k] ≠ t[k].
The assignment law justifies adding r = false (check it!):
I6 ≜ I5 ∧ r = false.
Before line 7, either the then part was taken, in which case I6 holds; or, it was not taken, in which case I4 still holds (skip law). So:
I7 ≜ (I6) ∨ (s[k] = t[k] ∧ I4).
After line 7, we require I3 to hold again, to maintain the loop invariant. Hence, we must prove: {I7} k := k + 1 {I3}. The assignment law will justify this (check it).
After the loop exits, we know that the loop invariant I3 still holds, and that k = |s|. Thus:
I8 ≜ 0 ≤ |s| ≤ |s| ∧ (r ≡ s[0..|s| − 1] = t[0..|s| − 1]) ∧ |s| = |t|.
In the else branch of the main if, we get:
I9 ≜ |s| ≠ |t|.
In that case, we assign r := false.
Finally, at end the two branches of the outer if come together:
I10 ≜ (I8) ∨ (I9 ∧ r = false),
which implies r ≡ s = t (exercise).
Part I
Regular Languages
Chapter 5
Preliminaries
5.1 Part I: Regular Languages
In this part of COMP 202 we will begin to look at formal languages, and at how they can be defined.
In this lecture we will introduce some of the basic notions required. In subsequent lectures we will look at:
• defining languages using regular expressions
• defining languages using finite automata
We will also explore the relationship between these ways of defining languages.
5.2 Formal languages
When we say ‘formal’ we mean that we are concerned simply with the form (or the syntax) of the language.
We are not concerned with the meaning (or semantics) of the symbols.
The theory of syntax is very much more straightforward and better understood than the theory of semantics.
We are not concerned with trying to understand ‘natural’ languages like English or Maori, nor even with the question of whether the tools we develop to deal with formal languages are appropriate for dealing with natural languages.
5.3 Alphabet
Definition 1 (Alphabet) An alphabet is a finite set of symbols.
In the textbook, Cohen separates the elements of a set with space rather than a comma.
The symbols themselves are meaningless, so we may as well pick a convenient alphabet: Cohen uses {a b} in his examples.
Conventionally we use Σ and (occasionally) Γ as names for alphabets.
5.4 String
Definition 2 (String) A string over an alphabet is a finite sequence of symbols from that alphabet.

Example: Some strings
If we have an alphabet Σ = {✢, ❦, ➸, ❐, ❇} then
1. ✢
2. ❦➸
3. ❦➸❦❇➸❐➸
4.
5. ❦✢➸✢❦✢❇➸❐➸
6. ❐❦✢
are strings over Σ.
String number 4 is the empty string. We need a better notation for the empty string than simply not writing anything, so we use a meta-symbol for it.
Different authors use different meta-symbols, the most common ones being ε, λ, and Λ. Cohen chooses Λ, so we might as well, too.
Note that Λ is a string over any alphabet, but Λ is not a symbol in the alphabet.
5.5 Language, Word
Definition 3 (Language) A language is a set of strings.
Definition 4 (Word) A word is a string in a language.
Much of the rest of this course is devoted to investigating the related problems of:
• how we can define languages, and
• how we can show whether or not a given string is a word in a language.
5.6 Operations on Strings and Languages
Definition 5 (Concatenation) If s and t are strings then the concatenation of s and t, written st, is a string.

st consists of the symbols of s followed immediately by the symbols of t.
We freely allow ourselves to concatenate single symbols onto strings.
We can concatenate two languages. If L1 and L2 are languages then:
L1L2 = {st | s ∈ L1, t ∈ L2}
Definition 6 (Length of a string) The length of a string is the number of occurrences of symbols in it.

If s is a string we write |s| for the length of s.
We can give a recursive algorithm for | |:
• |Λ| = 0
• if x is a single symbol and y a string, then |xy| = 1 + |y|
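The two clauses translate directly into a recursive function. A Python sketch (the name length stands in for | |; "" represents Λ):

```python
def length(s):
    """Recursive length of a string, following the definition of | |."""
    # |Λ| = 0
    if s == "":
        return 0
    # |xy| = 1 + |y|, where x is the first symbol and y the rest
    return 1 + length(s[1:])
```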
Definition 7 (Kleene closure (Kleene *)) If S is a set then S∗ is the set consisting of all the sequences of elements of S.

Note: Λ is in S∗.
We can define S∗ inductively:
• Λ ∈ S∗
• if x ∈ S and y ∈ S∗ then xy ∈ S∗
If Σ is an alphabet then Σ∗ is a language over Σ.
If L is a language then L∗ is a language.
Example: Kleene closure
• if Σ = {1, 0} then Σ∗ = {Λ, 1, 0, 11, 10, 01, 00, . . .}
• if L = {❦❦, ❦➸} then L∗ = {Λ, ❦❦, ❦➸, ❦❦❦❦, ❦❦❦➸, ❦➸❦❦, ❦➸❦➸, . . . }
Question: what is (Σ∗)∗?
Definition 8 (Kleene +) If S is a set then S+ is the set consisting of all the non-empty sequences of elements of S.

Note: Λ is not in S+, unless it was in S.
We can define S+ inductively:
• if x ∈ S then x ∈ S+
• if x ∈ S and y ∈ S+ then xy ∈ S+
We can also define S+ as SS∗.
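Since S∗ is infinite whenever S is non-empty, a program can only enumerate a finite slice of it. A Python sketch (the bound max_parts, the number of elements concatenated, is our own device, not part of the definition; "" represents Λ):

```python
from itertools import product

def kleene_star(S, max_parts):
    """Concatenations of at most max_parts elements of S:
    a finite slice of S*. Works when S is an alphabet (single
    symbols) or a language (whole strings)."""
    return {"".join(p)
            for n in range(max_parts + 1)
            for p in product(S, repeat=n)}

def kleene_plus(S, max_parts):
    # S+ = S* without Λ (when Λ is not itself in S)
    return kleene_star(S, max_parts) - {""}
```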
5.7 A proof
Throughout this course we will perform proofs by induction, so we begin here with a simple one.
When we say that a string is a sequence of symbols over an alphabet Σ we mean that a string is either:
• the empty string Λ, or
• a symbol from Σ followed by a string over Σ
So, if we want to prove properties of strings over some alphabet Σ we can proceed as follows:
• prove the property holds of the empty string Λ, and
• prove the property holds of a string xy where x is an arbitrary element of Σ and y an arbitrary string over Σ, assuming the property holds of y.
More formally, in order to show P (s) for an arbitrary string s over an alphabet Σ, show:
Base case P (Λ)
Induction step P (xy), given P (y), where x is an arbitrary element of Σ, y an arbitrary string over Σ
This sort of pattern of inductive proof occurs in very many situations in the mathematics required for computer science.
5.8 Statement of the conjecture
The conjecture that we will prove by induction is:
Conjecture 1 |st| = |s| + |t|
In English: “the length of s concatenated with t is the sum of the lengths of s and t”.
We proceed by induction on s, where P (s) is |st| = |s| + |t|.
5.9 Base case
P (Λ) is |Λt| = |Λ| + |t|.
We show that the LHS and the RHS of this equality are just the same.
On the LHS we use the fact that Λt = t, to give us |t|.
On the RHS we observe that |Λ| = 0 and 0 + n = n to give us |t|.
Thus LHS = RHS, and the base case is established.
5.10 Induction step
P (xy) is |(xy)t| = |xy| + |t|.
We must show |(xy)t| = |xy| + |t|, given |yt| = |y| + |t|.
On the LHS we use the associativity of concatenation to observe that (xy)t = x(yt). Next we use the definition of | | to see that the LHS is 1 + |yt|.
The RHS is |xy| + |t|. We use the definition of | | to see that the RHS is 1 + |y| + |t|. Now we use the induction hypothesis |yt| = |y| + |t| to show that the RHS is 1 + |yt|.
So we have shown that LHS = RHS in the induction step.
5.11 A theorem
We have established both the base case and the induction step, so we have completed the proof.
Now our conjecture can be upgraded to a theorem:
Theorem 1 |st| = |s| + |t|
Chapter 6
Defining languages using regular expressions
6.1 Regular expressions
In this lecture we will see how we can define a language using regular expressions.
The first step is to give an inductive definition of what a regular expression is.
Before we define regular expressions we will give a similar definition for expressions of arithmetic, to illustrate the method used.
6.2 Arithmetic expressions
We will now give a formal inductive description of the expressions we have in arithmetic.
• if n ∈ N then n is an arithmetic expression;
• if e1 and e2 are expressions then so are
– −(e1)
– (e1 + e2)
– (e1 − e2)
– (e1 ∗ e2)
– (e1/e2)
– (e1^e2) (we could have chosen to use a symbol, such as ^, rather than just use layout)
This definition forces us to include lots of brackets:
• ((1+2)+3)
• (1+(2+3))
are expressions, but
• 1+(2+3)
• 1+2+3
are not.
6.3 Simplifying conventions
We do not need to use all these brackets if we adopt some simplifying conventions.
First we allow ourselves to drop the outermost brackets, so we can write:
• (1+2)+3
rather than:
• ((1+2)+3)
We also adopt some conventions about the precedence and associativity of the operators.
6.4 Operator precedence
• + and − are of lowest precedence
• ∗ and / are next
• ^ is highest
So we can write:
• 2 + 3 ∗ 4^5
rather than:
• 2 + (3 ∗ (4^5))
6.5 Operator associativity
We also adopt a convention that all the operators are left associative. This means that if we have a sequence of operators of the same precedence we fill in the brackets to the left.
So we can write:
• 1 + 2 + 3
rather than:
• (1 + 2) + 3
We can have right associative operators, or non-associative operators.
6.6 Care!
Be careful not to confuse the two statements:
1. # is a left associative operator
2. # is an associative operation
1 means that we can write x#y#z instead of (x#y)#z.
2 means that (∀xyz). (x#y)#z = x#(y#z).
In the examples above + and / are both left associative. However, addition is associative, but division is not:
60/6/2 = 5
60/(6/2) = 20
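The distinction can be checked directly in any language whose / is left associative; a small Python sketch:

```python
# Division is left associative, so 60/6/2 is parsed as (60/6)/2:
print(60 / 6 / 2)    # 5.0
print(60 / (6 / 2))  # 20.0

# Addition is associative, so the grouping does not matter:
print((1 + 2) + 3 == 1 + (2 + 3))  # True
```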
6.7 Defining regular expressions
If Σ is an alphabet then:
• the empty language ∅ and the empty string Λ are regular expressions
• if r ∈ Σ then r is a regular expression
• if r is a regular expression then so is (r∗)
• if r and s are regular expressions then so are
– (r + s)
– (rs)
6.8 Simplifying conventions
We allow ourselves to dispense with outermost brackets.
∗ has the highest precedence and + the lowest.
We take all operators to be left-associative, so we write:
• r + s + t instead of (r + s) + t
• rst instead of (rs)t
6.9 What the regular expressions describe
So far we have said what the form of a regular expression is, without explaining what it describes. This is as if we had explained the form of the arithmetic expressions without explaining what the operations of addition, subtraction and so on were.
The regular expressions describe sets of strings over Σ, that is languages, so we must explain what languages the regular expressions describe.
We write Language(r) for the language described by regular expression r.
When we are being sloppy we use r for both the regular expression and the language it describes.
Language(∅)
The language described by ∅ is the empty language {}.

Language(Λ)
The language described by Λ is {Λ}.
Note {} ≠ {Λ}.

Language(r), r ∈ Σ
The language described by r, r ∈ Σ, is {r}.
Example:If Σ = {0, 1} then:
Language(0) = {0}
Language(1) = {1}
Language(r∗)
The language described by r∗ is the Kleene closure of the language described by r.
Example:If Σ = {0, 1} then:
Language(0∗) = {Λ, 0, 00, 000, . . .}
Language(1∗) = {Λ, 1, 11, 111, . . .}
Language(r + s)
The language described by r + s is the union of the languages described by r and s.
Chapter 7
Regular languages
7.1 Regular Languages
In the last lecture we introduced the regular expressions and the operations that the regular expressions correspond to.
The regular expressions allow us to describe languages. A language which can be described by a regular expression is called a regular language.
If Σ is an alphabet and r and s are regular expressions, then:
• Language(∅) is ∅
• Language(Λ) is {Λ}
• Language(r), r ∈ Σ is {r}
• Language(r∗) is (Language(r))∗
• Language(r + s) is Language(r) ∪ Language(s)
• Language(rs) is Language(r)Language(s)
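The six clauses can be read as a recursive evaluator. A Python sketch (the tuple encoding of regular expressions and the length bound max_len are our own devices; the bound is needed because Language(r∗) is infinite in general):

```python
def language(r, max_len):
    """Language(r), restricted to strings of length <= max_len.
    r is None (∅), "" (Λ), a single symbol, or a tuple:
    ("+", r, s), ("cat", r, s), ("*", r)."""
    if r is None:                       # Language(∅) = {}
        return set()
    if isinstance(r, str):              # Language(Λ) = {Λ}; Language(a) = {a}
        return {r} if len(r) <= max_len else set()
    op = r[0]
    if op == "+":                       # union
        return language(r[1], max_len) | language(r[2], max_len)
    if op == "cat":                     # concatenation of languages
        return {s + t
                for s in language(r[1], max_len)
                for t in language(r[2], max_len)
                if len(s + t) <= max_len}
    if op == "*":                       # Kleene closure, as a fixpoint
        base = language(r[1], max_len)
        result = {""}                   # Λ is always in the closure
        while True:
            bigger = result | {s + t for s in result for t in base
                               if len(s + t) <= max_len}
            if bigger == result:
                return result
            result = bigger
```

For example, language(("*", ("+", "0", "1")), 2) yields all strings over {0, 1} of length at most 2.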
7.2 Example languages
In this lecture we will look at the languages which we can describe with regular expressions. We will usually restrict attention to Σ = {0, 1}.
Example: Language(0 + 1) = {0, 1}.
{00, 01, 10, 11} is Language(00 + 01 + 10 + 11).
Notice that {00, 01, 10, 11} is also Language((1 + 0)(1 + 0)), but to show that the language is regular we only need to provide one regular expression describing it.
7.4 An algorithm
The proof that every finite language is regular provides us with an algorithm which takes a finite language and returns a regular expression which describes the language.
Many of the proofs that we see in this course have an algorithmic character. Such algorithmic proofs are often called constructive, and provide the basis for a program.
Conversely every program is a constructive proof (even if we are usually not interested in what it is a proof of).
7.5 EVEN-EVEN
We now describe a language which Cohen introduces, which he calls EVEN-EVEN.

EVEN-EVEN = Language((00 + 11 + (01 + 10)(00 + 11)∗(01 + 10))∗)

It may not be immediately obvious what language this is (or rather, whether there is a simple description of this language). However every word of EVEN-EVEN contains an even number of 0’s and an even number of 1’s.
Furthermore EVEN-EVEN contains every string with an even number of 0’s and an even number of 1’s.
7.6 Deciding membership of EVEN-EVEN
Suppose we are given the task of writing a program to decide whether a given string is in EVEN-EVEN. How would we go about this task?

One method would be to use two counters, n0 and n1, and to go through the string counting all the 0’s and all the 1’s. If n0 and n1 are both even at the end of the string then the string is in EVEN-EVEN.
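The two-counter method just described can be sketched directly; this is our own illustration, not code from the slides.

```python
def in_even_even_counters(s):
    # Count all the 0's and all the 1's, then check both counts are even.
    n0 = n1 = 0
    for c in s:
        if c == "0":
            n0 += 1
        else:
            n1 += 1
    return n0 % 2 == 0 and n1 % 2 == 0

print(in_even_even_counters("0110"))   # True
print(in_even_even_counters("011"))    # False
```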
If we call the pair 〈n0, n1〉 a state of the program then our program will need to be able to go through as many states as there are symbols in a string to decide membership.
Is there a program which uses fewer states?
7.7 Using fewer states
We don’t actually care how many 0’s and 1’s there are, so we could use two boolean flags b0 and b1. As we go through the string we flip the appropriate flag as we read each symbol. If both the flags end up in the same state that they started in then the string is in EVEN-EVEN.

If we call the pair 〈b0, b1〉 a state of the program then our program will need to be able to go through at most four distinct states to decide membership.
7.8 Using even fewer states
Suppose, instead of reading the symbols one by one, we read them two by two. Now we need only use one boolean flag, and we do not have to flip it all the time.

If both the symbols we read are the same then we leave the flag alone. If they differ then we flip the flag. Two different symbols means we have just read an odd number of 0’s and an odd number of 1’s. If we had read an even number before we have now read an odd number; if we had read an odd number before we have now read an even number.
If the flag ends up in the same state it began in then the string is in EVEN-EVEN.
The program need go through at most two distinct states to decide membership.
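The two-symbols-at-a-time method can be sketched as follows (our own illustration; the name `in_even_even` is ours). A string of odd length cannot be in EVEN-EVEN, since its two counts cannot both be even.

```python
def in_even_even(s):
    if len(s) % 2 != 0:        # odd length: one count must be odd
        return False
    flag = True                # the single boolean state
    for i in range(0, len(s), 2):
        if s[i] != s[i + 1]:   # a mixed pair: flip the flag
            flag = not flag
    return flag

print(in_even_even("0110"))    # True
print(in_even_even("01"))      # False
```

Each mixed pair contributes one 0 and one 1, so the flag tracks the (shared) parity of the two counts, exactly as the slide argues.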
7.9 Uses of regular expressions
Regular expressions turn out to have practical uses in a variety of places.
We can informally think of a regular expression as describing the most typical string in a language.

The tokens of a programming language can usually be described by regular expressions. For example, a type name might be an uppercase letter followed by a sequence of uppercase letters, lowercase letters or underscores, and so on.

The task of identifying the tokens in a program is called lexical analysis, and is the first step in compiling the program. The UNIX utility lex is a lexical-analyser generator: given regular expressions describing the tokens of a language it generates a program which will perform lexical analysis. Since regular expressions describe typical strings they are the basis for many searching and matching utilities.
The UNIX utility grep allows the user to specify a regular expression to search for in a file. Many text editors provide a facility similar to grep for searching for strings.

Programming languages which are oriented towards text processing usually allow us to describe strings using regular expressions.
Chapter 8
Finite automata
8.1 Finite Automata
We leave regular expressions and introduce finite automata. Finite automata provide us with another way to describe languages.

Note: Cohen uses the term ‘finite automaton’ (‘FA’) where many other authors use the term ‘deterministic finite automaton’ (‘DFA’). After we have introduced deterministic finite automata we will introduce non-deterministic finite automata. In common with everybody else, Cohen calls these ‘NFAs’.
8.2 Formal definition
Definition 9 (Finite automaton) A finite automaton is a 5-tuple (Q, Σ, δ, q0, F) where:
• Q is a finite set: the states
• Σ is a finite set: the alphabet
• δ is a function from Q× Σ to Q: the transition function
• q0 ∈ Q: the start state
• F ⊆ Q: the final or accepting states.
Note: F may be Q, or F may be the empty set.
8.3 Explanation
While this is all very well, it does not help us see what finite automata do, or how.

Suppose we have a string made up of symbols from Σ. We begin in the start state q0. We read the first symbol s from the string, and then enter the state
given by δ(q0, s). We then read the next symbol from the string and use the transition function to move to a new state, repeating this process until we reach the end of the string.
If we have read all the string, and are in one of the accepting states, we say that the automaton accepts the string.
The language accepted by the automaton is the set of all the strings it accepts.
So automata give us a way to describe languages.
8.4 An automaton
Suppose we have an automaton M1 = (Q,Σ, δ, q0, F ), where:
• Q = {S1, S2, S3}
• Σ = {0, 1}
• δ(S1, 1) = S2
• δ(S2, 1) = S3
• q0 = S1
• F = {S3}
Let’s see if M1 accepts 1. We begin in state S1 and we read 1. δ(S1, 1) = S2 so we enter S2. Our string is empty, but S2 is not a final state, so M1 does not accept 1.

Let’s see if M1 accepts 11. We begin in state S1 and we read 1. δ(S1, 1) = S2 so we enter S2. Now we read 1. δ(S2, 1) = S3 so we enter S3. Our string is empty, and S3 is an accepting state so M1 accepts 11.

Let’s see if M1 accepts 0. We begin in state S1 and we read 0. δ is a partial function, and there is no value given for δ(S1, 0). We cannot make any progress here, and so 0 is not in the language accepted by M1. We can think of the machine as ‘crashing’ on this string.
Note: here we differ from Cohen. He insists (initially, at least) that δ be total, and adds a ‘black hole’ state to all his machines, whereas we allow δ to be partial.
Cohen would have a new state S4, and would extend δ with:
• δ(S1, 0) = S4
• δ(S2, 0) = S4
• δ(S4, 0) = S4
• δ(S4, 1) = S4
Since S4 is not an accepting state, and since once we enter it there is no way to leave, this extension does not change the language accepted by the machine.
8.5 The language accepted by M1
A moment’s reflection will show that the only string which M1 accepts is 11, and so the language M1 accepts is Language(11).
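M1 is small enough to simulate directly. The following sketch (ours, not from the slides) stores the partial transition function as a dictionary; a missing entry models the machine ‘crashing’.

```python
delta = {("S1", "1"): "S2", ("S2", "1"): "S3"}
start, accepting = "S1", {"S3"}

def accepts(s):
    state = start
    for symbol in s:
        if (state, symbol) not in delta:
            return False       # crash: no transition defined
        state = delta[(state, symbol)]
    return state in accepting

print(accepts("11"))   # True
print(accepts("1"))    # False
print(accepts("0"))    # False (the machine crashes)
```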
8.6 Another automaton
Suppose we have an automaton M2 = (Q,Σ, δ, q0, F ), where:
• Q = {S1, S2, S3}
• Σ = {0, 1}
• δ(S1, 1) = S2
• δ(S2, 1) = S3
• δ(S3, 0) = S3
• δ(S3, 1) = S3
• q0 = S1
• F = {S3}
M2 is nearly the same as M1, but now we have transitions from S3 to itself on reading either a 0 or a 1.

What language does M2 accept? Clearly the only way to get from S1 to S3 is to begin with two 1’s. If the string is just 11 it will be accepted.

What about 111? This will be accepted too, as δ(S3, 1) = S3. What about 110? This will be accepted too, as δ(S3, 0) = S3.

In fact any extension of 11 will be accepted, so we see that M2 accepts Language(11(0 + 1)∗).
8.7 A pictorial representation of FA
It is often easier to see what an FA does if we draw a picture of it. We can draw a finite automaton out as a labelled, directed graph. Each state of the machine is a node, and the transition function tells us which nodes are connected by which edges. We mark the start state, and any accepting states, in some special way.
M1 can be represented as:
[Diagram: start state 1, an arc labelled 1 to state 2, and an arc labelled 1 to accepting state 3.]
M2 can be represented as:
[Diagram: as for M1, with an additional loop labelled 1, 0 on accepting state 3.]
Note: Sometimes the start state is pointed to by an arrow, and the accepting states are drawn as double circles or squares, e.g.:
[Diagrams: the machine M2 drawn again in these alternative styles, with an incoming arrow marking the start state and the accepting state drawn as a double circle or square.]
8.8 Examples of constructing an automaton
If L is a language over some alphabet Σ, then L̄, the complement of L, is {s | s ∈ Σ∗ and s ∉ L}.
If we have some automaton ML which accepts L, then we can construct an automaton ML̄ which accepts L̄.

ML̄ will accept just the strings which ML does not, and will not accept just the strings which ML does.

To recognise the complement of a language, we can think of ourselves as going through the same steps as we would take to recognise the language, but making the opposite decision at each state.

We then expect ML and ML̄ to have the same states, but an accepting state of ML will not be an accepting state of ML̄ and vice versa.
8.9 Constructing ML̄
M2 from above accepts Language(11(0 + 1)∗). Our first suggestion for a machine M3 to accept the complement of Language(11(0 + 1)∗) is to have:
• the same set of states,
• the same transition function,
• the complement of the set of accepting states of M2.
M3 = (Q,Σ, δ, q0, F ), where:
• Q = {S1, S2, S3}
• Σ = {0, 1}
• δ(S1, 1) = S2
• δ(S2, 1) = S3
• δ(S3, 0) = S3
• δ(S3, 1) = S3
• q0 = S1
• F = {S1, S2}
F(M3) is Q − F(M2), as you would expect. Graphically:
[Diagram: start state 1 (accepting), an arc labelled 1 to accepting state 2, an arc labelled 1 from 2 to state 3, and a loop labelled 1, 0 on state 3.]
This automaton correctly accepts Λ and the string 1. What about the string 0?

The machine crashes, so: close, but no cigar.
The problem is that M2 and M3 both crash on the same strings: M3 should accept the strings that M2 crashes on, and vice versa.

What we have to do is to convert our initial machine into a machine which accepts the same language and whose transition function is total. We can always do this. (How?)

Then we construct a third machine whose set of accepting states is the complement of that of the second machine.
8.10 A machine to accept the complement of Language(11(0 + 1)∗)
M4 = (Q,Σ, δ, q0, F ), where:
• Q = {S1, S2, S3, S4}
• Σ = {0, 1}
• q0 = S1
• F = {S1, S2, S4}
• δ(S1, 0) = S4
• δ(S1, 1) = S2
• δ(S2, 0) = S4
• δ(S2, 1) = S3
• δ(S3, 0) = S3
• δ(S3, 1) = S3
• δ(S4, 0) = S4
• δ(S4, 1) = S4
Graphically:
[Diagram: M4 has start state 1 (accepting), arcs labelled 1 from 1 to accepting state 2 and from 2 to state 3, a loop labelled 1, 0 on state 3, arcs labelled 0 from 1 and from 2 to accepting state 4, and a loop labelled 1, 0 on state 4.]
8.11 Summary
We have outlined a method which allows us to construct a machine which accepts L̄ from a machine which accepts L.

We can think of this construction as a proof of the theorem:

Theorem 3 If a language L can be defined using an FA, then so can the language L̄.
Chapter 9
Non-deterministic finite automata
9.1 Non-deterministic Finite Automata
We now introduce a variant on the finite automaton, the non-deterministic finite automaton (NFA).

The difference between an NFA and an FA is that, in an NFA, more than one arc leading from a state may be labelled by the same symbol.
Formally, the transition function of an NFA takes a state and a symbol andreturns a set of states.
A string can now label more than one path through the automaton.
The string itself does not determine which state we will end up in when we read it: there is some non-determinism built into the machine.
9.2 Preliminaries
Powerset: If A is a set then 2^A is the powerset of A, i.e. the set of all subsets of A.

{} ∈ 2^A

A ∈ 2^A

Partial and total functions: A partial function is not defined for some values of its domain. If f is a partial function and f(x) is defined, we write f(x) ↓.
9.3 Formal definition
Definition 10 (Non-deterministic finite automaton) A non-deterministic finite automaton is a 5-tuple (Q, Σ, δ, q0, F) where:
• Q is a finite set: the states
• Σ is a finite set: the alphabet
• δ is a function from Q × Σ to 2^Q: the transition function
• q0 ∈ Q: the start state
• F ⊆ Q: the final or accepting states.
Note: δ is always total.
9.4 Just like before. . .
An NFA accepts a string if there is a path from the start state to an accepting state labelled by the string.

The set of strings accepted by an NFA is the language it accepts.
9.5 Example NFA
M5 = (Q,Σ, δ, q0, F ), where
• Q = {S1, S2, S3, S4}
• Σ = {0, 1}
• q0 = S1
• F = {S2, S3}
• δ(S1, 0) = {S2, S3}
• δ(S1, 1) = {}
• δ(S2, 0) = {S4}
• δ(S2, 1) = {S4}
• δ(S3, 0) = {}
• δ(S3, 1) = {S3}
• δ(S4, 0) = {S2}
• δ(S4, 1) = {S2}

For clarity and conciseness, we sometimes choose to show δ as a table:

δ  | 0        | 1
S1 | {S2, S3} | {}
S2 | {S4}     | {S4}
S3 | {}       | {S3}
S4 | {S2}     | {S2}
And, of course we can give a pictorial representation. M5 can be drawn:
[Diagram: start state 1 has two arcs labelled 0, to accepting states 2 and 3; state 3 has a loop labelled 1; states 2 and 4 are joined by arcs labelled 0, 1 in both directions.]
9.6 M5 accepts 010
Now let’s see whether some strings are in the language defined by M5. We will try to find a path from the start state to an accepting state.
We begin with 010.
We start in S1, and the first symbol in the string is 0. Now we have a choice, as there are two transitions out of S1 labelled by 0.

We choose to go to S2. The 1 takes us to S4, and the final 0 brings us back to S2. We have exhausted our string, and we are in an accepting state, so M5 accepts 010.
9.7 M5 accepts 01
Next we try 01.

We start in S1, and the first symbol in the string is 0. Now we have a choice, as there are two transitions out of S1 labelled by 0.

As before we choose to go to S2. The next transition takes us to S4. Our string is exhausted, but we are not in an accepting state.

If this were a deterministic automaton then we would know that the string was not in the language. However this is a non-deterministic machine, and there may be another path from the start state to an accepting state labelled by 01.

We backtrack to the place where we made a choice, and pick the other arc labelled by 0. We see that this string is accepted by M5.
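Instead of backtracking, we can track the whole set of states reachable so far; this sketch (ours, not from the slides) simulates M5 that way, anticipating the breadth-first idea used later for the subset construction.

```python
delta = {("S1", "0"): {"S2", "S3"}, ("S1", "1"): set(),
         ("S2", "0"): {"S4"},       ("S2", "1"): {"S4"},
         ("S3", "0"): set(),        ("S3", "1"): {"S3"},
         ("S4", "0"): {"S2"},       ("S4", "1"): {"S2"}}
start, accepting = "S1", {"S2", "S3"}

def accepts(s):
    current = {start}
    for symbol in s:
        # All states reachable from any current state on this symbol.
        current = set().union(*(delta[(q, symbol)] for q in current))
    return bool(current & accepting)

print(accepts("010"))   # True
print(accepts("01"))    # True
print(accepts("1"))     # False
```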
9.8 Comments
A little thought shows that M5 accepts Language(01∗ + 0((1 + 0)(1 + 0))∗).
The following FA, M6 = (Q,Σ, δ, q0, F ), accepts the same language:
• Q = {S1, S2, S3, S4, S5, S6}
• Σ = {0, 1}
• q0 = S1
• F = {S3, S5, S6}
δ  | 0  | 1
S1 | S3 | S2
S2 |    |
S3 | S4 | S5
S4 | S6 | S6
S5 | S6 | S3
S6 | S4 | S4

(The row for S2 is empty: δ is partial, and M6 crashes in S2.)
9.9 NFA’s are as powerful as FA’s
Now we move on to prove two theorems about the languages definable by FA’s and NFA’s.

Theorem 4 Every language definable by an FA is definable by an NFA.

The proof of this consists of an algorithm which takes an arbitrary FA and constructs an NFA which accepts the same language.
9.10 Proof
We have an FA, MFA = (Q, Σ, δ, q0, F), and we will construct an NFA MNFA = (Q′, Σ′, δ′, q′0, F′) which accepts the same language.

Clearly Σ′ = Σ is the only sensible choice. It also seems reasonable to keep the same structure for the machine and take:
• Q′ = Q
• q′0 = q0
• F ′ = F
The only real difference is in the two transition functions:
• one may be partial, the other is total;
• the range of one is states, the range of the other is sets of states.
The solution is simple:
δ′(S, σ) = if δ(S, σ) ↓ then {δ(S, σ)} else {}
The definition that we have given, (Q′, Σ′, δ′, q′0, F′), is certainly an NFA. It should also be clear that any string which labels a path from the start state to an accepting state in the FA will do so in the NFA, and only strings which label such a path in the FA will do so in the NFA.

Hence the two languages are the same. Thus we have shown that the descriptive power of NFA’s is at least as strong as that of FA’s.

This should not be a surprise, as NFA’s were introduced as a generalisation of FA’s.
9.11 Using the graphical representation
Given that the graphical representation of an FA is the graphical representation of an NFA which accepts the same language, we could have just drawn the picture and said “Look!”
That is too easy to be a real proof.
9.12 FA’s are as powerful as NFA’s
Theorem 5 Every language definable by an NFA is definable by an FA.

The proof of this consists of an algorithm which takes an arbitrary NFA and constructs an FA which accepts the same language.

This is a more remarkable result, as NFA’s were introduced as a generalisation of FA’s.

Alas we cannot look at a picture and go “Ha!” We are forced to make a careful construction.
The key idea is that the states of the FA we construct will be sets of states of the original NFA.

When we tried to see if M5 would accept strings we used a strategy rather like depth-first search. We pursued a path until we were either successful or stymied, and then retraced our steps to make alternate choices.

Suppose instead we had kept a record of all the possible states we could have reached as we worked our way along the string. This is rather like breadth-first search.

For the string 010 and M5 we could have reached either of the states in {S2, S3} after we had read 0, any state reachable from S2 or S3 on 1 next, and so on.
9.13 Useful observations
• If T is a set of states of some machine then T ∈ 2^Q.

• If Q is finite then so is 2^Q.

• Given some T ∈ 2^Q, the set of all states reachable on some symbol σ ∈ Σ is ⋃{δ(t, σ) | t ∈ T}.

• For each T ∈ 2^Q and σ ∈ Σ there is just one such ⋃{δ(t, σ) | t ∈ T}.

• Because there is just one set of states reachable on a given symbol from a given set of states, we have got determinism back.
We still have to sort out what the start states and accepting states are.

The singleton set of the start state of the NFA is the obvious candidate for the start state for our FA.

A string is accepted by the NFA if there is any path from the start state to an accepting state labelled by that string. Therefore we will take every set which contains an accepting state of the NFA to be an accepting state of the FA we are constructing.

Where are we? We now know:
• what the alphabet will be
• what the states of our FA look like,
• what the start state of our FA will be
• what the accepting states will be
• what the transition function will look like.
We still have to give a full description of the states and the transition function.
9.14 Construction
After all that, we now give a method to construct an FA

MFA = (Q′, Σ′, δ′, q′0, F′)
which accepts the same language as an NFA
MNFA = (Q,Σ, δ, q0, F )
As expected: Σ′ = Σ, and q′0 = {q0}.

We will have Q′ ⊆ 2^Q. We don’t take Q′ = 2^Q, as we need only concern ourselves with the states we can actually reach.
Hence we construct δ′ and Q′ in tandem.
9.15 Constructing δ′ and Q′
Start with q′0 = {q0}, and let δ′(q′0, σ) = δ(q0, σ) for each σ ∈ Σ.

This will generate new states of the FA, and we continue this process, constructing δ′ using δ until no new states are created.
Any state of the FA which contains an accepting state of the NFA is an accepting state of the FA.
Finally we may tidy things up by giving nice names to the states of the FA!
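The construction can be sketched in code (ours, not the slides’), applied here to M5; the δ′ table and reachable states are built in tandem with a worklist.

```python
delta = {("S1", "0"): {"S2", "S3"}, ("S1", "1"): set(),
         ("S2", "0"): {"S4"},       ("S2", "1"): {"S4"},
         ("S3", "0"): set(),        ("S3", "1"): {"S3"},
         ("S4", "0"): {"S2"},       ("S4", "1"): {"S2"}}
alphabet = {"0", "1"}
start, accepting = "S1", {"S2", "S3"}

def subset_construction():
    q0 = frozenset({start})
    states, worklist, new_delta = {q0}, [q0], {}
    while worklist:
        T = worklist.pop()
        for a in alphabet:
            # Union of delta over all states in T (empty union is {}).
            U = frozenset(set().union(*(delta[(q, a)] for q in T)))
            new_delta[(T, a)] = U
            if U not in states:
                states.add(U)
                worklist.append(U)
    finals = {T for T in states if T & accepting}
    return states, new_delta, q0, finals

states, new_delta, q0, finals = subset_construction()
print(len(states))   # 6 reachable subset-states
```

Only six of the sixteen subsets of M5’s states are reachable, which matches the six states of the FA M6 given earlier.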
We have shown that, although NFA’s look like an extension of FA’s, they can accept exactly the same languages.
Now we will introduce an extension of NFA’s, NFA with Λ transitions.
9.18 Motivation
We will show that NFA’s with Λ transitions can be used to accept exactly the same languages as NFA’s.

Since we have already shown that NFA’s and FA’s can be used to accept exactly the same languages, it follows that NFA’s with Λ transitions and FA’s can be used to accept exactly the same languages.

We introduce NFA’s with Λ transitions because it is easy to construct an NFA with Λ transitions which accepts the language described by a regular expression. An FA is an easily implementable recogniser for a language, so we have a way to go from a description of typical strings in the language to a recogniser.

We can extend the notion of an NFA to include automata with Λ transitions, that is, with transitions that may occur on reading no symbols from the string. The main difference is in the transition function: δ is now a function from Q × (Σ ∪ {Λ}) to 2^Q.
9.19 Formal definition
Definition 11 (Non-deterministic finite automaton with Λ transitions) A non-deterministic finite automaton with Λ transitions is a 5-tuple (Q, Σ, δ, q0, F) where:
• Q is a finite set: the states
• Σ is a finite set: the alphabet
• δ is a function from Q × (Σ ∪ {Λ}) to 2^Q: the transition function
Theorem 6 Every language definable by an NFA is definable by an NFA with Λ transitions.
The proof is easy, as the machines are almost identical.

If M8 = (Q, Σ, δ, q0, F) is an NFA, then M9 = (Q′, Σ′, δ′, q′0, F′) is an NFA with Λ transitions which accepts the same language, if:

Q′ = Q, Σ′ = Σ, q′0 = q0, F′ = F
And for S ∈ Q, σ ∈ Σ:
δ′(S, σ) = δ(S, σ)
δ′(S,Λ) = {}
Once again we could have just drawn the diagram representing the NFA, and announced that it also represented an NFA with Λ transitions.
9.23 Harder theorem
Theorem 7 Every language definable by an NFA with Λ transitions is definable by an NFA.
As you might expect, most of the hard work lies in constructing the transition function.

If M10 = (Q, Σ, δ, q0, F) is an NFA with Λ transitions then we will construct M11 = (Q′, Σ′, δ′, q′0, F′), an NFA which accepts the same language. We begin with:
Q′ = Q
Σ′ = Σ
q′0 = q0
9.24 δ′ the new transition function
δ′(S, σ) must now give us the set of states which are reachable from S on a σ transition, and also on any combination of Λ transitions and a σ transition.

Some care is needed, as we may be able to make several Λ transitions before we make the σ transition, and we may be able to make several Λ transitions after we make the σ transition.
We need to find every state which is reachable from S by:
(Λ transition)∗(σ transition)(Λ transition)∗
We can call this the Λ closure of δ(S, σ).

Another way to think of what we are doing is that we are removing the arcs labelled with Λ, and adding ‘direct’ arcs labelled with a symbol.
9.25 F ′ the new set of accepting states
We might initially guess that F′ = F, but this is wrong.

Suppose q0 is not an accepting state, but there is some accepting state which can be reached by a sequence of Λ transitions from q0, i.e. the initial state is not an accepting state but the NFA with Λ transitions accepts the empty string.
If this is the case then F′ = F ∪ {q0}; otherwise F′ = F.
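The Λ closure and the new δ′ can be sketched as follows (our own code; `"Lam"` stands for Λ, and the tiny example machine is hypothetical).

```python
def lambda_closure(states, delta):
    # Every state reachable from `states` by zero or more Lambda arcs.
    closure, worklist = set(states), list(states)
    while worklist:
        q = worklist.pop()
        for p in delta.get((q, "Lam"), set()):
            if p not in closure:
                closure.add(p)
                worklist.append(p)
    return closure

def new_delta(S, sigma, delta):
    # (Lambda)* (sigma) (Lambda)*: close, step on sigma, close again.
    before = lambda_closure({S}, delta)
    after = set().union(*(delta.get((q, sigma), set()) for q in before))
    return lambda_closure(after, delta)

# A tiny hypothetical machine: S1 -Lam-> S2, S2 -a-> S3
delta = {("S1", "Lam"): {"S2"}, ("S2", "a"): {"S3"}}
print(sorted(new_delta("S1", "a", delta)))   # ['S3']
```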
9.26 Example: an NFA equivalent to M6
M6 = (Q, Σ, δ, q0, F) as above. M12 = (Q′, Σ′, δ′, q′0, F′) where Q′ = Q, Σ′ = Σ, q′0 = q0.

There is no accepting state reachable on Λ transitions from S1, so F′ = F.

We construct δ′:

δ′ | 0                | 1
S1 | {S2, S4}         | {S1, S2, S3, S4, S5}
S2 | {S4}             | {S1, S2, S3, S4, S5}
S3 | {S2, S4}         | {S1, S2, S3, S4, S5}
S4 | {S1, S2, S4, S5} | {S1, S2, S4, S5}
S5 | {S2, S4}         | {S1, S2, S3, S4, S5}
9.27 Graphical representation of M12

[Diagram not reproduced.]
9.28 Regular expression to NFA with Λ transitions
Now we will show:
• any language which is describable by a regular expression is describable by a non-deterministic finite automaton with Λ moves,

• hence any language which is describable by a regular expression is describable by a non-deterministic finite automaton,

• hence any language which is describable by a regular expression is describable by a finite automaton.
9.29 Proof outline
The proof is by induction, and its algorithmic content allows us to write a program which takes a regular expression and constructs an automaton which recognises the same language.

Recall that the regular expressions are defined as follows: if Σ is an alphabet then:
• the empty language ∅ and the empty string Λ are regular expressions
• if r ∈ Σ then r is a regular expression
• if r and s are regular expressions then so are r + s, rs, r∗
9.30 Base cases
The base cases are:
• the empty language ∅
• the empty string Λ
• if r ∈ Σ then r
In these cases we just present an appropriate automaton.
9.31 Induction steps
The induction steps are:
• r + s
• rs
• r∗
In these cases we assume that we have automata to accept r and s, and we make use of these machines.
9.32 Proof
Base case: ∅. The following NFA with Λ transitions accepts no strings. M∅ = (Q, Σ, δ, q0, F) where:
Q = {S1} Σ = {} q0 = S1 δ(S1,Λ) = {} F = {}
Informally: a machine with no accepting states (and not much else).
Base case: Λ. The following machine accepts the empty string. MΛ = (Q, Σ, δ, q0, F) where:
Q = {S1} Σ = {} q0 = S1 δ(S1,Λ) = {} F = {S1}
Informally: a machine whose start state is the only accepting state.
Base case: r, r ∈ Σ. The following machine accepts Language(r), for some r ∈ Σ. Mr = (Q, Σ, δ, q0, F) where:

Q = {S1, S2}, q0 = S1, F = {S2}

δ  | r    | σ ≠ r | Λ
S1 | {S2} | {}    | {}
S2 | {}   | {}    | {}

Informally: a machine with one accepting state, which can only be arrived at on an r.
Induction step: r∗. Assume that we have an automaton Mr which accepts Language(r), and show how to make an automaton which accepts Language(r∗).

Informally: we need a way to allow ourselves to go through Mr 0, 1, 2, 3, . . . times. We make the initial state of Mr an accepting state, and add a Λ transition from each accepting state to the initial state.

Let Mr = (Q, Σ, δ, q0, F) be an NFA with Λ transitions which accepts Language(r). Then Mr∗ = (Q′, Σ′, δ′, q′0, F′), where:

Q′ = Q, Σ′ = Σ, q′0 = q0, F′ = F ∪ {q0}
δ′(S, σ) = δ(S, σ)
δ′(S, Λ) = δ(S, Λ), S ∉ F
δ′(S, Λ) = δ(S, Λ) ∪ {q0}, S ∈ F

is an NFA with Λ transitions which accepts Language(r∗).
Induction step: r + s. Assume that we have automata Mr and Ms which accept Language(r) and Language(s), and show how to combine them into a machine to accept Language(r + s).

Informally: we need a way to allow ourselves to go through either Mr or Ms. We can do this if we add a new start state with Λ transitions to the start states of Mr and Ms. The accepting states of the new machine will be the union of the accepting states of Mr and Ms. The two machines may make use of different alphabets, and we need to take care over this.

Let Mr = (Qr, Σr, δr, q0r, Fr) and Ms = (Qs, Σs, δs, q0s, Fs). Then Mr+s = (Q′, Σ′, δ′, q′0, F′), where q′0 is a new state, and:

Q′ = Qr ∪ Qs ∪ {q′0}, Σ′ = Σr ∪ Σs, F′ = Fr ∪ Fs
δ′(q′0, Λ) = {q0r, q0s}
δ′(S, σ) = δr(S, σ), S ∈ Qr, σ ∈ Σr
δ′(S, σ) = {}, S ∈ Qr, σ ∉ Σr
δ′(S, Λ) = δr(S, Λ), S ∈ Qr
δ′(S, σ) = δs(S, σ), S ∈ Qs, σ ∈ Σs
δ′(S, σ) = {}, S ∈ Qs, σ ∉ Σs
δ′(S, Λ) = δs(S, Λ), S ∈ Qs

is an NFA with Λ transitions which accepts Language(r + s).
Induction step: rs. Assume that we have automata Mr and Ms which accept Language(r) and Language(s), and show how to combine them into a machine to accept Language(rs).

Informally: we need a way to allow ourselves to go through Mr and then Ms. We start at the start state of Mr. From each accepting state of Mr we add a Λ transition to the start state of Ms. The accepting states of the new machine will be the accepting states of Ms. The two machines may make use of different alphabets, and we need to take care over this.

Let Mr = (Qr, Σr, δr, q0r, Fr) and Ms = (Qs, Σs, δs, q0s, Fs).
Then Mrs = (Q′, Σ′, δ′, q′0, F′), where:

Q′ = Qr ∪ Qs, Σ′ = Σr ∪ Σs, q′0 = q0r, F′ = Fs
δ′(S, σ) = δr(S, σ), S ∈ Qr, σ ∈ Σr
δ′(S, σ) = {}, S ∈ Qr, σ ∉ Σr
δ′(S,Λ) = δr(S,Λ) ∪ {q0s}, S ∈ Fr
δ′(S, Λ) = δr(S, Λ), S ∈ Qr, S ∉ Fr
δ′(S, σ) = δs(S, σ), S ∈ Qs, σ ∈ Σs
δ′(S, σ) = {}, S ∈ Qs, σ ∉ Σs
δ′(S,Λ) = δs(S,Λ), S ∈ Qs
is an NFA with Λ transitions which accepts Language(rs).
9.33 Proof summary
We have shown that we can perform the construction in each of the base cases and in each of the inductive steps. Hence the proof is finished.

We can use this proof to give us an algorithm to recursively construct an NFA with Λ transitions, given a regular expression.
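The recursive algorithm can be sketched as follows (our own code, not the slides’ notation): regular expressions are nested tuples, `"Lam"` stands for Λ, and state names are generated fresh. A small Λ-closure-based `accepts` is included to check the result.

```python
from itertools import count

fresh = count()          # generates fresh state names

def build(r):
    """Return (states, delta, start, finals) for an NFA with Lam transitions."""
    if isinstance(r, str):               # base case: a single symbol
        s, f = next(fresh), next(fresh)
        return {s, f}, {(s, r): {f}}, s, {f}
    op = r[0]
    if op == "*":
        Q, d, s, F = build(r[1])
        for q in F:                      # Lam arc back to the start
            d.setdefault((q, "Lam"), set()).add(s)
        return Q, d, s, F | {s}          # start state becomes accepting
    if op == "+":
        Q1, d1, s1, F1 = build(r[1])
        Q2, d2, s2, F2 = build(r[2])
        s = next(fresh)                  # new start with Lam arcs to both
        return Q1 | Q2 | {s}, {**d1, **d2, (s, "Lam"): {s1, s2}}, s, F1 | F2
    if op == ".":                        # concatenation
        Q1, d1, s1, F1 = build(r[1])
        Q2, d2, s2, F2 = build(r[2])
        d = {**d1, **d2}
        for q in F1:                     # Lam arc into the second machine
            d.setdefault((q, "Lam"), set()).add(s2)
        return Q1 | Q2, d, s1, F2

def accepts(machine, w):
    Q, d, s, F = machine
    def close(T):                        # Lam closure of a set of states
        T, stack = set(T), list(T)
        while stack:
            for p in d.get((stack.pop(), "Lam"), set()):
                if p not in T:
                    T.add(p); stack.append(p)
        return T
    cur = close({s})
    for c in w:
        cur = close(set().union(*(d.get((q, c), set()) for q in cur)))
    return bool(cur & F)

m = build(("+", ("*", "a"), (".", "b", "a")))   # a* + ba, as in the example
print(accepts(m, ""), accepts(m, "aaa"), accepts(m, "ba"), accepts(m, "b"))
# True True True False
```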
9.34 Example
We take the regular expression a∗ + ba through NFA with Λ transitions, and NFA, to FA.
[Diagrams omitted: NFAs with Λ transitions are built for a, for b, for a∗, for ba, and finally for a∗ + ba, following the constructions above. Eliminating the Λ transitions gives a 7-state NFA in which states 1 (the start state), 3 and 7 are accepting. The subset construction then yields an FA with states 1 (start, accepting), 23 (accepting), 4, 56 and 7 (accepting), which simplifies to an equivalent, smaller FA.]
Chapter 10
Kleene’s theorem
10.1 A Theorem
Theorem: Every language accepted by an FA is generated by a regular expression.

Proof: omitted, due to industrial action.

Once again, the proof is constructive: it gives us an algorithm which, given an FA, constructs a regular expression that generates the same language.
We have now established that, in terms of the languages that they can be used to describe, all of the following are equivalent:
• regular expressions
• finite automata
• non-deterministic finite automata
• non-deterministic finite automata with Λ transitions
10.2 Kleene’s theorem
Previously we gave a definition of a regular language as one which is described by a regular expression.

We can re-formulate the equivalence of regular expressions, FA’s and NFA’s as:
Theorem 8 (Kleene) A language is regular iff it is accepted by an FA.
A language is regular iff it is accepted by an NFA.
A language is regular iff it is generated by a regular grammar.
Note: iff is an abbreviation of ‘if, and only if’.

Cohen characterises Kleene’s theorem as:

the most important and fundamental theorem in the theory of finite automata
Chapter 11
Closure Properties of Regular Languages
11.1 Closure properties of regular languages

We will now show some closure properties of the set of regular languages. We will show that:
• the complement of a regular language is regular
• the union of two regular languages is regular
• the concatenation of two regular languages is regular
• the Kleene closure of a regular language is regular
• the intersection of two regular languages is regular
11.2 Formally
If L1 and L2 are regular languages then so are:
• L̄1
• L1 ∪ L2
• L1L2
• L∗1
• L1 ∩ L2
We use Kleene’s theorem to prove these.
11.3 Complement
Theorem 9 If L1 is a regular language, then so is L̄1.

We have already shown this in a previous lecture, where we showed that we could construct an FA to accept L̄1, given an FA which accepted L1.

If L1 is regular then there is an FA which accepts it. If there is an FA which accepts L̄1 then L̄1 is regular.
11.4 Union
Theorem 10 If L1 and L2 are regular languages then so is L1 ∪ L2.
The language L1 ∪ L2 is the set of all strings in either L1 or L2.

If L1 is regular then there is a regular expression r which describes it. If L2 is regular then there is a regular expression s which describes it. Then the regular expression r + s describes L1 ∪ L2.

Since L1 ∪ L2 is described by a regular expression it is regular.
11.5 Concatenation
Theorem 11 If L1 and L2 are regular languages then so is L1L2.
The language L1L2 is the set of all strings which consist of a string from L1 followed by a string from L2.

If L1 is regular then there is a regular expression r which describes it. If L2 is regular then there is a regular expression s which describes it. Then the regular expression rs describes L1L2.

Since L1L2 is described by a regular expression it is regular.
11.6 Kleene closure
Theorem 12 If L1 is a regular language then so is L∗1.
The language L∗1 is the set of all strings which are (possibly empty) sequences of strings in L1.

If L1 is regular then there is a regular expression r which describes it. Then the regular expression r∗ describes L∗1.

Since L∗1 is described by a regular expression it is regular.
11.7 Intersection
Theorem 13 If L1 and L2 are regular languages then so is L1 ∩ L2.
Note: for any sets A and B, A ∩ B is the complement of Ā ∪ B̄.

Since L1 is regular, so is L̄1. Since L2 is regular, so is L̄2.

Since L̄1 and L̄2 are regular, so is L̄1 ∪ L̄2.

Since L̄1 ∪ L̄2 is regular, so is its complement, which is L1 ∩ L2.
11.8 Summary of the proofs
We could have performed all these proofs via FA’s, NFA’s or regular grammars if we had wanted to.

We have now shown that what appear to be very different ways to define languages are all equivalent, and moreover that the class of languages that they define is closed under various operations. We have also shown that all finite languages are regular.
The next obvious question is:
• are there any languages which are not regular?
We will answer this question next.
Chapter 12
Non-regular languages
12.1 Non-regular Languages
So far we have seen one sort of abstract machine, the finite automaton, and the sort of language that this sort of machine can accept, the regular languages.
We will now show that there are languages which cannot be accepted by finite automata.
Outline of Proof:
Suppose we have a language L and we want to show it is non-regular.
A language is non-regular just when it is not regular.
As a general rule of logic if we wish to show ¬P we assume P and derive a contradiction.
Hence, to show L is not regular we must show that a contradiction follows if we assume that L is regular.
When we are trying to derive a contradiction from the assumption that L is regular we usually make use of Kleene’s theorem.
If L is regular then there is a FA which accepts L.
We have already shown that all finite languages are regular, so if we are to show that L is non-regular then L had better be infinite.
Note: there are lots of infinite regular languages: e.g. Language((1 + 0)∗).
Not all infinite languages are non-regular, but all non-regular languages are infinite.
The general technique is to show that there is some string which must be accepted by the FA, but which is not in L.
A FA has, by definition, a finite number of states.
Any sufficiently long string in L will trace a path through any FA accepting L which visits some state more than once.
We then attempt to use this fact to give examples of other strings which any FA which accepts L will accept, but which are not in L.
However, if there are such strings, then no FA accepts L. This contradicts our assumption that L was regular.
Hence L is not regular, i.e. L is non-regular.
Comments on this argument
This argument is perfectly good, but it glosses over the fact that we may have to do some thinking to show that there are strings which do the trick for us.
And, of course, we may be attempting to show that some regular language is not regular. In this case we will fail!
12.2 Pumping lemma
Theorem 14 (Pumping lemma) If L is a regular language then there is a
number p, such that (∀s ∈ L)(|s| ≥ p ⊃ s = xyz), where:
1. (∀i ≥ 0)xyiz ∈ L
2. |y| > 0
3. |xy| ≤ p
We call p the pumping length.
(∀s ∈ L)(|s| ≥ p ⊃ s = xyz) reads ‘for every string s in L, if s is at least as long as the pumping length, then s can be written as xyz’.
12.3 Pumping lemma informally
The pumping lemma tells us that, for long enough strings s ∈ L, s can be written as xyz such that:
• y is not Λ and
• xz,
• xyz,
• xyyz,
• xyyyz,
• xyyyyz,. . . are all in L.
We say that we can ‘pump’ s and still get strings in L.
12.4 Proving the pumping lemma
We now give a formal proof of the pumping lemma.
We have three things to prove corresponding to conditions 1, 2 and 3 in the pumping lemma. Let:
• M = (Q,Σ, δ, q0, F ) be a FA which accepts L,
• p be the number of states in M (i.e. p = Cardinality(Q))
• s = s1s2 . . . sn−1sn be a string in L such that n ≥ p
• r1 = q0
• ri+1 = δ(ri, si), 1 ≤ i ≤ n
Then the sequence r1r2 . . . rnrn+1 is the sequence of states that the machine goes through to accept s. The last state rn+1 is an accepting state.
This sequence has length n + 1, which is greater than p. The pigeonhole principle tells us that in the first p + 1 items in r1r2 . . . rnrn+1 one state must occur twice.
We suppose it occurs first as rj and second as rl.
Notice: l ≠ j, and l ≤ p + 1.
Now let:
• x = s1 . . . sj−1
• y = sj . . . sl−1
• z = sl . . . sn
So:
• x takes M from r1 to rj
• y takes M from rj to rj
• z takes M from rj to rn+1
Hence M accepts xyiz, i ≥ 0. Thus we have shown that condition 1 of the pumping lemma holds.
Because l ≠ j we know that |y| ≠ 0. Thus we have shown that condition 2 of the pumping lemma holds.
Because l ≤ p + 1 we know that |xy| ≤ p. Thus we have shown that condition 3 of the pumping lemma holds.
Hence the pumping lemma holds.
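The proof is constructive: given a DFA and a long enough accepted string, we can compute x, y and z by tracking the state sequence and splitting at the first repeated state. A sketch in Python (the even-number-of-1s DFA is our own illustrative example):

```python
def pump_split(delta, start, s):
    # r1 = start; r_{i+1} = delta(r_i, s_i): record the state sequence
    states = [start]
    for ch in s:
        states.append(delta[(states[-1], ch)])
    seen = {}
    for idx, st in enumerate(states):
        if st in seen:                       # pigeonhole: st = r_j = r_l
            j, l = seen[st], idx
            return s[:j], s[j:l], s[l:]      # x, y, z
        seen[st] = idx
    return None                              # |s| < number of states

def accepts(delta, start, finals, w):
    q = start
    for ch in w:
        q = delta[(q, ch)]
    return q in finals

# DFA over {0,1} accepting strings with an even number of 1s (p = 2 states)
delta = {("e", "0"): "e", ("e", "1"): "o",
         ("o", "0"): "o", ("o", "1"): "e"}

x, y, z = pump_split(delta, "e", "0110")
for i in range(4):                           # condition 1: x y^i z is accepted
    assert accepts(delta, "e", {"e"}, x + y * i + z)
```

Conditions 2 and 3 hold by construction: the split happens at the first repetition, which must occur within the first p + 1 states.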
12.5 A non-regular language
As an example we will show that {0n1n | n ≥ 0} is not a regular language.
We begin the proof by assuming that this is a regular language, so there is some machine N which accepts it.
Hence, by the pumping lemma, there must be some integer k, such that the string 0k1k can be pumped to give a string which is also accepted by N.
We let xyz = 0k1k, and show that xyyz is not in {0n1n | n ≥ 0}.
There are three cases to consider:
1. y is a sequence of 0s
2. y is a sequence of 0s followed by a sequence of 1s
3. y is a sequence of 1s
In case 1 xyyz will have more 0s than 1s, and so xyyz ∉ L.
In case 3 xyyz will have more 1s than 0s, and so xyyz ∉ L.
In case 2 xyyz will have two occurrences of the substring 01, and so xyyz ∉ L.
So in each case the assumption that {0n1n | n ≥ 0} is regular leads to a contradiction.
So {0n1n | n ≥ 0} is not regular.
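The case analysis can also be checked mechanically for a concrete pumping length: every split of 0p1p satisfying conditions 2 and 3 fails to pump. A brute-force sketch in Python (the choice p = 5 is arbitrary):

```python
def in_L(w):                         # membership in {0^n 1^n | n >= 0}
    n = len(w) // 2
    return w == "0" * n + "1" * n

p = 5
s = "0" * p + "1" * p
for xy_len in range(1, p + 1):       # condition 3: |xy| <= p
    for x_len in range(xy_len):      # condition 2: |y| = xy_len - x_len > 0
        x, y, z = s[:x_len], s[x_len:xy_len], s[xy_len:]
        # pumping up just once already leaves the language
        assert not in_L(x + y + y + z)
```

Because |xy| ≤ p forces y to consist only of 0s, xyyz always has more 0s than 1s, which is exactly case 1 of the argument above.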
Comment
Clearly we can write an algorithm to decide whether a string is in {0n1n | n ≥ 0}. This algorithm cannot be represented by a finite state machine. Hence there must be more powerful abstract machines than FAs.
12.6 Pumping lemma re-cap
If L is a regular language then there is a number p, such that (∀s ∈ L)(|s| ≥ p ⊃ s = xyz), where:
1. (∀i ≥ 0)xyiz ∈ L
2. |y| > 0
3. |xy| ≤ p
12.7 Another non-regular language
We will show that the language
L1 = {w | w has the same number of 0’s as 1’s}
is non-regular.
We start by assuming that L1 is regular. Then we will show that there is a string in L1 which cannot be pumped. We let p be the pumping length.
Which string to choose?
One candidate is 0p1p, where p is the pumping length. A similar string worked for us before. However, it appears that this string can be pumped. Suppose we take x and z to be Λ, and y to be 0p1p.
Then:
xz = Λ
xyz = 0p1p
xyyz = 0p1p0p1p
xyyyz = 0p1p0p1p0p1p
xy . . . yz = . . .
So, no contradiction here.
All is not lost, however. Condition 3 in the pumping lemma tells us that we can restrict attention to the case where |xy| ≤ p, where p is the pumping length.
If we split 0p1p under this condition then y must consist of only 0’s.
Now xyiz, i ≥ 0 will only have the same number of 0’s and 1’s when i = 1.
So the pumping lemma has told us that we must be able to split 0p1p in a way which leads to a contradiction.
So L1 is not regular.
12.8 And another . . .
We will show that the language L2 = {ww | w ∈ {0, 1}∗} is non-regular.
Once again we assume that L2 is regular and use the pumping lemma to obtain a contradiction.
As usual we let p be the pumping length.
We can’t use the string 0p1p because it is not in the language. Why not try 0p0p? It is in the language, and it looks similar.
Notice that 0p0p = 02p.
Alas, however, we can find a way to pump 0p0p and stay in L2, even taking into account condition 3.
Suppose we take x to be Λ, and y to be a string of 0’s such that |y| ≤ p and |y| is even.
Then xyiz, i ≥ 0 will always be in L2.
So, no contradiction.
Note that every word in L2 has two equal left and right parts.
What is happening when we pump the string is that we are adding symbols to the left part.
Then, however, we are allowing ourselves to move half of them into the right part.
This results in a string with two equal left and right parts once again.
What we need to do is to make sure that we can’t do this rearrangement.
Consider the string 10p10p.
The pumping lemma tells us that we should be able to split this string up into xyz, such that y is just 0’s and xyiz, i ≥ 0 will be in L2.
Now we have our contradiction: xyiz is only in L2 for i = 1.
12.9 Comment
The moral of this story is that we will sometimes have to think a bit to find a string which will allow us to find a contradiction.
Chapter 13
Regular languages: summary
13.1 Regular languages: summary
We have covered quite a lot of material in this section of the paper, and most of it has been done carefully, and in some detail.
This material needs to be presented with some care: all the pieces fit together rather neatly, and much of the understanding depends on seeing how the delicate mechanism works.
As we have worked through the material we have taken a ‘bottom-up’ approach; now we will take a ‘top-down’ approach.
13.2 Kleene’s theorem
Kleene’s theorem is the most important result. Why?
A grammar is a way to generate strings in a language.
An automaton provides us with a way to recognise strings in a language.
Kleene’s theorem neatly relates the strength of the machine with the form of the grammar. It is not at all obvious that such a neat relationship should hold.
Second, Kleene’s theorem tells us that the deterministic and the non-deterministic variants of finite automata have just the same power.
Again it is not at all obvious that this should be the case.
It is important not just to know Kleene’s theorem as a fact, but also to know why Kleene’s theorem holds. In other words we need to know how the proofs go. Why?
First, the regular languages are only one class of language. There are similar results about other classes of language. Understanding how we showed Kleene’s theorem helps us understand the properties of these classes too.
Second, the proofs actually let us construct recognisers from generators, which turns out to be useful in itself.
Third, much of the mathematics that is used in these proofs is used elsewhere in computer science.
In fact the bulk of this section of the course was devoted to setting up the mechanism required to prove Kleene’s theorem.
Part II
Context Free Languages
Chapter 14
Introducing Context Free Grammars
14.1 Beyond Regular Languages

• Many languages of interest are not regular.
– {anbn | n ∈ N} is not regular.
FA cannot “count” higher than the number of states.
– {ααR | α ∈ A∗} is not regular when |A| > 1.
FA can’t match symbols in α with those in αR.
– The language of arithmetic expressions with brackets is not regular.
FA can’t check that the brackets match.

We now consider Context Free Languages, defined using Context Free Grammars.
• What is a Context Free Grammar (CFG)?
• How can we define languages using CFGs?
• How can we recognise Context Free Languages?
• What is the relationship between Context Free Languages and Regular Languages?
14.2 Sentences with Nested Structure
2 + 3 × ( 4 + 5 × 7 )
the boy hit the big ball
while c do while d do S
if (c) if (d) S; else T ;
if (c) if (d) S; else T ;
14.3 A Simple English Grammar

A sentence is a noun phrase followed by a verb phrase.
    S → NP VP
A noun phrase is a determiner followed by either a noun or an adjective phrase.
    NP → D N
    NP → D AP
An adjective phrase is an adjective followed by a noun.
    AP → A N
A verb phrase is a verb followed by a noun phrase.
    VP → V NP
A determiner is an article, e.g. “the”.
    D → the
A noun is a word denoting an object, e.g. “ball” or “boy”.
    N → boy
    N → ball
An adjective is a word denoting a property, e.g. “big”.
    A → big
A verb is a word denoting an action, e.g. “hit”.
    V → hit
14.4 Parse Trees
the boy hit the big ball
In bracketed form, the parse tree is:

(S (NP (D the) (N boy)) (VP (V hit) (NP (D the) (AP (A big) (N ball)))))
14.5 Context Free Grammars
Components of a grammar:
• Terminal symbols: “the”, “boy”, “ball”, etc.
The words which actually appear in sentences.
• Nonterminal symbols: “S”, “NP”, “VP”, “D”, “N” etc.
Names for components of sentences.
Never appear in sentences.
Distinguished nonterminal (S) identifies the language being defined.
• A finite set of Productions.
A production has the form:
nonterminal → definition
where definition is a string (possibly empty) of terminal and/or nonterminal symbols.
“→” is a metasymbol. It is part of the notation (metalanguage).
14.6 Formal definition of CFG
A Context Free Grammar (CFG) is a 4-tuple G = (Σ, N, S, P ) where:
• Σ is a finite set of terminal symbols (the alphabet).
• N is a finite set of nonterminal symbols, disjoint from Σ.
• S is a distinguished member of N , called the start symbol.
• P is a finite set of production rules of the form α→ β, where:
– α is a nonterminal symbol: α ∈ N ,
– β is a (possibly empty) string of terminal and/or nonterminal symbols: β ∈ (N ∪ Σ)∗.
Note: Some presentations of context free grammars do not allow rules with empty right-hand sides. We will discuss this restriction later.
Chapter 15
Regular and Context Free Languages
15.1 Regular and Context Free Languages
Kleene’s theorem tells us that all the following formalisms are equivalent in power:
• Regular expressions
• (Deterministic) Finite Automata
• Nondeterministic Finite Automata
• Nondeterministic Finite Automata with Λ Transitions
Now we have a new formalism: the Context Free Grammar.
How does its power compare with those above?
How is the class of Context Free Languages related to the class of regular languages?
CF = Reg? CF ⊆ Reg? CF ⊇ Reg? CF ∩ Reg = ∅?
15.2 CF ∩ Reg 6= ∅
The language EVEN-EVEN contains every string over the alphabet {a, b} with an even number of as and an even number of bs.
We saw earlier that it may be described by a r.e.:
EVEN-EVEN = Language((aa + bb + (ab + ba)(aa + bb)∗(ab + ba))∗)
Here is a context-free grammar for EVEN-EVEN:
S → SS     B → aa
S → BS     B → bb
S → SB     U → ab
S → Λ      U → ba
S → USU
(Proof : Cohen, p236)
So, at least one language is both context free and regular.
15.3 CF 6⊆ Reg
The language EQUAL contains every string over the alphabet {a, b} with an equal number of as and bs.
We proved earlier that EQUAL is not regular (pumping lemma).
Here is a context-free grammar for EQUAL:
S → aB     A → a      B → b
S → bA     A → aS     B → bS
           A → bAA    B → aBB
(Proof : Cohen, p239)
So, at least one language is context free but is not regular.
15.4 CF ⊇ Reg
In fact, every regular language is context free. To prove this, we show how to convert any FA into a context free grammar.
(We could have chosen to convert regular expressions, or NFAs into CFGs instead, since Kleene showed us they were all equivalent.)
• The alphabet Σ of the CFG is the same as the alphabet Σ of the FA.
• The set N of nonterminals of the CFG is the set Q of states of the FA.
• The start symbol S of the CFG is the start state q0 of the FA.
• For every X, Y ∈ Q, a ∈ Σ, there is a production X → aY in the CFG if and only if there is a transition from X to Y labelled a in the FA.
• For every X ∈ F in the FA there is a production X → Λ in the CFG.
(Proof : Cohen p260)
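This construction is mechanical enough to write down directly. A sketch in Python, applied to a small hypothetical two-state DFA (states E and O, accepting strings with an even number of 1s):

```python
def fa_to_cfg(delta, finals):
    """One production X -> aY per transition X --a--> Y,
    plus X -> Λ (the empty string here) per final state X."""
    prods = [(X, a + Y) for (X, a), Y in delta.items()]
    prods += [(X, "") for X in finals]
    return prods

delta = {("E", "0"): "E", ("E", "1"): "O",
         ("O", "0"): "O", ("O", "1"): "E"}
for lhs, rhs in fa_to_cfg(delta, {"E"}):
    print(lhs, "->", rhs if rhs else "Λ")
```

The start symbol of the resulting grammar is the DFA's start state, E.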
15.5 Example
The FA, in text form: states A (start, final), B (final), C, D (final), with transitions
A −a→ D, A −b→ B;  B −a→ D, B −b→ C;  C −a→ C, C −b→ C;  D −a→ D, D −b→ D.
The following CFG accepts the same language as the above FA:
Σ = {a, b}
N = {A,B,C,D}
S = A
P =
A → aD    B → aD    C → bC    D → aD
A → bB    B → bC    C → aC    D → bD
A → Λ     B → Λ               D → Λ
15.6 Regular Grammars
Grammars produced by the above process are called regular grammars.
Every production of a regular grammar has one of the forms:
N1 → T N2
N1 → T
(where N1, N2 ∈ N, and T ∈ Σ∗).
That is, the RHS of every production consists of a (possibly empty) sequence of terminals, possibly followed by a single nonterminal.
To prove that the class of languages accepted by regular grammars is exactly the class of regular languages, we need to show how to transform any regular grammar into an equivalent FA.
The transformation is similar to the one (which we omitted) turning FAs into regular expressions. See Cohen (p263) if you are interested in it.
15.7 Parsing using Regular Grammars
Because regular grammars are so much like finite automata, it is easy to generate (or parse) a sentence using a regular grammar.

S → aS (1)    T → bT (3)    U → aU (5)
S → T (2)     T → bU (4)    U → a (6)
To generate the string aabba, we can go through the following steps:
S ⇒1 aS ⇒1 aaS ⇒2 aaT ⇒3 aabT ⇒4 aabbU ⇒6 aabba

(The number on each ⇒ names the rule applied.)
This is called a derivation.
15.8 Derivations
Consider the CFG (Σ, N, S, P ).
A sentential form is a (possibly empty) string made up of nonterminals and terminals: that is, a string of type (Σ ∪ N)∗.
A derivation is a sequence α0 ⇒ · · · ⇒ αn in which:
• Each αi is a sentential form.
• α0 is the start symbol, S.
• αn is a string of type Σ∗ (i.e., there are no nonterminals left)
• For each pair αi ⇒ αi+1, we have:
– αi has the form β n δ for some nonterminal n ∈ N and sentential forms β and δ;
– there is a production (n → γ) ∈ P ; and
– αi+1 has the form β γ δ.
15.9 Derivations in Regular Grammars
A semiword is a sentential form of the restricted form Σ∗N: that is, a (possibly empty) string of terminals followed by a single nonterminal.
In a regular grammar, every production has as its right-hand side either a semiword or a word.
In any derivation, α0 is a semiword: it is a single nonterminal.
If some sentential form αi is a semiword, and we use a production from a regular grammar, αi+1 will also be either a semiword or a word.
Thus, to find a derivation using a regular grammar, we simply select a production whose left-hand side matches the (single) nonterminal in our sentential form, and repeat until a rule with no nonterminals on its right-hand side is used, at which stage the result is a word.
15.10 Derivations in Arbitrary CFGs
With regular grammars, every sentential form in a derivation is a semiword.
With arbitrary CFGs, productions can have multiple nonterminals on the right-hand side, and so sentential forms in derivations have multiple nonterminals too. Further, the terminals need not all occur at the start of a sentential form.
It is no longer just a matter of selecting a production to match our nonterminal; we must first decide which nonterminal to expand.
(1) S → XSY   (2) S → a   (3) X → bX   (4) X → c   (5) Y → d   (6) Y → e

Here are two derivations of the word bcbbcade:

S ⇒1 XSY ⇒3 bXSY ⇒4 bcSY ⇒1 bcXSYY ⇒3 bcbXSYY ⇒3 bcbbXSYY ⇒4 bcbbcSYY ⇒2 bcbbcaYY ⇒5 bcbbcadY ⇒6 bcbbcade

S ⇒1 XSY ⇒1 XXSYY ⇒5 XXSdY ⇒3 XbXSdY ⇒3 bXbXSdY ⇒2 bXbXadY ⇒3 bXbbXadY ⇒4 bcbbXadY ⇒6 bcbbXade ⇒4 bcbbcade
The leftmost nonterminal in a sentential form is the first nonterminal that we encounter when scanning left to right.
A leftmost derivation of word w from a CFG is a derivation in which, at each step, a production is applied to the leftmost nonterminal in the current sentential form.
Of the two derivations of the word bcbbcade on the previous slide, the first is a leftmost derivation; the second is not.
Theorem: Any word that can be generated from a given CFG by some derivation can be generated by a leftmost derivation.
15.11 Parse Trees
We saw parse trees informally in our first lecture.
We now want to define parse trees more formally, by looking at the relationship to derivations.
Given G = (Σ, N, S, P ) and w ∈ Σ∗, a parse tree for w from G is an ordered, labelled tree such that:
• Each leaf node is labelled with an element of Σ.
• Each non-leaf node is labelled with an element of N .
• The root is labelled with the start symbol, S.
• For each non-leaf node, n, if α is the label on n and γ1, . . . , γk are the labels on its children, then α → γ1 · · · γk is a rule in P.
• The fringe of the tree is w.
15.12 Derivations and parse trees
A partial parse tree is an ordered, labelled tree which is like a parse tree, except that it may have nonterminals or terminals at its leaves.
The fringe of a partial parse tree is a sentential form.
A derivation α0 ⇒ · · · ⇒ αn corresponds to a sequence of partial parse trees t0, . . . , tn such that the fringe of ti is αi, and each ti+1 is obtained from ti by replacing a single leaf node (labelled with a nonterminal symbol n) by a tree whose root is n.
Since αn ∈ Σ∗, the final partial parse tree tn is a parse tree for the word αn, corresponding to the derivation α0, . . . , αn.
15.13 Ambiguity

A word may have several different parse trees, each corresponding to a different leftmost derivation.
(1) (2) (3) (4) (5)S → XY X → b X → Xa Y → b Y → aY
S ⇒1 XY ⇒2 bY ⇒5 baY ⇒5 baaY ⇒4 baab
S ⇒1 XY ⇒3 XaY ⇒2 baY ⇒5 baaY ⇒4 baab
S ⇒1 XY ⇒3 XaY ⇒3 XaaY ⇒2 baaY ⇒4 baab
A grammar in which there are sentences with multiple parse trees is called ambiguous.
(1) E → E “+” E   (2) E → E “×” E   (3), (4), (5) E → “1” | “2” | “3”

The word 1 + 2 × 3 has two parse trees: one grouping it as (1 + 2) × 3, the other as 1 + (2 × 3).
It is often possible to transform an ambiguous grammar so that it unambiguously defines the same language.
(1) E → E “+” T   (2) E → T   (3) T → T “×” F   (4) T → F   (5), (6), (7) F → “1” | “2” | “3”

With this grammar, 1 + 2 × 3 has exactly one parse tree, grouping it as 1 + (2 × 3).
Chapter 16
Normal Forms
16.1 Lambda Productions
It is sometimes convenient to include productions of the form A → Λ in a CFG.
For example, here are two CFGs, both defining the language described by the regular expression a∗b+:

S → AB        S → AB
A → Λ         S → B
A → aA        A → aA
B → b         A → a
B → bB        B → b
              B → bB
The grammar on the left is shorter and easier to understand; however, the lambda production A → Λ causes problems for parsing.
16.2 Eliminating Λ Productions
Suppose a CFG has a production A → Λ, which we wish to remove. Suppose that A is not the start symbol.
Any derivation that uses this production must also have used some production B → β A δ, for some nonterminal B (possibly A itself).
If we add to the grammar the single production B → β δ, this instance of the Lambda production is unnecessary, but the language accepted is the same.
If we carry out that process for every occurrence of A on the right-hand side of any production, we may eliminate A → Λ altogether.
For example, we take:
S → AB, A→ Λ, A→ aA, B → b, B → bB
and, noting that A occurs on the right-hand side of two productions, we add:
S → B, A→ a
Now, A → Λ is redundant, and we may remove it; the result is as given before.
However, the process may run into some problems.
16.3 Circularities
First, the process of eliminating a Λ production may introduce new ones. Consider:
S → aTU, T → Λ, T → U, U → T
This is a (rather long-winded) grammar for the language consisting of just the string a.
To remove T → Λ, we must add S → aU and U → Λ; the result is:
S → aTU, S → aU, T → U, U → T, U → Λ
Then, to remove U → Λ, we must add S → aT and T → Λ: and so we are no better off than where we started.
The solution is to remove the potential Lambda productions for T and U concurrently.
That is, we note that U, though not directly defined by a Lambda production, is nullable: there is a derivation
U ⇒ · · · ⇒ Λ
T is trivially nullable, because it has a Lambda production of its own.
We must add new productions for every possible nullable nonterminal on the right-hand side of any production.
S → aTU, T → Λ, T → U, U → T
T and U are both nullable, so we must add:
S → aU, accounting for T ⇒ Λ;
S → aT, accounting for U ⇒ Λ; and
S → a, accounting for TU ⇒ Λ.
Now, we no longer need the Lambda productions, so we get
S → aTU, S → aU, S → aT, S → a, T → U, U → T
Aside: At this point, we could note that T and U are useless, since no derivation involving either of them can terminate.
They can thus be removed, leaving just S → a.
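The whole procedure (compute the nullable nonterminals by a fixpoint, then add a production for every way of dropping nullable occurrences) can be sketched in Python. Productions here are (lhs, rhs-tuple) pairs; the representation is our own:

```python
from itertools import combinations

def nullable_set(prods):
    """Fixpoint: lhs is nullable if some rhs has every symbol nullable
    (in particular an empty rhs, i.e. a Lambda production)."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in prods:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable

def remove_lambda(prods, start):
    nullable = nullable_set(prods)
    new = set()
    for lhs, rhs in prods:
        idxs = [i for i, s in enumerate(rhs) if s in nullable]
        for r in range(len(idxs) + 1):
            for drop in combinations(idxs, r):   # drop any subset of nullables
                kept = tuple(s for i, s in enumerate(rhs) if i not in drop)
                if kept or lhs == start:         # no Λ rules except maybe on the start
                    new.add((lhs, kept))
    return new

# S -> aTU, T -> Λ, T -> U, U -> T  (uppercase = nonterminal)
prods = [("S", ("a", "T", "U")), ("T", ()), ("T", ("U",)), ("U", ("T",))]
for lhs, rhs in sorted(remove_lambda(prods, "S")):
    print(lhs, "->", "".join(rhs) if rhs else "Λ")
```

On this grammar the result is S → aTU, aU, aT, a together with T → U and U → T, matching the productions derived above.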
16.4 Λ Productions on the Start Symbol
Consider the CFG:
S → Λ, S → aS
It accepts the language described by the r.e. a∗.
If we attempt to remove S → Λ, we note that S is nullable, and add S → a, resulting in:
S → a, S → aS
However, this CFG now accepts Language(a+): the empty string is no longer allowed.
The same problem will arise any time S is nullable, not just when it is directly involved in a Lambda production.
We can make sure that the start symbol is never subjected to Lambda removal by first transforming our grammar so that the start symbol occurs exactly once, on the left-hand side of the first production.
Let G = (Σ, N, S, P ) be a CFG, with
{(S → α1), (S → α2), . . . , (S → αn)} ⊆ P
being all the productions for S; let S′ ∉ N be a brand new nonterminal.
The CFG G′ = (Σ, N ∪ {S′}, S′, P ∪ {S′ → S}) is equivalent to G, and may be safely subjected to Lambda removal.
If S was nullable, the additional production S′ → Λ will be generated; of course, this Lambda production must not be removed.
16.5 Example
S → cS, S → TU, T → Λ, T → aT, U → Λ, U → bTU
accepts the language L = Language(c∗a∗(ba∗)∗). Note that Λ ∈ L.
First introduce a new start symbol S′:
S′ → S, S → cS, S → TU, T → Λ, T → aT, U → Λ, U → bTU
Now, note that S, T , and U are all nullable, so add:
S′ → Λ, S → c, S → T, S → U,T → a, U → bT, U → bU, U → b
The result (with Λ productions removed) also accepts L.
16.6 Unit Productions
Definition: A unit production is a production n1 → n2, where n1, n2 ∈ N.
For example, consider S → AB, A → B, B → b.
The grammar accepts the language {bb}; the A nonterminal, with its unit production A → B, is merely a distraction, and the grammar could more informatively be written S → AA, A → b.
Any leftmost derivation using a unit production n1 → n2 must include sentential forms σ1 = α n1 β and σ2 = α n2 β (α ∈ Σ∗, β ∈ (Σ ∪ N)∗).
Since n2 is itself a nonterminal, the derivation must also include σ3 = α γ β, where (n2 → γ) ∈ P.
If we add the production n1 → γ, the unit production becomes unnecessary, and the derivation can go directly from σ1 to σ3.
However, once again we can get circularities. For example, if P contains both n1 → n2 and n2 → n1, we will indefinitely replace one by the other.
Instead, we use the following rule:
For every pair of nonterminals n1 and n2 such that n1 ⇒ · · · ⇒ n2, and for every nonunit production n2 → γ, add the production n1 → γ.
As long as all such replacements are done simultaneously, the unit productions may safely be eliminated.
See Cohen (pp273ff) for discussion and an example.
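The simultaneous-replacement rule can be implemented by first computing, for each nonterminal n1, every n2 with n1 ⇒ · · · ⇒ n2 through unit productions, and then copying across the nonunit productions. A Python sketch on the S → AB, A → B, B → b example (the (lhs, rhs-tuple) representation is our own):

```python
def remove_units(prods):
    """prods: set of (lhs, rhs-tuple); nonterminals are the lhs symbols."""
    nts = {lhs for lhs, _ in prods}
    # reach[n] = nonterminals reachable from n via unit productions
    reach = {n: {n} for n in nts}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in prods:
            if len(rhs) == 1 and rhs[0] in nts:        # a unit production
                for n in nts:
                    if lhs in reach[n] and rhs[0] not in reach[n]:
                        reach[n].add(rhs[0])
                        changed = True
    # copy every nonunit production of a reachable nonterminal
    new = set()
    for n in nts:
        for lhs, rhs in prods:
            if lhs in reach[n] and not (len(rhs) == 1 and rhs[0] in nts):
                new.add((n, rhs))
    return new

prods = {("S", ("A", "B")), ("A", ("B",)), ("B", ("b",))}
for lhs, rhs in sorted(remove_units(prods)):
    print(lhs, "->", "".join(rhs))
```

On this grammar the result is S → AB, A → b, B → b, with the unit production A → B gone.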
16.7 Example
We continue with the example from Slide 36.
S′ → S | Λ
S → cS | TU | c | T | U
T → aT | a
U → bTU | bT | bU | b

Direct unit productions are S′ → S, S → T, S → U.
Indirectly, we also have S′ ⇒ T, S′ ⇒ U.
From S′ ⇒ S with S → cS | TU | c we get S′ → cS | TU | c.
From S′ ⇒ T with T → aT | a we get S′ → aT | a.
From S′ ⇒ U with U → . . . we get S′ → bTU | bT | bU | b.
From S ⇒ T with T → aT | a we get S → aT | a.
From S ⇒ U with U → . . . we get S → bTU | bT | bU | b.
In summary:

S′ → Λ | cS | TU | c | aT | a | bTU | bT | bU | b
S → cS | TU | c | aT | a | bTU | bT | bU | b
T → aT | a
U → bTU | bT | bU | b
Chapter 17
Recursive Descent Parsing
17.1 Recognising CFLs (Parsing)

How can we determine whether a string w ∈ Σ∗ is in the language generated by a CFG, G = (Σ, N, S, P )?
We know that w ∈ L(G) iff there is a derivation for w from S – equivalently, if there is a parse tree with root S and fringe w.
To determine whether w ∈ L(G), we try to build a parse tree for w.
Top-down: Build parse tree starting from the root. At each step, choose:
– A nonterminal, N, in the fringe to expand.
– A rule with N as its lhs to apply.
Bottom-up: Build parse tree starting from the leaves and working upwards to the root. At each step, choose:
– A substring, α, of the current string to reduce.
– A rule with α as its rhs to apply.
17.2 Top-down Parsing
Consider the following grammar for nested lists of numbers:
L → “(” “)” | “(” B “)”   (1), (2)
B → E | E “,” B           (3), (4)
E → L | 1 | 2 | 3         (5), (6), (7), (8)

Let’s try to parse the input (1,()).
17.3 Top-down Parsing
Here is the leftmost derivation:
L ⇒ ( B ) ⇒ ( E , B ) ⇒ ( 1 , B ) ⇒ ( 1 , E ) ⇒ ( 1 , L ) ⇒ ( 1 , ( ) )

Working left to right: at each step, either expand the leftmost non-terminal, or match the leftmost terminal with an input symbol.
This is called LL(1) parsing.
17.4 Recursive Descent Parsing
Recursive Descent is a technique for building “one-off” LL(1) parsers.
• Parser is a set of mutually recursive procedures, with one procedure corresponding to each non-terminal.
• Each procedure decides which rule to use and looks for the symbols in the rhs of that rule.
• Matches the rhs from left to right.
• Must be able to choose rule by looking at next input.
17.5 Building a Parser for Nested Lists
• LL(1) grammar for nested lists:
List → “(” RestList              (1)
RestList → “)” | ListBody “)”    (2), (3)
ListBody → ListElt RestBody      (4)
RestBody → Λ | “,” ListBody      (5), (6)
ListElt → number | List          (7), (8)

• Parser will have one procedure for each nonterminal: ParseList, ParseRestList, etc.
• Input is a sequence of symbols, ss, recognised by a scanner, which also removes white space.
The symbols used are:
Symbol Kind    Symbol
lparensym      “(”
rparensym      “)”
commasym       “,”
numbersym      (0|1|2|3|4|5|6|7|8|9)+
17.6 Parser for Nested Lists
Rule: List→ “(” RestList (1)
procedure ParseList (in out ss);
begin
  if head(ss) = lparensym then
    ss := tail(ss); ParseRestList(ss)
  else Error
end ParseList
Rule: RestList→ “)” | ListBody “)” (2), (3)
procedure ParseRestList (in out ss);
begin
  if head(ss) = rparensym then
    ss := tail(ss)
  else
    ParseListBody(ss);
    if head(ss) = rparensym then ss := tail(ss) else Error
end ParseRestList
Rule: ListBody → ListElt RestBody (4)
procedure ParseListBody(in out ss);
begin
  ParseListElt(ss);
  ParseRestBody(ss)
end ParseListBody
Rule: RestBody → Λ | “,” ListBody (5), (6)
procedure ParseRestBody(in out ss);
begin
  if head(ss) = commasym then
    ss := tail(ss);
    ParseListBody(ss)
  else
    skip
end ParseRestBody
Rule: ListElt→ number | List (7), (8)
procedure ParseListElt(in out ss);
begin
  if head(ss) = numbersym then
    ss := tail(ss)
  else
    ParseList(ss)
end ParseListElt
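The pseudocode above translates almost line-for-line into a working parser. A Python sketch, passing the remaining symbols around as a list instead of an in-out parameter (the scanner is a minimal stand-in for the one assumed above):

```python
import re

def scan(text):
    """Tokenise into (kind, value) pairs, skipping white space."""
    tokens = []
    for m in re.finditer(r"\s*(\(|\)|,|\d+)", text):
        t = m.group(1)
        kind = {"(": "lparensym", ")": "rparensym", ",": "commasym"}.get(t, "numbersym")
        tokens.append((kind, t))
    return tokens

def parse_list(ss):                  # List -> "(" RestList
    if not ss or ss[0][0] != "lparensym":
        raise SyntaxError("expected (")
    return parse_rest_list(ss[1:])

def parse_rest_list(ss):             # RestList -> ")" | ListBody ")"
    if ss and ss[0][0] == "rparensym":
        return ss[1:]
    ss = parse_list_body(ss)
    if not ss or ss[0][0] != "rparensym":
        raise SyntaxError("expected )")
    return ss[1:]

def parse_list_body(ss):             # ListBody -> ListElt RestBody
    return parse_rest_body(parse_list_elt(ss))

def parse_rest_body(ss):             # RestBody -> Λ | "," ListBody
    if ss and ss[0][0] == "commasym":
        return parse_list_body(ss[1:])
    return ss

def parse_list_elt(ss):              # ListElt -> number | List
    if ss and ss[0][0] == "numbersym":
        return ss[1:]
    return parse_list(ss)

assert parse_list(scan("(1, ())")) == []   # empty remainder: input accepted
```

Each procedure returns the unconsumed symbols, which plays the role of the in-out parameter ss.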
17.7 Building a Parse Tree
How can we construct the parse tree?
• Insert code to collect the components corresponding to the RHS of the rule applied.
• Add code to “apply rule” at end of code that checks each rule.
• Scanner returns symbol value as well as symbol kind.
• Parser procedure returns tree as well as consuming input.
if head(ss).type = numbersym then
  t := Tree(“ListElt”, 〈Leaf(head(ss).value)〉);
  ss := tail(ss)
else
  ParseList(ss, u);
  t := Tree(“ListElt”, 〈u〉)
end ParseListElt
17.9 LL(1) Grammars
Recursive Descent is an LL(1) parsing technique:
• Leftmost derivations
• Left-to-right scanning of input
• 1 symbol lookahead.
This only works for some grammars:
Requirement 1:
Two productions for the same nonterminal cannot produce strings that start with the same terminal.
Requirement 2:
If a nonterminal can produce Λ, it cannot start with any terminal that can also follow it.
17.10 First and Follow sets
For any grammar (Σ, N, S, P ) and sentential form α ∈ (Σ ∪N)∗:
first(α) = {x ∈ Σ | α ⇒ xψ for some ψ ∈ Σ∗}
follow(α) = {x ∈ Σ | S ⇒ φαxψ for some φ, ψ ∈ (Σ ∪ N)∗}

That is, first(α) is all those terminals that can appear first in any string derived from α, and follow(α) is all those terminals that can appear immediately after α in any sentential form derived from the start symbol.

Now, the LL(1) requirements are:
Requirement 1: If N → α and N → β, then first(α) ∩ first(β) = ∅.
Requirement 2: If N ⇒ Λ, then first(N) ∩ follow(N) = ∅.
Consider the grammar for arithmetic expressions:

E → T | E + T | E − T    (1), (2), (3)
T → F | T × F | T/F      (4), (5), (6)
F → id | (E)             (7), (8)

This breaks LL(1) requirement (1): the three productions for E can all produce strings beginning with the same terminal (id or “(”), so one symbol of lookahead cannot choose between them.
A grammar such as S → aSa | bSb | Λ instead breaks LL(1) requirement (2), because S ⇒ Λ but first(S) = follow(S) = {a, b}.
If the parser sees a as the next symbol, it cannot decide whether to do Sa ⇒ Λa = a or Sa ⇒ aSaa.
• Some CFLs can be parsed deterministically bottom up, but not top down.
E.g. {ax+ybxcy | x, y ≥ 0}: S → Λ | aTb | aSc, T → Λ | aTb.
This breaks LL(1) condition (1) and can’t be left-factored: a top down parser can’t tell whether an initial a will eventually match a b or a c.
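first and follow can be computed by a fixpoint over the productions. A sketch of the first-set half in Python, run on the palindrome-style grammar S → aSa | bSb | Λ (uppercase symbols are nonterminals; follow is computed by a similar fixpoint, omitted here):

```python
def first_sets(prods, nts):
    """Fixpoint: grow first(N) and the nullable set until stable."""
    first = {n: set() for n in nts}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in prods:
            all_null = True
            for s in rhs:
                f = first[s] if s in nts else {s}   # first of a terminal is itself
                if not f <= first[lhs]:
                    first[lhs] |= f
                    changed = True
                if s not in nullable:               # stop at the first non-nullable symbol
                    all_null = False
                    break
            if all_null and lhs not in nullable:
                nullable.add(lhs)
                changed = True
    return first, nullable

# S -> aSa | bSb | Λ
prods = [("S", ("a", "S", "a")), ("S", ("b", "S", "b")), ("S", ())]
first, nullable = first_sets(prods, {"S"})
assert first["S"] == {"a", "b"} and "S" in nullable
```

Since a and b also follow S in sentential forms like aSa, this confirms first(S) ∩ follow(S) ≠ ∅, the requirement-2 violation discussed above.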
Chapter 18
Pushdown Automata
18.1 Finite and Infinite Automata
L0 = {ambn | m,n ≥ 0} is regular: a∗b∗ is its regular expression.
The NFAΛ, in text form: states S0 (start) and S1 (final); an a-loop on S0, a Λ-transition from S0 to S1, and a b-loop on S1.
L1 = {anbn | n ≥ 0} is not regular (pumping lemma), but it is context free: S → aSb | Λ is its CFG.
If we augment L0’s NFAΛ with a counter, a “super-NFA” can recognize L1: the a-loop on S0 now performs c := c + 1, the b-loop on S1 performs c := c − 1, and S1 accepts only when c = 0.
L2 = {w1w2 | w1, w2 ∈ {a, b}∗} is regular: (a + b)∗(a + b)∗ is its regular expression.
In text form: states S0 (start) and S1 (final); a- and b-loops on S0, a Λ-transition from S0 to S1, and a- and b-loops on S1.
L3 = {wwR | w ∈ {a, b}∗} is not regular, but it is context free: S → aSa | bSb | Λ is its CFG.
If we augment L2’s NFAΛ with a stack s, the machine can recognize L3: the loops on S0 read a or b and push it (s.push(a), s.push(b)); the loops on S1 read a or b and check it against the popped symbol (s.pop() = a, s.pop() = b); and S1 accepts only when s.isempty().
A Pushdown Automaton is simply an NFA augmented with a stack.
18.2 Pushdown Automata
A PDA is an NFAΛ with a stack.
At each transition, we can:
• read a symbol x;
• pop a symbol y from the stack; and
• push a symbol z onto the stack.
We draw this as follows:
q1 --x;y;z--> q2
Any of x, y, z may be Λ, indicating that nothing is read, popped, or pushed at that transition.

We label start states with − and final states with +, as before; but this time, a final state is accepting only if the stack is empty.
Our NPDA for L3 = {wwR} is now written:
S0 (start) --a;Λ;a--> S0
S0 --b;Λ;b--> S0
S0 --Λ;Λ;Λ--> S1 (final)
S1 --a;a;Λ--> S1
S1 --b;b;Λ--> S1
A stack may also serve as a counter, by stacking and matching some arbitrary symbol (say #); so L1 = {anbn} is:
S0 (start) --a;Λ;#--> S0
S0 --Λ;Λ;Λ--> S1 (final)
S1 --b;#;Λ--> S1
L4 = {anb2n | n ≥ 0} (S → aSbb | Λ)
S0 (start) --a;Λ;#--> S1
S1 --Λ;Λ;#--> S0
S0 --Λ;Λ;Λ--> S2 (final)
S2 --b;#;Λ--> S2
L5 = {anb⌊n/2⌋ | n ≥ 0}   (S → aT | T,   T → aaTb | Λ)
S0 (start) --a;Λ;#--> S0
S0 --Λ;Λ;Λ--> S1 (final)
S1 --Λ;#;Λ--> S2 (final)
S2 --b;#;Λ--> S1
L6 = {ambncm+n | m, n ≥ 0}   (S → aSc | T,   T → bTc | Λ)
S0 (start) --a;Λ;#--> S0
S0 --Λ;Λ;Λ--> S1
S1 --b;Λ;#--> S1
S1 --Λ;Λ;Λ--> S2 (final)
S2 --c;#;Λ--> S2
L7 = {ambncm−n | m ≥ n ≥ 0}   (S → aSc | T,   T → aTb | Λ)
S0 (start) --a;Λ;#--> S0
S0 --Λ;Λ;Λ--> S1 (final)
S1 --b;#;Λ--> S1
S1 --Λ;Λ;Λ--> S2 (final)
S2 --c;#;Λ--> S2
18.3 Formal Definition of PDA
P = (Σ,Γ, Q, q0, F, δ)
• Σ is the alphabet of input symbols, which can appear in sentences recog-nized by the PDA.
• Γ is the alphabet of symbols that can appear on the stack: may or maynot be the same as Σ.
• Q is the set of states.
• q0 ∈ Q is the start state.
• F ⊆ Q is the set of final states.
• δ is the transition function, which will (nondeterministically) map a state,an input symbol (or Λ), and a stack symbol (or Λ) to a new state and anew sequence of stack symbols:
δ : Q× (Σ ∪ {Λ})× (Γ ∪ {Λ}) → 2Q×Γ∗
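This transition function can be simulated directly. The following sketch (mine, not from the slides) searches the space of configurations (state, input position, stack) breadth-first, accepting in a final state with the input exhausted and the stack empty; it terminates only for PDAs that, like the one below, have no cycle of Λ-transitions that pushes:

```python
# A sketch (not from the slides) of nondeterministic PDA acceptance by
# breadth-first search over configurations (state, position, stack).
# Transitions are tuples (state, read, pop, new_state, push), with ''
# playing the role of Lambda. Assumes no Lambda-push cycles.
from collections import deque

def accepts(transitions, start, finals, w):
    """True iff some path reads all of w, ends final with an empty stack."""
    seen = set()
    queue = deque([(start, 0, '')])       # stack kept as a string, top at left
    while queue:
        q, i, stack = queue.popleft()
        if (q, i, stack) in seen:
            continue
        seen.add((q, i, stack))
        if q in finals and i == len(w) and stack == '':
            return True
        for (p, read, pop, p2, push) in transitions:
            if p != q:
                continue
            if read and (i >= len(w) or w[i] != read):
                continue
            if pop and (not stack or stack[0] != pop):
                continue
            new_stack = push + (stack[1:] if pop else stack)
            queue.append((p2, i + (1 if read else 0), new_stack))
    return False

# The NPDA for L3 = {w w^R}: push in state 0, guess the middle, match in 1.
l3 = [
    ('0', 'a', '', '0', 'a'), ('0', 'b', '', '0', 'b'),
    ('0', '', '', '1', ''),
    ('1', 'a', 'a', '1', ''), ('1', 'b', 'b', '1', ''),
]
print(accepts(l3, '0', {'1'}, 'abba'))   # True
print(accepts(l3, '0', {'1'}, 'aba'))    # False (odd length)
```

The breadth-first search is exactly the nondeterminism of the machine made explicit: every enabled transition spawns a new configuration to explore.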
18.4 Deterministic and Nondeterministic PDAs
PDAs may be deterministic or nondeterministic, much like finite acceptors. A deterministic PDA is one in which every input string has a unique path through the machine.
This means that at each state, it must be possible to deterministically decide whether to take a transition that:
• reads a symbol from the input, but pops nothing from the stack (x; Λ; ?)
• reads no input, but pops a symbol from the stack (Λ; y; ?);
• both reads and pops simultaneously (x; y; ?); or
• neither reads nor pops (Λ; Λ; ?).
Consider again the PDA for L6 = {ambncm+n}:
S0 (start) --a;Λ;#--> S0
S0 --Λ;Λ;Λ--> S1
S1 --b;Λ;#--> S1
S1 --Λ;Λ;Λ--> S2 (final)
S2 --c;#;Λ--> S2
Applying the same algorithm as we used for NFAΛ → FA:
S0 (start, final) --a;Λ;#--> S012
S0 --b;Λ;#--> S12
S0 --c;#;Λ--> S2
S012 (final) --a;Λ;#--> S012
S012 --b;Λ;#--> S12
S012 --c;#;Λ--> S2
S12 (final) --b;Λ;#--> S12
S12 --c;#;Λ--> S2
S2 (final) --c;#;Λ--> S2
The resulting PDA is deterministic, and accepts L6. However, this does not happen with L3 = {wwR}:
S0 (start) --a;Λ;a--> S0
S0 --b;Λ;b--> S0
S0 --Λ;Λ;Λ--> S1 (final)
S1 --a;a;Λ--> S1
S1 --b;b;Λ--> S1

Attempting the same algorithm merges S0 with the Λ-reachable S1 into a combined state that is both start and final and carries all four stack actions: a;Λ;a, b;Λ;b, a;a;Λ and b;b;Λ. When the next input symbol is a and an a is on top of the stack, both a;Λ;a (keep pushing) and a;a;Λ (start popping) are enabled, and similarly for b: the nondeterministic guess of where the middle of the string lies survives the construction, so the result is not deterministic.
This language cannot be parsed deterministically.
18.5 CFG ⊆ PDA
Every language generated by a CFG may be accepted by a PDA.
The proof is by construction. We will in fact show how to build two different PDAs (corresponding to top-down and bottom-up parsers) for every CFG.

For both constructions, we suppose we have a CFG G = (Σ, N, S, P) and we will construct a PDA P = (Σ, Γ, Q, q0, F, δ).
18.6 Top-Down construction
Σ = Σ,   Γ = N ∪ Σ,   Q = {0, 1},   q0 = 0,   F = {1}
• δ(0,Λ,Λ) 7→ (1, S)
• For each x ∈ Σ, δ(1, x, x) 7→ (1,Λ) (match)
• For each (X → α) ∈ P , δ(1,Λ, X) 7→ (1, α) (expand)
0 (start) --Λ;Λ;S--> 1 (final)
1 --x;x;Λ--> 1   (match)
1 --Λ;X;α--> 1   (expand)
18.7 S → aS | T ; T → b | bT
0 (start) --Λ;Λ;S--> 1 (final)
1 --a;a;Λ--> 1   (match)
1 --b;b;Λ--> 1   (match)
1 --Λ;S;aS--> 1   (expand)
1 --Λ;S;T--> 1   (expand)
1 --Λ;T;b--> 1   (expand)
1 --Λ;T;bT--> 1   (expand)
state   input   stack (top to left)   action
0       abb     Λ
1       abb     S                     expand
1       abb     aS                    match
1       bb      S                     expand
1       bb      T                     expand
1       bb      bT                    match
1       b       T                     expand
1       b       b                     match
1       Λ       Λ                     accept
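The expand/match machine can be run by searching over (input position, stack) pairs. This is a sketch of mine, not from the slides; the pruning bound relies on the fact that, in this particular grammar, every stack symbol derives at least one terminal:

```python
# A sketch (not from the slides): the top-down expand/match PDA for
# S -> aS | T, T -> b | bT, simulated by breadth-first search over
# (position, stack) pairs. The stack is a string with its top at the left.
from collections import deque

PRODS = {'S': ['aS', 'T'], 'T': ['b', 'bT']}

def accepts(w):
    queue, seen = deque([(0, 'S')]), set()
    while queue:
        i, stack = queue.popleft()
        # Prune: every stack symbol here derives >= 1 terminal, so a stack
        # longer than the remaining input (plus slack) can never succeed.
        if (i, stack) in seen or len(stack) > len(w) - i + 1:
            continue
        seen.add((i, stack))
        if i == len(w) and stack == '':
            return True
        if stack == '':
            continue
        top, rest = stack[0], stack[1:]
        if top in PRODS:                   # expand
            for rhs in PRODS[top]:
                queue.append((i, rhs + rest))
        elif i < len(w) and w[i] == top:   # match
            queue.append((i + 1, rest))
    return False

print(accepts('abb'))   # True, as in the trace above
print(accepts('a'))     # False: T always produces at least one b
```

Each loop iteration performs exactly one expand or match step from the table above; the queue holds the alternatives a nondeterministic machine would explore in parallel.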
18.8 Bottom-Up construction
Σ = Σ,   Γ = N ∪ Σ,   Q ⊇ {qp, qf},   q0 = qp,   F = {qf}
• For each x ∈ Σ, δ(qp, x,Λ) 7→ (qp, x) (shift)
• For each (X → α) ∈ P , where α = α0, . . . , αn:
– create new states {q1, . . . , qn};
– δ(qp,Λ, αn) 7→ (qn,Λ)
– δ(qn,Λ, αn−1) 7→ (qn−1,Λ)
– · · ·
– δ(q1,Λ, α0) 7→ (qp, X) (reduce)
• δ(qp,Λ, S) 7→ (qf ,Λ)
18.9 S → aS | T ; T → b | bT
qp (start) --a;Λ;a--> qp   (shift)
qp --b;Λ;b--> qp   (shift)
qp --Λ;b;T--> qp   (reduce T → b)
qp --Λ;T;S--> qp   (reduce S → T)
qp --Λ;S;Λ--> q′,  q′ --Λ;a;S--> qp   (reduce S → aS)
qp --Λ;T;Λ--> q′′,  q′′ --Λ;b;T--> qp   (reduce T → bT)
qp --Λ;S;Λ--> qf (final)   (accept)

(The intermediate states are unnamed in the slide; we call them q′ and q′′ here.)
state   input   stack (top to left)   action
qp      abb     Λ                     shift
qp      bb      a                     shift
qp      b       ba                    shift
qp      Λ       bba                   reduce
qp      Λ       Tba                   reduce
qp      Λ       Ta                    reduce
qp      Λ       Sa                    reduce
qp      Λ       S                     accept
qf      Λ       Λ
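The shift/reduce machine admits a similar sketch (again mine, not from the slides), with the stack kept as a string whose top is at the left, as in the trace:

```python
# A sketch (not from the slides): nondeterministic shift-reduce recognition
# for S -> aS | T, T -> b | bT, searching over (position, stack) pairs.
from collections import deque

PRODS = [('S', 'aS'), ('S', 'T'), ('T', 'b'), ('T', 'bT')]

def accepts(w):
    queue, seen = deque([(0, '')]), set()
    while queue:
        i, stack = queue.popleft()        # stack as a string, top at the left
        if (i, stack) in seen:
            continue
        seen.add((i, stack))
        if i == len(w) and stack == 'S':
            return True                    # whole input reduced to the start symbol
        if i < len(w):                     # shift the next input symbol
            queue.append((i + 1, w[i] + stack))
        for lhs, rhs in PRODS:             # reduce: top of stack is rhs reversed
            if stack.startswith(rhs[::-1]):
                queue.append((i, lhs + stack[len(rhs):]))
    return False

print(accepts('abb'))   # True, matching the trace above
```

Because symbols are pushed as they are shifted, a right-hand side sits on the stack reversed, which is why the reduce step matches `rhs[::-1]`.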
18.10 PDA ⊆ CFG
Every language accepted by a PDA may be generated by a CFG. We must show how to construct a CFG G = (Σ, N, S, P) from an arbitrary PDA P = (Σ, Γ, Q, q0, F, δ).
To make the construction simpler, we suppose:

• |F| = 1, i.e. P has only one accepting state. If |F| > 1, add new states q′, qf ∉ Q, put F = {qf}, and, for some new stack symbol y′ ∉ Γ and for each q ∈ F, add transitions q --Λ;Λ;y′--> q′ --Λ;y′;Λ--> qf.
• Every transition either pushes one stack symbol, or pops one stack symbol, but not both.
– Replace any transition q1 --x;y;z--> q3 that has both y and z not Λ (x ∈ Σ ∪ {Λ}) by the transitions q1 --x;y;Λ--> q2 --Λ;Λ;z--> q3, for some new state q2 ∉ Q.

– Replace any transition q1 --x;Λ;Λ--> q3 by the transitions q1 --x;Λ;y′--> q2 --Λ;y′;Λ--> q3, for some new state q2 ∉ Q and stack symbol y′ ∉ Γ (pushing y′ and immediately popping it leaves the stack unchanged).
Each nonterminal Apq in the CFG represents a sequence of transitions from state p to state q, with no net change to the stack.
Note that the first transition in the sequence must be a push, and the last must be a pop: p --x;Λ;y--> r · · · s --x′;z;Λ--> q.
Case 1: y = z. Put Apq → xArsx′.

Case 2: y ≠ z. There must be some intermediate transition which pops the y pushed by the first transition: p --x;Λ;y--> r · · · r′ --x′′;y;Λ--> s′ · · · s --x′;z;Λ--> q. Put Apq → Aps′As′q.
18.11 PDA to CFG Formally
We transform P = (Σ, Γ, Q, q0, {qf}, δ) to G = (Σ, N, S, P).
Let N = Q × Q: nonterminals of the grammar are pairs of states of the automaton. For convenience, we write Apq for the pair (p, q).
Let S = Aq0qf.
1. For each p ∈ Q, put App → Λ in P .
2. For each p, q, r ∈ Q, put Apq → AprArq in P .
3. For each p, q, r, s ∈ Q, y ∈ Γ, and x, x′ ∈ Σ ∪ {Λ}: if δ contains transitions p --x;Λ;y--> r and s --x′;y;Λ--> q, put Apq → xArsx′ in P.
18.12 Example (L6 from slide 69)
0 (start) --a;Λ;#--> 0
0 --Λ;Λ;Λ--> 1
1 --b;Λ;#--> 1
1 --Λ;Λ;Λ--> 2 (final)
2 --c;#;Λ--> 2
• We have just one final state.
• The Λ;Λ;Λ transitions are not allowed. Choose $ ∉ Γ, and put instead:

0 (start) --a;Λ;#--> 0
0 --Λ;Λ;$--> 3
3 --Λ;$;Λ--> 1
1 --b;Λ;#--> 1
1 --Λ;Λ;$--> 4
4 --Λ;$;Λ--> 2 (final)
2 --c;#;Λ--> 2
1. A00 → Λ,  A11 → Λ,  A22 → Λ,  A33 → Λ,  A44 → Λ
2. A02 → A01A12
A03 → A01A13 | A02A23
A04 → A01A14 | A02A24 | A03A34
(N.B. we should also have e.g. A01 → A03A31, but we can see from the shape of the PDA that this will be useless.)
3. (a) Consider the transition 0 --a;Λ;#--> 0. Its “pair” is 2 --c;#;Λ--> 2. So: A02 → aA02c.

(b) Consider the transition 1 --b;Λ;#--> 1. Its “pair” is 2 --c;#;Λ--> 2. So: A12 → bA12c.

(c) Consider the transition 0 --Λ;Λ;$--> 3. Its “pair” is 3 --Λ;$;Λ--> 1. So: A01 → ΛA33Λ, i.e. A01 → A33.

(d) Consider the transition 1 --Λ;Λ;$--> 4. Its “pair” is 4 --Λ;$;Λ--> 2. So: A12 → ΛA44Λ, i.e. A12 → A44.
Summarising:
A02 → A01A12 | aA02c
A01 → A33
A12 → bA12c | A44
A33 → Λ
A44 → Λ
and simplifying (A02 becomes S, A12 becomes T, everything else derives only Λ):

S → T | aSc
T → bTc | Λ
which indeed generates the language {ambncm+n | m,n ≥ 0}.
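One way to check this claim mechanically is to enumerate every terminal string the grammar derives up to some length. A sketch (not from the slides; the function name and the uppercase-means-nonterminal convention are my own):

```python
# A sketch (not from the slides): enumerate all terminal strings of
# bounded length derivable from a CFG, using leftmost derivation steps.
# Nonterminals are uppercase letters; terminals are lowercase.

def language(prods, start, maxlen):
    """All terminal strings of length <= maxlen derivable from start."""
    out, frontier = set(), {start}
    while frontier:
        sf = frontier.pop()                     # a sentential form
        nts = [i for i, s in enumerate(sf) if s.isupper()]
        if not nts:
            out.add(sf)
            continue
        i = nts[0]                              # leftmost nonterminal
        for rhs in prods[sf[i]]:
            new = sf[:i] + rhs + sf[i + 1:]
            # prune forms whose terminals already exceed the length bound
            if sum(1 for c in new if not c.isupper()) <= maxlen:
                frontier.add(new)
    return out

prods = {'S': ['T', 'aSc'], 'T': ['bTc', '']}   # S -> T | aSc, T -> bTc | Lambda
want = {'a' * m + 'b' * n + 'c' * (m + n)
        for m in range(4) for n in range(4) if m + n <= 3}
print(language(prods, 'S', 6) == want)          # True
```

Because every sentential form of this grammar contains at most one nonterminal, the frontier stays small; for grammars with many nonterminals the enumeration grows quickly, so this is only a testing aid.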
Chapter 19
Non-CF Languages
19.1 Not All Languages are Context Free
We know Reg ⊂ CFL: all regular languages are context free, but there are context free languages that are not regular.
Are there languages that are not context free? Yes, there are many such languages!
The following languages cannot be generated by any context free grammar, nor can they be recognized by any push-down automaton:

• {anbncn | n ≥ 0} (would need two counters)
• {ww | w ∈ {a, b}∗} (would need a queue, not a stack)
• {next week’s lotto numbers} (would need a miracle)
Many of the constraints on programming languages cannot be expressed (easily) using CFGs. For example:

• all the identifiers in a list of declarations must be distinct
• identifiers must be declared before they are used
• procedure calls must have arguments consistent with their declarations.
These are constraints on the context in which a particular piece of otherwise context-free syntax may occur. Approaches to dealing with them include:

ad hoc approaches: write a CFG to define the CF parts of the language (sometimes called a covering grammar), build a parser, then augment the parser with code to check the context constraints;

attribute grammars: CFGs annotated to show the extra relationships between nonterminals;
context-sensitive grammars.
19.2 Context Sensitive Grammars
A phrase-structure grammar is a structure G = (Σ, N, S, P), where Σ, N, and S are as we have seen already.
Different classes of languages arise by placing different constraints on the productions, P.
Let X,Y ∈ N ; φ ∈ Σ∗; and α, β ∈ (Σ ∪N)∗.
Regular grammar: X → φY or X → φ
Context free grammar: X → β
Context sensitive grammar: α→ β, where |α| ≤ |β|.
For example, a CSG may include a production abSd → abcTd, which says in effect that S → cT, but only if it is preceded by ab and followed by d.
19.3 Example (1)
The following CSG generates the language described by the regular expression (a+b)∗(ac+bd).

S → aS
S → bS
aS → aT
bS → bU
T → c
U → d
19.4 Example (2)
The following CSG generates the language {wcw | w ∈ {a, b}∗}.
S → c     S → aTS    S → bUS
Ta → aT   Tb → bT    Tc → ca
Ua → aU   Ub → bU    Uc → cb
19.5 Generating the empty string
Our definition for CSG has productions α → β where |α| ≤ |β|. This doesn’t permit Λ productions (|β| = 0), so we must also allow S → Λ if the language to be generated contains Λ.
The following CSG generates the language {anbncn | n ≥ 0}.
S → Λ | aTbc
aTb → aaTbbU | ab
Ub → bU
Uc → cc
19.6 CFL ⊂ CSL
We just saw an example of a language (anbncn) that is context sensitive but not context free (proof: pumping lemma for context free languages – not covered in this course).

To show CFL ⊂ CSL, we need only show that every context free language is context sensitive.
Let G = (Σ, N, S, P) be a CFG for a language L. We will construct a CSG G′ = (Σ, N, S, P′).
Without loss of generality, suppose P has no Λ productions, except perhaps on S.
For each production X → β in P, put α → β (where α = X) in P′. Now every production has the form α → β where α, β ∈ (Σ ∪ N)∗, |α| = 1, and (with perhaps one permitted exception) |β| ≥ 1, so G′ is a CSG.
Chapter 20
Closure Properties
20.1 Closure Properties
We now turn our attention to closure properties.
Recall that regular languages are closed under union, concatenation, Kleene closure, complementation, and intersection. That is, if L1 and L2 are regular:
• L1 ∪ L2 is regular;
• L1L2 is regular;
• L∗1 is regular;
• the complement of L1 is regular; and
• L1 ∩ L2 is regular.
Do the corresponding closure properties hold for context-free languages?
20.2 Union of Context Free Languages is Context Free
Theorem If L1 and L2 are context free languages, their union L1 ∪ L2 is also context free.
Proof (using grammars) Let G1 = (Σ1, N1, S1, P1) and G2 = (Σ2, N2, S2, P2) be CFGs for L1 and L2 respectively. Without loss of generality, let N1 ∩ N2 = ∅ (if not, systematically rename all nonterminals in one of the grammars). Also, let S ∉ N1 ∪ N2 be a fresh nonterminal symbol.
G3 = (Σ1 ∪ Σ2, N1 ∪ N2 ∪ {S}, S, P1 ∪ P2 ∪ {S → S1, S → S2}) is a CFG for L1 ∪ L2.
Example
Language(G1) = {ambm}:   S → aSb | Λ
Language(G2) = {bncn}:   S → bSc | Λ

First, rename the nonterminals of G1 and G2 apart, by adding subscripts:

S1 → aS1b | Λ    S2 → bS2c | Λ
Now form the union as above; the resulting grammar:
S → S1 | S2 S1 → aS1b | Λ S2 → bS2c | Λ
is a CFG for {ambm} ∪ {bncn}.
Alternative proof (using PDAs) Let P1 = (Σ1, Γ1, Q1, q1, F1, δ1) and P2 = (Σ2, Γ2, Q2, q2, F2, δ2) be PDAs for L1 and L2 respectively.
Without loss of generality, let Q1 ∩ Q2 = ∅, and let q0 ∉ Q1 ∪ Q2 be a fresh state.

P3 = (Σ1 ∪ Σ2, Γ1 ∪ Γ2, Q1 ∪ Q2 ∪ {q0}, q0, F1 ∪ F2, δ3) is a PDA for L1 ∪ L2, where δ3 is δ1 ∪ δ2 together with transitions δ3(q0, Λ, Λ) 7→ (q1, Λ) and δ3(q0, Λ, Λ) 7→ (q2, Λ).
Example
P1:
q1 (start) --a;Λ;#--> q1
q1 --Λ;Λ;Λ--> q2 (final)
q2 --b;#;Λ--> q2

P2:
q3 (start) --b;Λ;#--> q3
q3 --Λ;Λ;Λ--> q4 (final)
q4 --c;#;Λ--> q4

P3: a new start state q0 with q0 --Λ;Λ;Λ--> q1 and q0 --Λ;Λ;Λ--> q3; the transitions of P1 and P2 are unchanged, and q1, q3 are no longer start states.
20.3 Concatenation of CF Languages is Context Free

Theorem If L1 and L2 are context free languages, their concatenation L1 ⌢ L2 is also context free.
Proof (using grammars) Let G1 and G2 be as before, with no shared nonterminals, and S a fresh nonterminal. Then G3 = (Σ1 ∪ Σ2, N1 ∪ N2 ∪ {S}, S, P1 ∪ P2 ∪ {S → S1S2}) is a CFG for L1 ⌢ L2.
20.4 Kleene Star of a CF Language is CF

Theorem If L is a context free language, so is L∗.
Proof (using grammars) Let G = (Σ, N, S, P), and let S′ ∉ N be a fresh nonterminal.
G′ = (Σ, N ∪ {S′}, S′, P ∪ {S′ → Λ, S′ → SS′}) is a CFG for L∗.
Example L = {ambm}, L∗ = {ambm}∗

G = ({a, b}, {S}, S, {S → aSb, S → Λ})
G′ = ({a, b}, {S, S′}, S′, {S′ → SS′, S′ → Λ, S → aSb, S → Λ})
Exercise Eliminate the Λ productions from G′.
20.5 Intersections and Complements of CF Languages
Theorem The intersection L1 ∩ L2 of CF languages L1 and L2 may be CF, or it may not.
Proof We already know that intersections of regular languages are regular, so if L1 and L2 are regular (and hence CF), L1 ∩ L2 is also regular (and hence CF).
However, if we take L1 = {anbncm} and L2 = {anbmcm}, which are both CF (see Cohen, p385), their intersection is L3 = {anbncn}, which we have already seen is not CF.
Theorem The complement L′ of a CF language L may or may not be CF.
Proof Suppose L1 and L2 are CF; we know L1 ∩ L2 may be non-CF.

However, if the complement of every CF language were CF, then L1 ∩ L2 = (L1′ ∪ L2′)′ would always be CF (since CFLs are closed under union): contradiction!
Chapter 21
Summary of CF Languages
21.1 Why context-free?
• There are non-regular languages
– Proof: the pumping lemma
• Sentences of many languages have a natural nested or recursive structure

– contrast with regular languages, whose structure is essentially sequence/selection/repetition

• the nested structure can be exposed by writing context free grammars, and by drawing parse trees

– terminals appear in sentences, and at the leaves of parse trees

– nonterminals name categories of fragments of sentences, and appear on the interior nodes of parse trees.

– productions describe how the nonterminals relate to one another, and to the terminals.
21.2 Phrase-structure grammars
A phrase-structure grammar G = (Σ, N, S, P ) where:
• Σ is a finite set of terminals;
• N is a finite set of nonterminals, Σ ∩N = ∅;
• S is the start symbol, S ∈ N ; and
• P is a set of productions.
A string (or sentence) is a sequence of terminals: an element of Σ∗.
A sentential form is a sequence of terminals and/or nonterminals: an element of (Σ ∪ N)∗.
A production is a pair of sentential forms, written α → β (α, β ∈ (Σ ∪ N)∗).
21.3 Special cases of Phrase-structure grammars
• A Context Free Grammar (CFG) is a phrase-structure grammar in which every production α → β has α ∈ N. That is, the left-hand side of every production is a single nonterminal.

We call the class of languages generated by CFGs context free languages (CFL).
• A Regular Grammar (RG) is a CFG with the additional property that, for every production α → β, β ∈ Σ∗ or β ∈ Σ∗N. That is, the right-hand side of every production is a sequence (perhaps empty) of terminals, optionally followed by a single nonterminal.

The class of languages generated by RGs is the same as the class accepted by finite automata, so by Kleene’s theorem is the same as the class of regular languages.

Every RG is a CFG, so every regular language is a context free language: Reg ⊆ CFL.
• A Context Sensitive Grammar (CSG) is a phrase-structure grammar in which every production α → β has either |α| ≤ |β|, or α = S and β = Λ. That is, the left- and right-hand sides are arbitrary sentential forms, with the only restriction being that the right-hand side is no shorter than the left-hand side, except that there may be a single Λ production for the start symbol. They are sometimes called non-reducing grammars.

We call the class of languages generated by CSGs context sensitive languages (CSL).

Every CFG may be transformed to an equivalent CFG whose only Λ production is on the start symbol, so every CFG is equivalent to a CSG. Hence, every context-free language is a context sensitive language: CFL ⊆ CSL.
21.4 Derivations
Given G = (Σ, N, S, P) and w ∈ Σ∗, a derivation in G of w is a sequence α0 . . . αn of sentential forms, where α0 = S, αn = w, and for each 0 ≤ i < n, αi+1 is derived from αi by replacing some nonterminal X in αi by β, where (X → β) ∈ P.
G generates w iff there is a derivation in G of w.
A leftmost derivation is one in which the leftmost nonterminal of αi is always replaced. If there is a derivation in G of w, there is a leftmost derivation.
String w is ambiguous for G if there is more than one leftmost derivation in G of w. G is ambiguous if at least one w is ambiguous for G. There may or may not be an equivalent unambiguous grammar.
21.5 Parsing
Parsing is the process of finding derivations. The derivation may be summarised by a parse tree.
Parsing in regular grammars essentially simulates the operation of an NFA. Parsing in arbitrary grammars is, in general, more difficult.
21.6 Recursive descent and LL(1) grammars
An LL(1) grammar is a CFG that may be parsed top-down and deterministically. A grammar is LL(1) if every nonterminal satisfies the two requirements.
If a grammar is not LL(1), there may or may not be an equivalent LL(1) grammar.
A recursive descent parser is a “one-off” recogniser for an LL(1) grammar.
21.7 Pushdown automata
A PDA P = (Σ, Γ, Q, q0, F, δ) is an automaton that can accept a string w if there is a path from the initial state q0 to some final state in F such that:
• the stack is empty initially and finally;
• the sequence of read symbols along the path is w;
• each pop symbol along the path matches the symbol currently at the topof the stack.
The class of languages accepted by PDAs is exactly the same as the class of languages generated by CFGs (i.e., the context-free languages).
A PDA is deterministic (DPDA) if, for every q ∈ Q, x ∈ Σ, and y ∈ Γ, there is at most one enabled transition.
For a given CFL, there may or may not be a DPDA that accepts it.
21.8 Closure
• There are non-CF languages: for example, {anbncn | n ≥ 0}.
– Proof : the pumping lemma for CF languages (NOT DONE).
• The class of CFLs is closed under:
– union
– concatenation
– Kleene closure
• The class of CFLs is not closed under:
– intersection
– complementation
21.9 Constructions on CFLs
• FA to regular grammar
• Regular grammar to FA (NOT DONE)
• Ambiguous to unambiguous grammar (perhaps)
• Remove Lambda productions
• Remove unit productions
• CFG to LL(1) form (perhaps)
• CFG to PDA (top-down and bottom-up)
• PDA to CFG
Part III
Turing Machines
Chapter 22
Turing Machines I
22.1 Introduction
So far in COMP 202 you have seen:
• finite automata
• pushdown automata
In this part of COMP 202 we will look at Turing machines.
Turing machines are named after the English logician Alan Turing. They were introduced in 1936 in his paper ‘On computable numbers, with an application to the Entscheidungsproblem’.
We can think of finite automata, pushdown automata and Turing machines as models of computing devices. A finite automaton has
• a finite set of states
• no memory.
A pushdown automaton has
• a finite set of states
• unlimited memory with restricted access.
A Turing machine has
• a finite set of states
• unlimited memory with unrestricted access
Dates:
• Finite automata were first described in 1943
• Pushdown automata were first described in 1961
• Turing machines were first described in 1936
22.2 Motivation
While thinking of finite automata, pushdown automata, and Turing machines as machines of increasing power is quite useful, it does not give any insight into why or how Turing invented his abstract machines. This story is worth telling.
Nowadays, it is impossible to say exactly how many computing devices there are in the world; we can only give some vague estimate in terms of hundreds of thousands, or millions. We can say with great precision how many computers there were in the world in the early 1930s: none.
How then did Turing come to invent Turing machines?
22.3 A crisis in the foundations of mathematics
In the beginning of the 20th century, mathematics was facing a crisis in its foundations.
David Hilbert was the leading mathematician of the early 1900s. He believedthat mathematics would escape intact from the crisis it faced.
Hilbert believed that mathematics was decidable: that is, for every mathematical problem, there is an algorithm which either solves it or shows that no solution is possible.

The problem of showing that mathematics was decidable came to be known as the Entscheidungsproblem: German for ‘decision problem’.
In order to make progress on this problem it was necessary to get a clearer idea about:

• algorithms, and

• the class of computable functions.
22.4 The computable functions: Turing’s approach
The question that Alan Turing was really trying to investigate was “What class of functions can be computed by a person?”
In the 1930s a ‘computer’ was a person who performed calculations. The calculations must be algorithmic in nature: that is, they must be the sorts of thing one could, in principle, build a machine to perform.
In ‘On computable numbers, with an application to the Entscheidungsproblem’, Turing imagines what actions a computer could perform, and tries to abstract away the details.
22.5 What a computer does
Imagine a person sitting at a desk performing calculations on paper. The person can:
• read the symbols that have been written on the paper,
• write symbols on the paper,
• erase what has been written, and
• perform actions dependent on what symbols were read.
We abstract away some of the details.
We assume that the paper is in the form of a tape of individual squares, each of which can either be blank or hold just one symbol. The computer can focus attention on only one cell of the tape at a time. We assume that the tape is not limited.
The actions that the abstract computer can perform are then:
• reading a symbol,
• writing a symbol,
• erasing a symbol, and
• focussing attention on the next or the previous cell.
We can give a formal description of these abstract machines.
Turing then asserts that these actions are the sorts of things one could, in principle, build a machine to perform.
Turing did, in fact, become involved with the early efforts to build real, physical machines, but in the 1930s his focus was on purely abstract machines.
22.6 What have we achieved?
Now we have a formal model of an abstract computer, we can study the functions that it can compute. We can call such functions “the Turing machine computable functions”.
There is no certainty that the Turing machine computable functions are all and only the computable functions, because we may have made a mistake when analysing the actions of the computer.
22.7 The computable functions: Church’s approach
At the same time that Turing was thinking about abstract machines, the American logician Alonzo Church was taking a different approach to defining the class of computable functions.

Church had developed a notation for describing functions, called the λ-calculus. The language of the λ-calculus is very simple. A term of the λ-calculus is:
• a variable, or
• an application of two λ-terms, or
• the abstraction of a variable over a λ-term
More formally:
TERM → VAR
| TERM TERM
| λ VAR . TERM
VAR → x1, x2, x3, . . .
We have one rule, called β-reduction:
(λx.M)N →β [N/x]M
where [N/x]M is read as “substitute N for x in M”. Substitution is algorithmic.
Now, the λ-calculus looks nothing like Turing machines, and it was constructed on a completely different basis. Nonetheless, the λ-calculus lets us define functions. We can call such functions “the λ-calculus computable functions”.
Just as before, there is no certainty that the λ-calculus computable functions are all and only the computable functions.
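To make β-reduction concrete, here is a sketch (not from the slides) with λ-terms represented as nested tuples, and capture-naive substitution, which is safe for the example below because no variable capture can occur:

```python
# A sketch (not from the slides): lambda-terms as nested tuples —
# ('var', x), ('app', M, N), ('lam', x, M) — with capture-naive
# substitution (adequate here; full substitution must rename to
# avoid capturing free variables).

def subst(term, x, n):
    """[N/x]M: replace free occurrences of variable x in term by n."""
    kind = term[0]
    if kind == 'var':
        return n if term[1] == x else term
    if kind == 'app':
        return ('app', subst(term[1], x, n), subst(term[2], x, n))
    # 'lam': if the binder rebinds x, no occurrence inside is free
    if term[1] == x:
        return term
    return ('lam', term[1], subst(term[2], x, n))

def beta(term):
    """One beta step at the root: (lambda x. M) N -> [N/x]M."""
    if term[0] == 'app' and term[1][0] == 'lam':
        _, (_, x, m), n = term
        return subst(m, x, n)
    return term

identity = ('lam', 'x', ('var', 'x'))
print(beta(('app', identity, ('var', 'y'))))   # ('var', 'y')
```

The single rule really is the whole calculus: repeated β-steps (wherever a redex appears, not just at the root) are what "computing with the λ-calculus" means.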
22.8 First remarkable fact
Although the λ-calculus and Turing’s abstract machines look completely different, it turns out that they both define exactly the same class of functions.
So the λ-calculus computable functions are just the same functions as the Turing machine computable functions.
Many other approaches to defining the class of computable functions have been proposed, e.g.
• the µ-recursive functions
• Post systems
• unlimited register machines (URMs)
• Minsky systems
• . . .
All have been shown to define exactly the same class of functions.
Furthermore, no-one has come up with a function which is obviously computable and which is not in this class.
22.9 Church-Turing thesis
The assertion that the class of Turing machine computable functions is the class of computable functions is called the Church-Turing thesis.

The Church-Turing thesis is not something that can be formally proven. This is not because we are stupid, or lack cunning, but because it relates our informal notion of computable with a formal system.

It is, of course, possible to give a formal proof that the λ-calculus computable functions are just the same functions as the Turing machine computable functions.
22.10 Second important fact
Every Turing machine embodies an algorithm. We can think of the initial configuration of the tape for a Turing machine as the data (or input) which the machine is supplied with.
• We can describe one Turing machine to another by using symbols on a tape.

• We can construct a Turing machine TU which takes as input a description of any Turing machine T1, and behaves just like T1.
22.11 The universal machine
A machine like TU is called a universal Turing machine.
A universal Turing machine can be made to behave like any Turing machine,just by supplying it with appropriate data.
The existence of universal Turing machines is quite remarkable.
Physical calculating machines had been constructed prior to the 1930s, but these were all special purpose machines.
Turing had shown that special purpose machines are pointless:

• if you want a machine to add up tables of financial data, build a universal machine and then describe the appropriate special machine to it;

• if you want a machine to find numeric solutions to differential equations, build a universal machine and then describe the appropriate special machine to it;

• if you want a machine to play music backwards, build a universal machine and then describe the appropriate special machine to it;

• if you want a machine to do anything (algorithmic): build a universal machine and then describe the appropriate special machine to it.
In modern parlance we call a universal Turing machine “a computer”, and we call the process of supplying it with appropriate data to mimic another Turing machine “programming”.
This is the sense in which a computer is a general purpose machine.
Of course, in 1936 it was not possible to build a practical physical approximation to a universal Turing machine.
22.12 Third important fact
We have still not seen what Turing machines have to do with the Entscheidungsproblem.
We can use the idea of a universal Turing machine to show that there are indeed undecidable problems in mathematics.
Because a universal Turing machine can be used to encode any Turing machine, we can use universal machines to ask questions about Turing machines themselves. We can use this technique to construct a purely mathematical problem which is undecidable.
Chapter 23
Turing machines II
23.1 Introduction
In the last lecture we looked at how Turing machines came to be developed, and gave an informal description of:
• the Church-Turing thesis
• Universal Turing machines
• formally undecidable problems
Now we will proceed with a formal development of the theory of Turing machines.
Recall that we stated that we can think of finite automata, pushdown automata and Turing machines as models of computing devices. A finite automaton has
• a finite set of states
• no memory.
A pushdown automaton has
• a finite set of states
• unlimited memory with restricted access.
A Turing machine has
• a finite set of states
• unlimited memory with unrestricted access
23.2 Informal description
Recall that our informal description of a Turing machine was that there was:
• an infinite tape
• a head, which can:
– read a symbol
– write a symbol
– erase a symbol
– move left
– move right
23.3 How Turing machines behave: a trichotomy
Consider the following program (adapted from Cohen):
read x;
if x < 0 then halt;
if x = 0 then x := 1/x;
while x > 0 do x := x + 1
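The same trichotomy can be observed by transcribing the program into Python (a sketch, not from the slides): run returns for negative input, raises ZeroDivisionError for zero, and loops forever for positive input.

```python
# A sketch (not from the slides) of the program above. Calling run(x)
# halts for x < 0, crashes for x = 0, and runs forever for x > 0.
def run(x):
    if x < 0:
        return 'halted'
    if x == 0:
        x = 1 / x        # crash: division by zero
    while x > 0:
        x = x + 1        # runs forever: x only grows
    return 'halted'

print(run(-1))           # halted
```

Note that only the halting and crashing cases can be demonstrated by actually calling the function; the looping case has to be argued about, which is precisely the point the slide is making.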
This program can do 3 things:
• it can halt;
• it can crash;
• it can run forever.
Turing machines exhibit the same behaviour. They can:
• halt;
• crash;
• run forever.
23.4 Informal example
Suppose we want to define a Turing machine to accept the language
{w#w|w ∈ {0, 1}∗}
Our input string will be presented to us on a tape, and initially the head is located at the leftmost end of the tape.
For example the input might be 10011#10011.
How could we decide whether the string was in the language? Clearly the head is going to have to move back and forth along the string.
Let’s see how we might go about this.
What we do first depends on whether we are looking at a 0, a 1 or a #.
If we are looking at a # then we must move right and check that the next cell is empty.
If it is a 0 or a 1, then what should we do?
• we should mark the cell as visited
• we should go off and find the #
• then we should find a 0 or a 1 as appropriate in the next cell
• then we should mark this cell
• and then we should repeat this until we are finished
We can mark a cell by writing a new symbol, say x, in it.
But of course now we have a tape with 0’s, 1’s, #’s and x’s in it, so our method will have to change a bit.
23.5 Towards a formal definition
The formal definition of a Turing machine follows a similar pattern to the formal definitions that we have given of finite automata and pushdown automata.
Different textbooks give slightly different definitions. For example
• John Martin’s ‘Introduction to languages and the theory of computation’ defines a Turing machine as a 5-tuple,

• Michael Sipser’s ‘Introduction to the theory of computation’ defines a Turing machine as a 7-tuple, and

• Daniel Cohen’s ‘Introduction to computer theory’ splits the difference and says we have to give 6 things to define a Turing machine.

So, care must be taken when reading from more than one source. We shall use Cohen’s definition, with slight adaptations.
23.6 Alphabets
In a Turing machine we need two alphabets,
• Σ, the input alphabet
• Γ, the tape alphabet
We use ∆ for the blank symbol, much as we use Λ for the empty string and ∅ for the empty language.
Cohen stipulates that ∆ ∉ Σ and ∆ ∉ Γ. Often we will have Σ ⊂ Γ.
23.7 The head and the tape
We have an infinite tape, and a head which is located at one of the cells.
We supply our Turing machine with input w = w1w2 . . . wn−1wn by entering the symbols w1, w2, . . . , wn−1, wn in the first n cells of the tape.
All the other cells in the tape initially have ∆ in them.
Initially the head is located at the first cell in the tape.
Because we can write on the tape we can use it as a memory.
23.8 The states
We have a finite set of states: Q.
One of the states is the start state: q0 ∈ Q.
Some subset of the states are the halt states: F ⊆ Q.
23.9 The transition function
We have a transition function, δ, which depends on:
• which state we are in
• what symbol is in the cell the head is at
and which can tell us
• what state to go to
• what symbol to write in the cell the head is at
• whether to move the head left or right.
23.10 Configuration
So, as we compute we (usually) move through the states of the machine, and (usually) the head moves along the tape.

We can represent the configuration of the machine as a triple, consisting of:
• the state the machine is in
• the contents of the tape
• the location of the head
23.11 Formal definition
Definition 12 (Turing machine) A Turing machine is a 6-tuple (Q, Σ, Γ, δ, q0, F) where:
• Q is a finite set: the states
• Σ is a finite set: the input alphabet
• Γ is a finite set: the tape alphabet
• δ is a function from Q × (Σ ∪ Γ ∪ {∆}) to Q × (Γ ∪ {∆}) × {L, R}: the transition function
• q0 ∈ Q: the start state
• F ⊆ Q: the final or accepting states.
23.12 Representing the computation
We represent the configuration of a Turing machine by writing the state above the tape:

s
σ1 . . . σk . . . σn

where:

• s is the state the machine is in

• σ1 . . . σk . . . σn is the ‘meaningful’ part of the tape

• σk, the symbol the state is written above, is the symbol about to be read

For example, if the start state is S1, and the input is babba, then the initial configuration will be:

S1
babba
We can use a sequence of configurations to trace the computation that themachine performs.
23.13 A simple machine
Let M1 = (Q, Σ, Γ, δ, q0, F), where

Q = {S1, S2, S3, S4}   Σ = {a, b}   Γ = {a, b}   q0 = S1   F = {S4}
and δ is given by the table:
State  Reading  State  Writing  Moving
S1     a        S2     a        R
S1     b        S2     b        R
S2     b        S3     b        R
S3     a        S3     a        R
S3     b        S3     b        R
S3     ∆        S4     ∆        R
23.14 Graphical representation of M1
[State diagram of M1: start state 1 goes to state 2 on a,a,R and on b,b,R; state 2 goes to state 3 on b,b,R; state 3 loops on a,a,R and on b,b,R; state 3 goes to the halt state 4 on ∆,∆,R.]
• Every time we read a symbol we move one step to the right along the tape
• Writing the symbol we have just read leaves the cell unaffected by our visit.
23.15 Some traces
If we give this machine abb we will get:

S1      S2      S3      S3       S4
abb  →  abb  →  abb  →  abb∆  →  abb∆∆
If we give this machine bab we will get:

S1      S2
bab  →  bab  →  crash
A little thought should show that this machine accepts Language((a + b)b(a + b)*).

We should not be surprised that this machine accepts a regular language, as it just traversed the input string from left to right and did not change the contents of the tape.
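The traces above can be reproduced with a small simulator (a sketch; the encoding of machines as Python dictionaries is our own, not from the lectures). A step limit stands in for loop detection, since looping cannot in general be decided:

```python
# A minimal Turing machine simulator (a sketch).  The machine M1 above
# is encoded as a transition table:
#   (state, symbol) -> (new state, symbol to write, move)
BLANK = "∆"

M1 = {
    ("S1", "a"): ("S2", "a", "R"),
    ("S1", "b"): ("S2", "b", "R"),
    ("S2", "b"): ("S3", "b", "R"),
    ("S3", "a"): ("S3", "a", "R"),
    ("S3", "b"): ("S3", "b", "R"),
    ("S3", BLANK): ("S4", BLANK, "R"),
}

def run(delta, start, halts, word, max_steps=10_000):
    """Run a TM; return 'halt', 'crash', or 'loop?' (step limit hit)."""
    tape = dict(enumerate(word))          # sparse tape; blanks implicit
    state, head = start, 0
    for _ in range(max_steps):
        if state in halts:
            return "halt"
        key = (state, tape.get(head, BLANK))
        if key not in delta:
            return "crash"                # undefined transition
        state, written, move = delta[key]
        tape[head] = written
        head += 1 if move == "R" else -1
    return "loop?"                        # we cannot actually decide looping!

print(run(M1, "S1", {"S4"}, "abb"))   # halt
print(run(M1, "S1", {"S4"}, "bab"))   # crash
```

The sparse-dictionary tape gives a tape that is infinite in both directions without allocating it up front.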
23.16 Turing machines can accept regular languages
This is a general property, which we can express as a theorem.
Theorem 15 Every regular language can be accepted by a Turing machine.
The proof consists of taking an FA which accepts a regular language and turning it into a Turing machine which accepts the same language. Basically, we add a new halt state and adjust the transition function.
23.17 Proof
Let L be a regular language. Then L is accepted by an FA ML = (Q, Σ, δ, q0, F). Then TL = (Q′, Σ′, Γ, δ′, q′0, F′) where:
• F ′ = {SHALT}
• Q′ = Q ∪ F ′
• Σ′ = Σ
• Γ = Σ
• if δ(s, σ) is defined then δ′(s, σ) = (δ(s, σ), σ, R)
• ∀f ∈ F. δ′(f, ∆) = (SHALT, ∆, R)
is a Turing machine which accepts L.
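A sketch of this construction, using a hypothetical dictionary encoding of transition functions (the encoding is ours, not from the lectures):

```python
# Sketch of the FA-to-TM construction in the proof.  Each FA move
# becomes a TM move that writes back the symbol it just read and moves
# right; a fresh halt state SHALT is added.
BLANK = "∆"
HALT = "SHALT"

def fa_to_tm(fa_delta, accepting):
    """fa_delta: dict (state, symbol) -> state (a partial FA transition
    function).  Returns (tm_delta, halt_states)."""
    tm_delta = {(s, c): (s2, c, "R") for (s, c), s2 in fa_delta.items()}
    # From any accepting FA state, reading the blank just past the end
    # of the input sends the TM to the new halt state.
    for f in accepting:
        tm_delta[(f, BLANK)] = (HALT, BLANK, "R")
    return tm_delta, {HALT}

# The FA for (a+b)b(a+b)*, re-derived as a TM; compare M1 above:
fa = {("S1", "a"): "S2", ("S1", "b"): "S2", ("S2", "b"): "S3",
      ("S3", "a"): "S3", ("S3", "b"): "S3"}
tm, halts = fa_to_tm(fa, {"S3"})
print(tm[("S3", BLANK)])   # ('SHALT', '∆', 'R')
```

Undefined FA transitions stay undefined in the TM, so rejected strings crash, and since the head only ever moves right, loop(TL) = ∅.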
23.18 Another machine

Let M2 = (Q, Σ, Γ, δ, q0, F), where Q = {S1, S2, S3, S4, S5, S6}, Σ = {a, b}, Γ = {a, A, B}, q0 = S1, F = {S6}, and δ is given by:
State  Reading  State  Writing  Moving
S1     a        S2     A        R
S2     a        S2     a        R
S2     B        S2     B        R
S2     b        S3     B        L
S3     B        S3     B        L
S3     A        S5     A        R
S3     a        S4     a        L
S4     a        S4     a        L
S4     A        S1     A        R
S5     B        S5     B        R
S5     ∆        S6     ∆        R
23.19 Graphical representation of M2
[State diagram of M2: start state 1, halt state 6, with the transitions listed in the table above.]
This machine accepts the language {aⁿbⁿ | n > 0}.
This machine does write on the tape, and the head moves both left andright.
23.20 A trace
If we give this machine aabb we will get:

S1       S2       S2       S3       S4       S1       S2
aabb  →  Aabb  →  Aabb  →  AaBb  →  AaBb  →  AaBb  →  AABb  →

S2       S3       S3       S5       S5       S5        S6
AABb  →  AABB  →  AABB  →  AABB  →  AABB  →  AABB∆  →  AABB∆∆
To get a clearer idea of what is going on here, try tracing the computation on a longer string like aaaabbbb.
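One way to follow that suggestion is to let a program produce the trace (a sketch using our own dictionary encoding of M2; inputs outside the language make the table lookup fail, which corresponds to a crash):

```python
# Tracing M2, the a^n b^n machine, configuration by configuration.
BLANK = "∆"
M2 = {
    ("S1", "a"): ("S2", "A", "R"), ("S2", "a"): ("S2", "a", "R"),
    ("S2", "B"): ("S2", "B", "R"), ("S2", "b"): ("S3", "B", "L"),
    ("S3", "B"): ("S3", "B", "L"), ("S3", "A"): ("S5", "A", "R"),
    ("S3", "a"): ("S4", "a", "L"), ("S4", "a"): ("S4", "a", "L"),
    ("S4", "A"): ("S1", "A", "R"), ("S5", "B"): ("S5", "B", "R"),
    ("S5", BLANK): ("S6", BLANK, "R"),
}

def trace(delta, start, halts, word):
    """Return the list of (state, tape) configurations; a KeyError
    (undefined transition) models a crash."""
    tape, state, head = list(word), start, 0
    configs = []
    while True:
        configs.append((state, "".join(tape)))
        if state in halts:
            return configs
        if head == len(tape):
            tape.append(BLANK)        # extend the tape only on demand
        state, written, move = delta[(state, tape[head])]
        tape[head] = written
        head += 1 if move == "R" else -1

for state, tape in trace(M2, "S1", {"S6"}, "aabb"):
    print(state, tape)
```

Running this prints the 14 configurations of the trace above; the final tape appears as AABB∆ rather than AABB∆∆ only because this simulator extends the tape lazily.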
Chapter 24
Turing Machines III
24.1 An example machine
Consider the following machine T1 = (Q, Σ, Γ, δ, q0, F), where

Q = {S1, S2, S3}   Σ = {a, b}   Γ = {a, b}   q0 = S1   F = {S3}

and δ is given by the table:

State  Reading  State  Writing  Moving
S1     ∆        S1     ∆        R
S1     b        S1     b        R
S1     a        S2     a        R
S2     a        S3     a        R
S2     b        S1     b        R
[State diagram of T1: start state 1 loops on ∆,∆,R and on b,b,R; state 1 goes to state 2 on a,a,R; state 2 goes back to state 1 on b,b,R; state 2 goes to the halt state 3 on a,a,R.]
Now, T1 behaves as follows:
• if the string contains aa we reach the halt state, so the string is accepted by the TM

• if the string does not contain aa and ends in an a then the machine crashes

• if the string does not contain aa and ends in a b then the machine loops (it moves right over blanks forever)
24.2 Some definitions
Any Turing machine can exhibit this trichotomy, so, for every Turing machine T we define
• accept(T ) to be the set of strings on which T halts
• reject(T ) to be the set of strings on which T crashes
• loop(T ) to be the set of strings on which T loops
24.3 Computable and computably enumerable languages
Definition 13 (Computable language) A language L is computable if there is some Turing machine T such that:

1. accept(T) = L

2. loop(T) = ∅

3. reject(T) = L̄

(Here L̄ denotes the complement of L.)
Condition 2 tells us that the Turing machine must either halt gracefully or crash: it cannot go on forever.
Definition 14 (Computably enumerable language) A language L is computably enumerable if there is some Turing machine T such that:

1. accept(T) = L

2. loop(T) ∪ reject(T) = L̄
Some authors (Turing and Cohen included) use the terms “recursive” and “recursively enumerable”.
Theorem 16 Every computable language is computably enumerable.
Proof Suppose L is computable. Then by definition there is a Turing machine T such that

accept(T) = L
loop(T) = ∅
reject(T) = L̄

Now, loop(T) ∪ reject(T) = ∅ ∪ L̄ = L̄. Hence every computable language is computably enumerable.
24.4 Deciders and recognizers
Informally, we can think of a computable language as being one for which we can write a decider, a program whose behaviour is sure to tell us whether a string is in the language or not.
We can think of a computably enumerable language as one for which we can write a recognizer. A recognizer is a program which will tell us if a string is in the language, but which may loop if the string is not in the language.

A language for which we can write a decider is called a decidable language. A language for which we can write a recognizer is called a semi-decidable language.
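As a toy illustration (our own, not from the lectures), take the set of perfect squares: a decider answers on every input, while a naive unbounded search is only a recognizer:

```python
# A decider always halts with an answer; a recognizer is only
# guaranteed to halt on members of the set.
def decider(n):
    """Always terminates: True iff n is a perfect square."""
    i = 0
    while i * i < n:
        i += 1
    return i * i == n

def recognizer(n):
    """Returns True iff n is a perfect square -- but loops forever
    on non-squares, because the search is never cut off."""
    i = 0
    while True:
        if i * i == n:
            return True
        i += 1

print(decider(49), decider(50))   # True False
```

For perfect squares the search is easy to bound, so a decider exists; the point of the next sections is that for some languages no such bound (and no decider) can exist at all.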
24.5 A decidable language
Recall that a language is just a set of strings, so we can think of this problem in terms of whether membership of some arbitrary set is decidable.

As long as we can encode the elements of the set as strings, the set is a language.
Let B be a finite automaton, and w a string over the alphabet of B. Consider the set:
AFA = {(B,w)|w ∈ Language(B)}
Is AFA decidable, i.e. is it decidable whether a string is in the language of afinite automaton?
Theorem 17 If B is a finite automaton, and w a string over the alphabet of B, then {(B, w) | w ∈ Language(B)} is decidable.
Proof We already know that, for any FA B, there is a TM B′ that accepts the same language. By inspection of the construction, we can easily tell that loop(B′) = ∅.

We must construct a Turing machine which takes as input a description of B′ and the string w, and which halts if w ∈ accept(B′), and crashes otherwise.

A universal TM can do this.
24.6 Another decidable language
If A is an FA, then the set:
{A | Language(A) = ∅}
is also decidable. We must write a program which takes (a description of) A and checks whether any accepting state can be reached from the start state.
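The check is just graph reachability. A breadth-first search sketch, with FAs encoded as dictionaries (our own encoding, not from the lectures):

```python
# Emptiness test for an FA: Language(A) = ∅ iff no accepting state is
# reachable from the start state.
from collections import deque

def fa_is_empty(delta, start, accepting):
    """delta: dict (state, symbol) -> state."""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if s in accepting:
            return False              # an accepting state is reachable
        for (s1, _), s2 in delta.items():
            if s1 == s and s2 not in seen:
                seen.add(s2)
                queue.append(s2)
    return True

# A one-state FA over {a, b} whose accepting state q is unreachable:
d = {("p", "a"): "p", ("p", "b"): "p"}
print(fa_is_empty(d, "p", {"q"}))     # True: its language is ∅
```

Since the search visits each state at most once, this always terminates: a decider, not merely a recognizer.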
24.7 A corollary
Write LA = Language(A) and LB = Language(B), and let C be an FA with

Language(C) = (LA ∩ L̄B) ∪ (L̄A ∩ LB)

Language(C) is the symmetric difference of LA and LB. Since the set of regular languages is closed under complementation, intersection and union, Language(C) is regular if LA and LB are.
Now, Language(C) = ∅ iff Language(A) = Language(B). So, in order to check whether two FA accept the same language we only have to write a program which takes their descriptions, constructs the FA which accepts their symmetric difference, and checks whether this is ∅.
24.8 Even more decidable sets
We have discussed FA and regular languages. What about the set:
APDA = {(P,w)|w ∈ Language(P )}
where P is a pushdown automaton, and w a word over the input alphabet of P. Is this set decidable?

It turns out that this set also is decidable. This is why we like context-free grammars: we can be sure we can write parsers for them. In fact, the proof that APDA is decidable allows us to construct a parser from any CFG.
24.9 An undecidable set
What about the set:
ATM = {(T,w)|w ∈ accept(T )}
where T is a Turing machine and w is a word over the input alphabet of T. This set is not decidable. We will see why later.
Chapter 25
Turing Machines IV
25.1 Introduction
We are looking at how we can construct an undecidable set. Recall:
Definition 15 (Computable language) A language L is computable if there is some Turing machine T such that:

1. accept(T) = L

2. loop(T) = ∅

3. reject(T) = L̄ (the complement of L)
Informally, we can think of a computable language as being one for which we can write a decider, a program whose behaviour is sure to tell us whether a string is in the language or not.
A language for which we can write a decider is called a decidable language.
Definition 16 (Computably enumerable language) A language L is computably enumerable if there is some Turing machine T such that:

1. accept(T) = L

2. loop(T) ∪ reject(T) = L̄
25.2 AFA is decidable
Let B be a finite automaton, and w a string over the alphabet of B; then
AFA = {(B,w)|w ∈ Language(B)}
is decidable. We can construct a Turing machine which takes as input (a description of) B and the string w, and which halts if w ∈ Language(B), and crashes otherwise.
25.3 APDA is decidable
Let P be a pushdown automaton, and w a string over the alphabet of P; then
APDA = {(P,w)|w ∈ Language(P )}
is decidable. We can construct a Turing machine which takes as input (a description of) P and the string w, and which halts if w ∈ Language(P), and crashes otherwise.
25.4 The halting problem
Let T be a Turing machine, and w a string over the input alphabet of T; then
ATM = {(T,w)|w ∈ accept(T )}
is not decidable. This problem is often called the halting problem, because it asks about the halting behaviour of Turing machines.

Proof Since we are trying to show that ATM is not decidable, we assume that ATM is decidable and derive a contradiction. If ATM is decidable then there is a Turing machine H such that:
1. accept(H) = ATM
2. loop(H) = ∅
3. reject(H) = the complement of ATM
We can write this as:
w ∈ accept(T) implies (T, w) ∈ accept(H)

w ∉ accept(T) implies (T, w) ∈ reject(H)
or we could treat H as a little program:
H(T, w) ≜ if T(w) halts then “yes” else “no”
Now, suppose we define a new machine D, which accepts machines which do not accept themselves as input.

There is no reason why we cannot give a Turing machine itself as input, any more than there is a reason why we cannot give a program itself as input. We can write D as a little program:
D(T) ≜ if H(T, T) = “yes” then crash else halt
Now, what happens if we give D itself as input?
D(D) = if H(D, D) = “yes” then crash else halt
But our definition of H requires that H(D, D) = “yes” precisely when D(D) halts.
So, D(D) = if D(D) halts then crash else halt. That is, D ∈ reject(D) if and only if D ∈ accept(D).

Now, it is not possible for D ∈ accept(D) and D ∈ reject(D) both to hold, so D ∈ loop(D). Hence, D(D) loops; but D(D) can only loop if H(D, D) loops, since whenever H answers, D either crashes or halts. This is the contradiction that we sought: we assumed loop(H) = ∅.

Hence, if T is a Turing machine, and w a string over the input alphabet of T, then

ATM = {(T, w) | w ∈ accept(T)}

is not decidable.

Our undecidable problem really turns on the existence of universal Turing machines, i.e. on the fact that Turing machines are powerful enough to describe themselves and their own behaviour.
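The argument can be written as a program sketch. Here `halts` is the hypothetical decider H; the point of the proof is precisely that no such function can be implemented:

```python
# The diagonal argument as a program sketch.
def halts(program, arg):
    """Hypothetical decider H: would return True iff program(arg)
    halts gracefully.  No such function can actually exist."""
    raise NotImplementedError("no such decider exists")

def D(program):
    """Crashes on programs that accept themselves, halts otherwise."""
    if halts(program, program):
        raise RuntimeError("crash")   # crash when the input accepts itself
    return "halt"

# D(D) has no consistent behaviour: if D(D) halts then halts(D, D) is
# True, so D(D) crashes -- and vice versa.  Hence `halts` cannot exist.
```

Running D(D) here simply raises NotImplementedError, which is the honest outcome: the decider it relies on is fictional.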
25.5 Other undecidable problems
Are there “real” problems which are undecidable, or are they merely mathematical curiosities?
First order logic It is not possible to write a theorem prover which, given a logical expression, is certain to be able to say whether the expression can be proved or not.

It is possible to write a semi-decision procedure: given a provable expression it is possible to say that it is provable.
Verification It is not possible to write a program which, given a specification S and a program P, will determine whether the program P meets the specification S.
Programming It is not possible to write a program which, given a specificationS, will construct a program P that meets it.
25.6 Closure properties
Theorem 18 The class of computable languages is closed under
1. union,
2. intersection, and
3. complementation.
(It is closed under concatenation and Kleene closure as well, but we won’t prove these results.)
Proof: Suppose L1 and L2 are computable. Then there exist Turing machines T1 and T2 with accept(Ti) = Li, loop(Ti) = ∅ and reject(Ti) = L̄i (for i = 1, 2).

1. (Union) Let T3 be a TM that simulates T1 and T2 simultaneously. For example, it might perform steps of T1 and T2 in turn, on disjoint parts of the tape. Let T3 halt if T1 or T2 halts; otherwise, both T1 and T2 crash, so T3 crashes.

2. (Intersection) Similarly, let T4 simulate T1 and T2 simultaneously, halting only if both T1 and T2 halt, and crashing if either crashes.
3. (Complementation) Let T5 simulate T1. Let T5 crash if T1 halts, and halt if T1 crashes (one of these must eventually happen!). Then:

accept(T5) = reject(T1) = L̄1

reject(T5) = accept(T1) = L1

loop(T5) = ∅
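The interleaved (“dovetailed”) simulation in the union construction can be sketched with Python generators standing in for the two machines — our own toy encoding, in which each generator yields once per step and finally returns "halt" or "crash":

```python
# Sketch of T3: run two step-generators in turn; halt if either halts,
# crash only once both have crashed.  (If both could loop we would loop
# too, but for computable L1, L2 neither machine loops.)
def interleave(gen1, gen2):
    outcomes = [None, None]
    gens = [gen1, gen2]
    while True:
        for i, g in enumerate(gens):
            if outcomes[i] is None:
                try:
                    next(g)                  # one step of machine i
                except StopIteration as e:
                    outcomes[i] = e.value    # the generator's return value
                    if e.value == "halt":
                        return "halt"
        if outcomes[0] == "crash" and outcomes[1] == "crash":
            return "crash"

def machine(result, n_steps):
    """A stand-in machine that runs n_steps steps, then halts/crashes."""
    for _ in range(n_steps):
        yield "running"
    return result

print(interleave(machine("crash", 3), machine("halt", 5)))  # halt
```

Note that the slow machine never blocks the fast one: the strict alternation is what lets T3 halt as soon as either component does.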
25.7 Computable and computably enumerable languages
Theorem 19 A language L is computable iff L is c.e. and L̄ is c.e.

Proof: “Only if” is easy: L is computable, so by closure, L̄ is computable. All computable languages are c.e.

“If” is harder. Let T1 and T2 be TMs such that:

accept(T1) = L    reject(T1) ∪ loop(T1) = L̄
accept(T2) = L̄    reject(T2) ∪ loop(T2) = L

Construct a TM T that simulates T1 and T2 simultaneously. If T1 halts, T halts. If T2 halts, T crashes. One of these must happen, since every string w belongs either to L or to L̄.
Now that we have an undecidable language we can go further and define a language which is not even computably enumerable.
Theorem 20 The complement of ATM is not c.e.

Proof: ATM is c.e., by the universal Turing machine. If the complement of ATM were also c.e. then ATM would be computable (Theorem 19). But ATM is not computable, so the complement of ATM is not c.e.
Chapter 26
Turing Machines V
26.1 A hierarchy of classes of language
We have now seen in COMP 202 a whole collection of classes of languages:
• all possible languages
• c.e. languages
• computable languages
• context-sensitive languages
• context-free languages
• regular languages
• finite languages
Each of these is a proper subset of the one above it.
26.2 A hierarchy of classes of grammar
In the 1950s the linguist Noam Chomsky produced a hierarchy of classes of grammars; four of the classes above correspond exactly to its levels (Types 0–3, tabulated in section 26.4 below).

Chomsky’s hierarchy is important, although it does not include every possible class of language. We can look for a finer structure than the one we have presented (LL(1) grammars, for example), but this is outside the scope of this course.
Chomsky was (is) a linguist, and he was really concerned with what sorts of grammars are required to describe natural languages, like English, Maori, Urdu, Swahili and so on.
Consider the following examples (adapted from Gazdar and Mellish’s Natural Language Processing in PROLOG):
• A doctor hired another doctor.
• A doctor whom a doctor hired hired another doctor.
• A doctor whom a doctor whom a doctor hired hired hired another doctor.
• A doctor whom a doctor whom a doctor whom a doctor hired hired hiredhired another doctor.
• . . .
These sentences are of the form:

• A doctor (whom a doctor)ⁿ (hired)ⁿ hired another doctor.

so the language is context-free but not regular. Are there any phenomena in English which require us to go beyond context-free?
Surprisingly, the answer is no!

In fact there is only one natural language which requires that we use a context-sensitive grammar. There is a structure which can occur in the dialect of Swiss-German spoken around Zurich which makes use of strings of the form:

aᵐbⁿcᵐdⁿ

This apparent lack of complexity in natural languages seems surprising: surprising enough that the authors of books on artificial intelligence and on formal language theory regularly make false pronouncements on this issue. Moral: look to linguists for facts about natural languages.
26.4 A hierarchy of classes of automaton
For four of these classes of language, there is a corresponding class of automaton:

Type    Automaton                  Language
Type 0  Turing machine             computably enumerable languages
Type 1  Linear-bounded automaton   context-sensitive languages
Type 2  Pushdown automaton         context-free languages
Type 3  Finite automaton           regular languages
We have studied all except linear-bounded automata.
26.5 Deterministic and nondeterministic automata
We know that for Type 3 automata, nondeterminism makes no difference: Kleene’s theorem tells us that the class of languages accepted by nondeterministic finite automata (NFAs) is the same as the class accepted by deterministic finite automata (FAs).
We know that for Type 2 automata, nondeterminism does make a difference: there are languages that can be accepted by nondeterministic pushdown automata (PDAs) that cannot be accepted by any deterministic pushdown automaton (DPDA).
We ignore Type 1 automata. What about Type 0?
26.6 Nondeterminstic Turing Machines
Our definition of Turing Machines was deterministic: given a state and a symbol on the tape, δ tells us exactly which state to go to, what to write on the tape, and in which direction to move the head:

δ : Q × (Σ ∪ Γ ∪ {∆}) → Q × (Γ ∪ {∆}) × {L, R}
This can easily be modified to allow nondeterministic Turing Machines (NTMs):
δ′ : Q × (Σ ∪ Γ ∪ {∆}) → 2^(Q × (Γ ∪ {∆}) × {L, R})
A string w is accepted by an NTM N = (Q, Σ, Γ, δ′, q0, F) if there is some path from q0 to some q ∈ F on a tape loaded initially with w.

We must be more careful about looping NTMs (what do we say if, for some NTM N and some string w, there is a path through N that rejects, and another that loops?).

If we consider only accepting paths, and do not distinguish rejecting from looping (so we cannot distinguish computable from computably enumerable languages), we get the surprising result that nondeterminism makes no difference.
26.7 NTM=TM
Clearly any deterministic Turing Machine can be described by an NTM: nondeterminism is not compulsory! So TM ⊆ NTM.

We must show NTM ⊆ TM: that is, that any NTM T may be simulated by a TM T′.

An NTM has only finitely many edges: label the edges of T with unique natural numbers. Now, for any string w ∈ accept(T), there is at least one finite sequence of labels corresponding to T accepting w.

We design a Universal Turing Machine, T′, that enumerates all paths in turn, checks whether the path is valid for w and T, and accepts when it finds one that is (Cohen has the details). Hence, accept(T) ⊆ accept(T′).
If w ∉ accept(T), T′ will try longer and longer paths and will never halt. Hence every string outside accept(T) is in loop(T′), so accept(T′) ⊆ accept(T).

Together with reject(T′) = ∅ (obvious), we have accept(T′) = accept(T).
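The enumeration at the heart of this simulation can be sketched as follows. Generating label sequences in length order guarantees that every finite path is eventually produced (the function name and encoding are our own; the real T′ would also check each candidate path against w and T):

```python
# Enumerate all finite edge-label sequences, shortest first -- the core
# of the deterministic simulation of an NTM.
from itertools import count, product

def paths(num_edges):
    """Yield every sequence over {0, ..., num_edges-1}, in length order."""
    for n in count(1):
        yield from product(range(num_edges), repeat=n)

g = paths(2)
first_six = [next(g) for _ in range(6)]
print(first_six)   # [(0,), (1,), (0, 0), (0, 1), (1, 0), (1, 1)]
```

Because the enumeration is exhaustive, an accepting path is always found if one exists; because it is endless, a string with no accepting path sends the simulator into loop(T′), exactly as the proof requires.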
26.8 More variations on Turing Machines
Our Turing Machines have a distinguished set of final states, and at each stepread and write one tape symbol and move left or right.
It makes no difference to the class of languages accepted if, instead of a setof final states, we add a “halt” instruction (H):
δ : Q × (Σ ∪ Γ ∪ {∆}) → Q × (Γ ∪ {∆}) × {L, R, H}
It makes no difference to the class of languages accepted if, instead of insisting on a tape move at each step, we allow a “stay” instruction (S):
δ : Q × (Σ ∪ Γ ∪ {∆}) → Q × (Γ ∪ {∆}) × {L, R, S}
Our Turing Machines have a single tape that is infinite in both directions, a distinguished set of final states, and at each step read and write one tape symbol and move left or right.

It makes no difference to the class of languages accepted if we restrict the tape so that it is infinite in only one direction (in fact, that is Cohen’s first definition).
It makes no difference to the class of languages accepted if we allow multiple infinite tapes: the TM decides what to do based on the values beneath the head on all k tapes, and writes all k tapes simultaneously. Of course, k > 0.
Chapter 27
Summary of the course
This lecture is a summary of the whole course.
27.1 Part 0 – Algorithms and Programs
• Specifications
– signature
– preconditions
– postconditions
• Imperative languages and applicative languages
• Program verification
– assertions
– invariants
27.2 Part I – Formal languages and automata
• Definitions of alphabet, word, language . . .
• Regular expressions and regular languages
• Finite automata:
– Deterministic finite automata
– Nondeterministic finite automata
– Nondeterministic finite automata with Λ
• Kleene’s Theorem
• Pumping Lemma
• Closure properties
27.3 Part II – Context-Free Languages
• Regular grammars
• Context-free grammars
• Normal forms
• Recursive descent parsing
• LL(1) grammars
• Pushdown automata
• Deterministic and nondeterministic PDAs
• Top-down and bottom-up parsers
• Non-CF languages
• Closure properties
27.4 Part III – Turing Machines
• Origins and definition of Turing machines
• Universal machines
• Computable and computably enumerable languages
• Undecidable problems
• Chomsky hierarchy
• Variations on Turing machines
27.5 COMP 202 exam
According to the University’s www page http://www.vuw.ac.nz/timetables/exam-timetable.aspx the three-hour final exam is:
on Monday 31 October, in HMLT206, starting at 9:30am
The exam will cover the whole course but will have slightly more emphasis on the second half, as the mid-term test examined the first half.
The format will be similar, though not necessarily identical, to the last two years’ exams, which may be found on the course web site.