Automata and Formal Language Theory
Stefan Hetzl
Institute of Discrete Mathematics and Geometry
Vienna University of Technology
9th International Tbilisi Summer School in Logic and Language
Tbilisi, Georgia
September 2013

Introduction
- Formal and natural languages
- How to specify a formal language?
  - Automata
  - Grammars
- Strong connections to:
  - Computability theory
  - Complexity theory
- Applications in computer science:
  - Verification
  - Compiler construction
  - Data formats

Outline
- Deterministic finite automata
- Nondeterministic finite automata
- Automata with ε-transitions
- The class of regular languages
- The pumping lemma for regular languages
- Context-free grammars and languages
- Right linear grammars
- Pushdown automata
- The pumping lemma for context-free languages
- Grammars in computer science
- Further topics

Finite Automata: A First Example
[State diagram: two states, "locked" and "unlocked", with transitions labelled "card" and "push".]

Finite Automata: A More Abstract Example
[State diagram: start state q0, accepting state q3]
  q0 --a--> q1
  q1 --a--> q1,  q1 --b--> q2
  q2 --b--> q2,  q2 --c--> q3
  q3 --c--> q3
Example inputs:
  abbbcc  accepted
  aab     rejected
  ac      rejected
The language accepted by this automaton is
L = {a^k b^n c^m | k, n, m ≥ 1}

The Error State
[State diagram: the automaton of the previous example, completed with an error state qe; start state q0, accepting state q3]
  q0 --a--> q1,  q0 --b,c--> qe
  q1 --a--> q1,  q1 --b--> q2,  q1 --c--> qe
  q2 --b--> q2,  q2 --c--> q3,  q2 --a--> qe
  q3 --c--> q3,  q3 --a,b--> qe
  qe --a,b,c--> qe

Deterministic Finite Automata: Definition
Definition
A deterministic finite automaton (DFA) is a tuple A = ⟨Q, Σ, δ, q0, F⟩ where:
1. Q is a finite set (the states).
2. Σ is a finite set (the input symbols).
3. δ : Q × Σ → Q (the transition function).
4. q0 ∈ Q (the starting state).
5. F ⊆ Q (the final states).

DFA: Example
[State diagram: the automaton with error state qe shown on the slide "The Error State"]
as tuple: BB1

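The tuple itself was given on the blackboard (BB1) and is not part of the slides. As a minimal sketch, the example automaton could be written down as a tuple ⟨Q, Σ, δ, q0, F⟩ in Python as follows; the variable names and the dictionary encoding of δ are assumptions made for illustration.

```python
# Components of the example DFA (names are illustrative, not from the slides).
Q = {"q0", "q1", "q2", "q3", "qe"}
SIGMA = {"a", "b", "c"}
Q0 = "q0"
F = {"q3"}

# Transition function delta : Q x Sigma -> Q, encoded as a dictionary.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "qe", ("q0", "c"): "qe",
    ("q1", "a"): "q1", ("q1", "b"): "q2", ("q1", "c"): "qe",
    ("q2", "a"): "qe", ("q2", "b"): "q2", ("q2", "c"): "q3",
    ("q3", "a"): "qe", ("q3", "b"): "qe", ("q3", "c"): "q3",
    ("qe", "a"): "qe", ("qe", "b"): "qe", ("qe", "c"): "qe",
}

DFA = (Q, SIGMA, DELTA, Q0, F)


def is_total(states, alphabet, delta):
    """Check that delta maps every (state, symbol) pair to a state."""
    return all(
        (q, x) in delta and delta[(q, x)] in states
        for q in states
        for x in alphabet
    )


print(is_total(Q, SIGMA, DELTA))
```

The error state qe is what makes δ a total function, as the definition of a DFA requires.
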
The Language of a DFA
Definition
Extend δ : Q × Σ → Q to δ̂ : Q × Σ* → Q as follows:
  δ̂(q, w) = q                if w = ε
  δ̂(q, w) = δ(δ̂(q, v), x)    if w = vx for v ∈ Σ*, x ∈ Σ
Example
δ̂(q0, abc) = q3, δ̂(q0, aba) = qe
Definition
Let A = ⟨Q, Σ, δ, q0, F⟩ be a DFA. The language accepted by A is
L(A) = {w ∈ Σ* | δ̂(q0, w) ∈ F}.

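The recursive definition of the extended transition function can be transcribed almost literally. The following Python sketch does this for the example automaton with error state qe; the names `DELTA`, `delta_hat` and `accepts` are assumptions of this sketch.

```python
STATES = ["q0", "q1", "q2", "q3", "qe"]
ALPHABET = "abc"
FINAL = {"q3"}

# Every transition not listed explicitly leads to the error state qe.
DELTA = {(q, x): "qe" for q in STATES for x in ALPHABET}
DELTA.update({
    ("q0", "a"): "q1",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "b"): "q2", ("q2", "c"): "q3",
    ("q3", "c"): "q3",
})


def delta_hat(q, w):
    """Extended transition function: the state reached from q after reading w."""
    if w == "":                      # case w = epsilon
        return q
    v, x = w[:-1], w[-1]             # case w = vx with x a single symbol
    return DELTA[(delta_hat(q, v), x)]


def accepts(w):
    """w is in L(A) iff delta_hat(q0, w) is a final state."""
    return delta_hat("q0", w) in FINAL


print(delta_hat("q0", "abc"), delta_hat("q0", "aba"))
```

For long inputs an iterative loop over the symbols of w would avoid deep recursion; the recursive form is kept here only because it mirrors the definition.
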
Designing a DFA
L = {w ∈ {a, b}* | w contains an even number of a's and an even number of b's}
BB2

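The automaton was developed on the blackboard (BB2). One possible design, sketched here under the assumption that a state records the parities of the numbers of a's and b's read so far, uses four states:

```python
from itertools import product

# A state is a pair (number of a's mod 2, number of b's mod 2).
STATES = set(product((0, 1), repeat=2))
START = (0, 0)
FINAL = {(0, 0)}


def delta(state, x):
    """Flip the parity that belongs to the symbol just read."""
    pa, pb = state
    if x == "a":
        return (1 - pa, pb)
    if x == "b":
        return (pa, 1 - pb)
    raise ValueError(f"symbol {x!r} is not in the alphabet {{a, b}}")


def accepts(w):
    state = START
    for x in w:
        state = delta(state, x)
    return state in FINAL


print(accepts("abab"), accepts("aab"))
```
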
Nondeterministic Finite Automata: A Motivating Example
Automaton for accepting L = {wab | w ∈ {a, b}*}?
[State diagram: start state q0, accepting state q2]
  q0 --a,b--> q0
  q0 --a--> q1
  q1 --b--> q2
Nondeterminism ⇒ consider all possible runs

Nondeterministic Finite Automata: Definition
Definition
A nondeterministic finite automaton (NFA) is a tuple A = ⟨Q, Σ, Δ, q0, F⟩ where:
1. Q is a finite set (the states).
2. Σ is a finite set (the input symbols).
3. Δ ⊆ Q × Σ × Q (the transition relation).
4. q0 ∈ Q (the starting state).
5. F ⊆ Q (the final states).

NFA: Example
[State diagram: start state q0, accepting state q2]
  q0 --a,b--> q0
  q0 --a--> q1
  q1 --b--> q2
as tuple: BB3

The Language of an NFA
Definition
Extend Δ ⊆ Q × Σ × Q to Δ̂ ⊆ Q × Σ* × Q as follows:
  (q, w, q) ∈ Δ̂     if w = ε
  (q, w, q″) ∈ Δ̂    if w = vx for v ∈ Σ*, x ∈ Σ, and (q, v, q′) ∈ Δ̂ and (q′, x, q″) ∈ Δ
Example
(q0, ab, q0), (q0, ab, q2) ∈ Δ̂, (q0, bb, q0) ∈ Δ̂
Definition
Let A = ⟨Q, Σ, Δ, q0, F⟩ be an NFA. The language accepted by A is
L(A) = {w ∈ Σ* | ∃q ∈ F s.t. (q0, w, q) ∈ Δ̂}.

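"Considering all possible runs" can be done by tracking the set of states reachable after each symbol. The sketch below does this for the example NFA for {wab | w ∈ {a, b}*}; the set-of-triples encoding of the transition relation and the function names are assumptions made for illustration.

```python
# Transition relation as a set of triples (state, symbol, state).
DELTA = {
    ("q0", "a", "q0"), ("q0", "b", "q0"),
    ("q0", "a", "q1"),
    ("q1", "b", "q2"),
}
FINAL = {"q2"}


def reachable(q, w):
    """All states r such that (q, w, r) is in the extended transition relation."""
    current = {q}
    for x in w:
        current = {r for (p, y, r) in DELTA if p in current and y == x}
    return current


def accepts(w):
    """w is accepted iff some final state is reachable from q0 on w."""
    return bool(reachable("q0", w) & FINAL)


print(reachable("q0", "ab"), accepts("bbab"))
```
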
Equivalence of DFA and NFA
Theorem
Let L ⊆ Σ*, then there is a DFA D with L(D) = L iff there is an NFA N with L(N) = L.
Proof (BB4).
1. Converting D to N: easy.
2. Converting N to D: subset construction.

The Subset Construction: Example
Automaton for accepting L = {wab | w ∈ {a, b}*}:
[State diagram: start state q0, accepting state q2]
  q0 --a,b--> q0
  q0 --a--> q1
  q1 --b--> q2
Conversion to DFA: BB5
In practice: only construct reachable states

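The conversion itself was done on the blackboard (BB5). A minimal sketch of the subset construction that builds only the reachable states, applied to the NFA above, could look as follows; the encoding and all names are assumptions of this sketch.

```python
from collections import deque

ALPHABET = "ab"
NFA_DELTA = {
    ("q0", "a", "q0"), ("q0", "b", "q0"),
    ("q0", "a", "q1"),
    ("q1", "b", "q2"),
}
NFA_START = "q0"
NFA_FINAL = {"q2"}


def subset_construction():
    """Build the reachable part of the subset-construction DFA.

    DFA states are frozensets of NFA states.
    """
    start = frozenset({NFA_START})
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for x in ALPHABET:
            target = frozenset(
                r for (p, y, r) in NFA_DELTA if p in subset and y == x
            )
            dfa_delta[(subset, x)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is final iff it contains a final NFA state.
    final = {subset for subset in seen if subset & NFA_FINAL}
    return seen, dfa_delta, start, final


states, dfa_delta, start, final = subset_construction()
for subset in sorted(states, key=sorted):
    print(sorted(subset), "final" if subset in final else "")
```

For this NFA only three of the eight possible subsets are reachable, which is why restricting the construction to reachable states pays off in practice.
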
The Subset Construction: Lower Bound
Theorem
There are languages L_n ⊆ Σ*, n ≥ 1, and NFAs N_n with n + 1 states and L(N_n) = L_n s.t. every DFA D_n with L(D_n) = L_n has at least 2^n states.
Proof.
Let Σ = {a, b} and for n ≥ 1 define
L_n = {wav | w, v ∈ Σ*, |v| = n − 1}
BB6

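The proof is on the blackboard (BB6). As an illustration only, the sketch below applies the subset construction to one natural choice of N_n and counts the reachable subsets. The numbering of the states as 0..n is an assumption of this sketch. The count shows how large the constructed DFA gets; it does not by itself prove that no smaller DFA exists.

```python
def count_subset_states(n):
    """Number of reachable states of the subset-construction DFA for N_n.

    Assumed encoding of N_n: states 0..n, state 0 loops on a and b,
    0 --a--> 1, i --a,b--> i+1 for 1 <= i < n, and n is the final state.
    """
    def step(subset, x):
        # State 0 is in every reachable subset because of its self-loops.
        target = {0}
        if x == "a":
            target.add(1)        # guess that this "a" is the n-th symbol from the end
        target.update(i + 1 for i in subset if 1 <= i < n)
        return frozenset(target)

    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        subset = todo.pop()
        for x in "ab":
            target = step(subset, x)
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return len(seen)


for n in range(1, 7):
    print(n, count_subset_states(n))
```
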
Epsilon-Transitions: A Motivating Example
Automaton for accepting decimal representations of integers:
Σ = {−, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
L = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}^+ ∪ {−w | w ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}^+}
[State diagram: start state q0, accepting state q2]
  q0 --(− or ε)--> q1
  q1 --(0-9)--> q2
  q2 --(0-9)--> q2

ε-NFA: Definition
Definition
A nondeterministic finite automaton with ε-transitions (ε-NFA) is a tuple A = ⟨Q, Σ, Δ, q0, F⟩ where:
1. Q is a finite set (the states).
2. Σ is a finite set (the input symbols).
3. Δ ⊆ Q × (Σ ∪ {ε}) × Q (the transition relation).
4. q0 ∈ Q (the starting state).
5. F ⊆ Q (the final states).

ε-NFA: Properties
Definition
Transition relation Δ̂: includes ε-transitions.
Definition
Let A = ⟨Q, Σ, Δ, q0, F⟩ be an ε-NFA. The language accepted by A is
L(A) = {w ∈ Σ* | ∃q ∈ F s.t. (q0, w, q) ∈ Δ̂}.
Theorem
Let L ⊆ Σ*, then there is a DFA D with L(D) = L iff there is an ε-NFA N with L(N) = L.
Proof.
Modified subset construction (ε-closed subsets).

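The ε-closed subsets used in the proof can be computed with a simple closure loop. The following sketch simulates the integer automaton from the motivating example. It assumes that ε is encoded as the empty string and that the minus sign is the ASCII character "-"; all names are illustrative.

```python
DIGITS = "0123456789"
EPS = ""  # encoding of an epsilon-transition (assumption of this sketch)

DELTA = {("q0", "-", "q1"), ("q0", EPS, "q1")}
DELTA |= {("q1", d, "q2") for d in DIGITS}
DELTA |= {("q2", d, "q2") for d in DIGITS}
FINAL = {"q2"}


def eps_closure(states):
    """Smallest superset of `states` that is closed under epsilon-transitions."""
    closure = set(states)
    todo = list(states)
    while todo:
        p = todo.pop()
        for (q, x, r) in DELTA:
            if q == p and x == EPS and r not in closure:
                closure.add(r)
                todo.append(r)
    return closure


def accepts(w):
    current = eps_closure({"q0"})
    for x in w:
        moved = {r for (q, y, r) in DELTA if q in current and y == x}
        current = eps_closure(moved)
    return bool(current & FINAL)


print(accepts("42"), accepts("-7"), accepts("-"))
```
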
The Class of Regular Languages
Corollary
Let L ⊆ Σ*. The following are equivalent:
- There is a DFA D with L(D) = L.
- There is an NFA N with L(N) = L.
- There is an ε-NFA N′ with L(N′) = L.
Definition
L ⊆ Σ* is called a regular language if there is a finite automaton A with L(A) = L.

Closure Properties of Regular Languages
Theorem
If L1, L2 ⊆ Σ* are regular, then L1 ∪ L2 is regular.
Proof. BB7
Theorem
If L ⊆ Σ* is regular, then L^c = Σ* \ L is regular.
Proof. BB8
Theorem
If L1, L2 ⊆ Σ* are regular, then L1 ∩ L2 is regular.
Proof. L1 ∩ L2 = (L1^c ∪ L2^c)^c.

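The proofs for union and complement are on the blackboard (BB7, BB8), so the constructions below are not necessarily the ones used in the lecture. The sketch uses two standard DFA constructions: the product automaton for union, and swapping final and non-final states for complement. Intersection is then obtained via De Morgan exactly as in the proof above. The dictionary encoding of a DFA and the two parity automata are assumptions chosen for illustration.

```python
from itertools import product

ALPHABET = "ab"


def parity_dfa(symbol):
    """Total DFA over {a, b} accepting words with an even number of `symbol`."""
    delta = {(p, x): (1 - p if x == symbol else p)
             for p in (0, 1) for x in ALPHABET}
    return {"states": {0, 1}, "delta": delta, "start": 0, "final": {0}}


def run(dfa, w):
    q = dfa["start"]
    for x in w:
        q = dfa["delta"][(q, x)]
    return q in dfa["final"]


def complement(dfa):
    """Swap final and non-final states (requires a total transition function)."""
    return {**dfa, "final": dfa["states"] - dfa["final"]}


def union(d1, d2):
    """Product automaton: run d1 and d2 in parallel, accept if either accepts."""
    states = set(product(d1["states"], d2["states"]))
    delta = {((p, q), x): (d1["delta"][(p, x)], d2["delta"][(q, x)])
             for (p, q) in states for x in ALPHABET}
    final = {(p, q) for (p, q) in states
             if p in d1["final"] or q in d2["final"]}
    return {"states": states, "delta": delta,
            "start": (d1["start"], d2["start"]), "final": final}


def intersection(d1, d2):
    """De Morgan: L1 ∩ L2 = (L1^c ∪ L2^c)^c."""
    return complement(union(complement(d1), complement(d2)))


EVEN_A, EVEN_B = parity_dfa("a"), parity_dfa("b")
BOTH_EVEN = intersection(EVEN_A, EVEN_B)
print(run(BOTH_EVEN, "abab"), run(BOTH_EVEN, "aab"))
```
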
The Pumping Lemma for Regular Languages
Lemma (Pumping Lemma)
Let L be a regular language. Then there is an n ∈ ℕ s.t. for every w ∈ L with |w| ≥ n we have w = v1 v2 v3 with
1. v2 ≠ ε,
2. |v1 v2| ≤ n, and
3. for all k ≥ 0 also v1 v2^k v3 ∈ L.
Proof. BB9

Non-Regular Languages
Example
L = {a^m b^m | m ≥ 1} is not regular, by the pumping lemma for regular languages (BB10).

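The argument is on the blackboard (BB10). As an illustration rather than a proof, the sketch below checks by brute force, for small n, that the word a^n b^n has no decomposition satisfying the three conditions of the pumping lemma. The function names and the bound on k are assumptions of this sketch.

```python
def in_L(w):
    """Membership in L = {a^m b^m | m >= 1}."""
    m = len(w) // 2
    return m >= 1 and w == "a" * m + "b" * m


def has_pumpable_split(w, n):
    """Search for w = v1 v2 v3 with v2 nonempty, |v1 v2| <= n and
    v1 v2^k v3 in L for k = 0..3.

    Only finitely many k are tested, so a result of True would be
    inconclusive, whereas False rules out every admissible split.
    """
    for i in range(0, n):                 # i = |v1|
        for j in range(i + 1, n + 1):     # j = |v1 v2|, so v2 is nonempty
            v1, v2, v3 = w[:i], w[i:j], w[j:]
            if all(in_L(v1 + v2 * k + v3) for k in range(4)):
                return True
    return False


for n in range(1, 9):
    w = "a" * n + "b" * n
    print(n, has_pumpable_split(w, n))
```

Every admissible v2 lies within the first n symbols and therefore consists of a's only, so pumping changes the number of a's but not the number of b's.
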
Context-Free Grammars: A First Example
How can we specify the set of all arithmetical expressions?
E.g. 12, 30 + 21 · 6, (123 + 7) · 15 + 88, ...
E → N | E + E | E · E | (E)
N → D | DN
D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Context-Free Grammars: Definition
Definition
A context-free grammar (CFG) is a tuple G = ⟨N, T, P, S⟩ where
1. N is a finite set of symbols (the nonterminals),
2. T is a finite set of symbols (the terminals),
3. P is a finite set of production rules of the form
   A → w where A ∈ N and w ∈ (N ∪ T)*,
4. S ∈ N (the start symbol).

Context-Free Grammars: Example
G = ⟨NT, T, P, S⟩
NT = {N, D, E}
T = {+, ·, (, ), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
P = {E → N, E → E + E, E → E · E, E → (E),
     N → D, N → DN,
     D → 0, D → 1, D → 2, D → 3, D → 4,
     D → 5, D → 6, D → 7, D → 8, D → 9}
S = E

The Language of a Grammar
Let G = ⟨N, T, P, S⟩ be a CFG.
Definition
For every A → w ∈ P and every uAv ∈ (N ∪ T)* define
uAv ⇒_G uwv.
The derivation relation ⇒*_G is the reflexive and transitive closure of ⇒_G.
Definition
The language of G is L(G) = {w ∈ T* | S ⇒*_G w}.
Definition
L ⊆ Σ* is called context-free if there is a context-free grammar G with L(G) = L.

Context-Free Grammars: Example Derivation
E → N | E + E | E · E | (E)
N → D | DN
D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Example derivation: BB11

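The derivation shown on the blackboard (BB11) is not part of the slides. The sketch below replays one possible leftmost derivation in this grammar, for the expression (1+2)·3, which is chosen here purely for illustration. The encoding of productions as strings and the function names are assumptions.

```python
# Productions of the expression grammar; every symbol is a single character.
PRODUCTIONS = {
    "E": ["N", "E+E", "E·E", "(E)"],
    "N": ["D", "DN"],
    "D": list("0123456789"),
}


def leftmost_step(form, rhs):
    """Replace the leftmost nonterminal A in `form` by `rhs`, provided that
    A -> rhs is a production of the grammar."""
    for i, symbol in enumerate(form):
        if symbol in PRODUCTIONS:
            if rhs not in PRODUCTIONS[symbol]:
                raise ValueError(f"{symbol} -> {rhs} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")


def derive(choices, start="E"):
    """Return the list of sentential forms obtained by applying `choices`."""
    forms = [start]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms


CHOICES = ["E·E", "(E)", "E+E", "N", "D", "1", "N", "D", "2", "N", "D", "3"]
print(" => ".join(derive(CHOICES)))
```

Each step corresponds to one application of the relation ⇒_G, so the printed chain witnesses E ⇒*_G (1+2)·3.
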
Formalisms
                         Automata      Grammars
Regular Languages        DFAs, NFAs    ?
Context-Free Languages   ?             CFG

Right Linear Grammars
Definition
A grammar G = ⟨N, T, P, S⟩ is called right linear if all productions are of one of the following forms:
  A → xB  where x ∈ T, B ∈ N
  A → x   where x ∈ T
  A → ε
Theorem
Let L ⊆ Σ*. Then L is regular iff L has a right linear grammar.
Proof (BB12).
1. From right linear grammar to ε-NFA.
2. From NFA to right linear grammar.
Remark: there is an analogous notion of left linear grammars, with an analogous result.

Right Linear Grammars: Example
Let L = {a^m b^n | m, n ≥ 0} (BB13)
Right linear grammar
Automaton

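Grammar and automaton were given on the blackboard (BB13). One possible right linear grammar for L is S → aS | bB | ε, B → bB | ε. The sketch below turns such a grammar into an NFA and runs it. The translation (nonterminals become states, A → ε makes A final, A → x leads to an extra final state) is a common variant and may differ in detail from the ε-NFA construction of the lecture; all names are assumptions.

```python
# Right linear grammar for {a^m b^n | m, n >= 0}; "" encodes epsilon.
GRAMMAR = {
    "S": ["aS", "bB", ""],
    "B": ["bB", ""],
}
EXTRA_FINAL = "qf"  # extra final state for productions of the form A -> x


def grammar_to_nfa(grammar):
    """Translate a right linear grammar into (transition relation, final states)."""
    delta, final = set(), {EXTRA_FINAL}
    for nonterminal, bodies in grammar.items():
        for body in bodies:
            if body == "":                       # A -> epsilon
                final.add(nonterminal)
            elif len(body) == 1:                 # A -> x
                delta.add((nonterminal, body, EXTRA_FINAL))
            else:                                # A -> xB
                delta.add((nonterminal, body[0], body[1]))
    return delta, final


def accepts(w, start="S"):
    delta, final = grammar_to_nfa(GRAMMAR)
    current = {start}
    for x in w:
        current = {r for (p, y, r) in delta if p in current and y == x}
    return bool(current & final)


print(accepts("aabbb"), accepts("ba"))
```
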
Formalisms
                         Automata    Grammars
Regular Languages        DFA, NFA    LLG, RLG
Context-Free Languages   ?           CFG

Regular vs. Context-Free Languages
Theorem
Every regular language is context-free.
Proof.
Every regular language has a right linear grammar. Every right linear grammar is context-free.
Theorem
There is a context-free language which is not regular.
Proof.
L = {a^n b^n | n ≥ 1} is not regular (pumping lemma), but S → ab | aSb is a context-free grammar for L.

Automata for Context-Free Languages?
- Well-balanced strings of parentheses, e.g. (), (()()), ((()())()) ∈ W, but (() and )(() ∉ W
- Context-free grammar for W: E → EE | (E) | ε
- W is not regular (by pumping lemma, BB14)
- Generating a language vs. accepting a language
- How to accept a well-balanced string? (BB15)
  ⇒ Need more than constant memory
  ⇒ For context-free languages: automaton with a stack

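The blackboard discussion (BB15) is not in the slides. One simple way to accept W, sketched here as an assumption about what such a procedure could look like, keeps a counter of currently open parentheses. The counter is unbounded, which is exactly the non-constant memory a finite automaton lacks; a stack holding one symbol per open parenthesis plays the same role.

```python
def well_balanced(w):
    """Accept w iff it is a well-balanced string of parentheses."""
    depth = 0                       # number of currently open parentheses
    for x in w:
        if x == "(":
            depth += 1
        elif x == ")":
            depth -= 1
            if depth < 0:           # a closing parenthesis without a partner
                return False
        else:
            raise ValueError(f"unexpected symbol {x!r}")
    return depth == 0               # every opened parenthesis was closed


for w in ["()", "(()())", "((()())())", "(()", ")(()"]:
    print(w, well_balanced(w))
```
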
Pushdown Automata: Definition
Definition
A pushdown automaton (PDA) is a tuple A = ⟨Q, Σ, Γ, Δ, q0, Z0, F⟩ where:
1. Q is a finite set (the states).
2. Σ is a finite set (the input symbols).
3. Γ is a finite set (the stack symbols).
4. Δ ⊆ Q × (Σ ∪ {ε}) × Γ × Q × Γ* (the transition relation), where for each (q, x, z) ∈ Q × (Σ ∪ {ε}) × Γ there are only finitely many (q′, w) ∈ Q × Γ* s.t. (q, x, z, q′, w) ∈ Δ.
5. q0 ∈ Q (the starting state).
6. Z0 ∈ Γ (the starting stack symbol).
7. F ⊆ Q (the final states).

Computation of a PDA
Let A = ⟨Q, Σ, Γ, Δ, q0, Z0, F⟩ be a PDA.
Definition
A configuration of A is a triple (q, w, v) ∈ Q × Σ* × Γ* where
- q is the current state,
- w is the remaining input, and
- v is the current stack contents.
Convention: top of the stack is on the left
Definition
Define the binary step relation ⊢_A on configurations of A as:
(q, xw, yv) ⊢_A (p, w, uv) if (q, x, y, p, u) ∈ Δ
Define ⊢*_A as the reflexive and transitive closure of ⊢_A.

The Language of a PDA
Definition
Let A = ⟨Q, Σ, Γ, Δ, q0, Z0, F⟩ be a PDA, then the language accepted by A by final state is defined as:
L(A) = {w ∈ Σ* | (q0, w, Z0) ⊢*_A (q, ε, v) for some q ∈ F and some v ∈ Γ*}
Definition
Let A = ⟨Q, Σ, Γ, Δ, q0, Z0, F⟩ be a PDA, then the language accepted by A by empty stack is defined as:
N(A) = {w ∈ Σ* | (q0, w, Z0) ⊢*_A (q, ε, ε) for some q ∈ Q}

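Example
Σ = {(, )}, Γ = {Z0, 1}
[State diagram: start state q0, accepting state q0; labels have the form "input, top of stack / replacement"]
  from q0 to q1:  (, Z0 / 1 Z0
  loop on q1:     (, 1 / 11
  loop on q1:     ), 1 / ε
  from q1 to q0:  ε, Z0 / Z0
Derivation of (()()): BB16
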
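The computation for (()()) was carried out on the blackboard (BB16). The sketch below simulates the example PDA by a breadth-first search over configurations (state, remaining input, stack), with the top of the stack on the left and acceptance by final state. Writing Z0 as the single character "Z" and ε as the empty string are assumptions of this encoding, as are all names.

```python
from collections import deque

EPS = ""
# (state, input symbol or EPS, top of stack) -> set of (new state, pushed word)
DELTA = {
    ("q0", "(", "Z"): {("q1", "1Z")},
    ("q1", "(", "1"): {("q1", "11")},
    ("q1", ")", "1"): {("q1", "")},
    ("q1", EPS, "Z"): {("q0", "Z")},
}
START, START_STACK, FINAL = "q0", "Z", {"q0"}


def successors(config):
    """All configurations reachable from `config` in one step."""
    q, w, stack = config
    if not stack:
        return
    top, rest = stack[0], stack[1:]
    for (p, u) in DELTA.get((q, EPS, top), ()):      # epsilon-moves
        yield (p, w, u + rest)
    if w:
        for (p, u) in DELTA.get((q, w[0], top), ()):  # moves reading one symbol
            yield (p, w[1:], u + rest)


def accepts(w):
    """Acceptance by final state: input consumed and current state in FINAL."""
    start = (START, w, START_STACK)
    seen, queue = {start}, deque([start])
    while queue:
        config = queue.popleft()
        q, remaining, _ = config
        if remaining == "" and q in FINAL:
            return True
        for nxt in successors(config):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(accepts("(()())"), accepts("(()"))
```

The search terminates for this automaton because its only ε-move leads from q1 to q0, where no further ε-move is possible; a PDA with ε-cycles that grow the stack would need an additional bound.
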
Final-State Acceptance vs. Null-Stack Acceptance
Theorem
Let L ⊆ Σ*, then the following are equivalent:
1. There is a PDA A_N with N(A_N) = L.
2. There is a PDA A_F with L(A_F) = L.
Proof (BB17).
1 ⇒ 2
2 ⇒ 1

Pushdown Automata and Context-Free Grammars
Theorem
A language L has a context-free grammar iff it has a PDA.
Proof (BB18).
1. Given grammar G, construct PDA A.
2. Given PDA A, construct grammar G.

Formalisms
                         Automata    Grammars
Regular Languages        DFA, NFA    LLG, RLG
Context-Free Languages   PDA         CFG

The Pumping Lemma for Context-Free Languages
Lemma (Pumping Lemma)
Let L be a context-free language. Then there is an n ∈ ℕ s.t. for every w ∈ L with |w| ≥ n we have w = v1 v2 v3 v4 v5 with
1. |v2 v3 v4| ≤ n,
2. v2 v4 ≠ ε, and
3. for all k ≥ 0 also v1 v2^k v3 v4^k v5 ∈ L.
Proof Sketch (BB19).

Non-Context-Free Languages
Example
L = {a^m b^m c^m | m ≥ 1} is not context-free, by the pumping lemma for context-free languages (BB20).

HTML is a Context-free Language
Source of website
Char → a | A | b | B | ...
String → ε | Char String
Element → Heading | Paragraph | Link | String | <br> | ...
Elements → ε | Element Elements
Heading → <h3> String </h3> | ...
Paragraph → <p> Elements </p>
Link → <a href = "String"> String </a>
...

XML
Generalization: XML (Extensible Markup Language)
- DTD (Document Type Definition) is a grammar
- There are DTDs for: HTML, office formats, mathematical formulas, address data, vector graphics, cooking recipes, formal proofs, ...
- Very rich infrastructure available

Further Topics
- Context-sensitive languages: uAv → uwv
- Regular expressions
- Decidability/complexity of, e.g., membership, emptiness, ...
- Parser generators
- ...
Introductory Textbook:
J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata Theory, Languages, and Computation, 2nd edition, 2001, Addison-Wesley.