C SC 473 Automata, Grammars & Languages Automata, Grammars and Languages Discourse 04 Context-Free Grammars and Pushdown Automata

Automata, Grammars and Languages

Jan 19, 2016


Dwayne

Page 1: Automata, Grammars and Languages

C SC 473 Automata, Grammars & Languages

Automata, Grammars and Languages

Discourse 04

Context-Free Grammars and Pushdown Automata

Page 2: Automata, Grammars and Languages

Backus-Naur Form Grammars (CFGs)

• Algol 60, Algol 68: first "block-structured" languages

• Ex: BNF grammar

<program> ::= <block>
<statement> ::= s | <block>
<block> ::= begin <list> end
<list> ::= <statement> ; <list> | <statement>

Sample program: begin s ; begin s;s;s end ; s end

• The same rules written as a CF grammar G over Σ = {s, b, e, ;}, with b for begin and e for end:

P → B
S → s | B
B → b L e
L → S ; L | S

• Nonterminals = variables (the set V); rules = productions (the set R); s, b, e, ; are the terminals. The start variable (generically called "S") is P in this grammar.

Page 3: Automata, Grammars and Languages

Grammars are "Generators"

• ⇒ means "yields" or "derives in one step": apply one production to one variable in the string

• The process is nondeterministic

[Figure: branching diagram of derivations from P. All branches begin P ⇒ B ⇒ bLe, then diverge, for example:
  bLe ⇒ bSe ⇒ bse
  bLe ⇒ bSe ⇒ bBe ⇒ bbLee ⇒ bbSee ⇒ bbsee
  bLe ⇒ bS;Le ⇒ bs;Le ⇒ ...
  bLe ⇒ bS;Le ⇒ bS;Se ⇒ ...
  bLe ⇒ bS;Le ⇒ bS;S;Le ⇒ ...]

Page 4: Automata, Grammars and Languages

A Particular Derivation

• One possible derivation (on the slide, the variable being rewritten at each stage is underscored):

P ⇒ B ⇒ bLe ⇒ bS;Le ⇒ bS;S;Le ⇒ bS;S;Se
  ⇒ bS;B;Se ⇒ bS;bLe;Se ⇒ bs;bLe;Se
  ⇒ bs;bLe;se ⇒ bs;bS;Le;se ⇒ bs;bs;Le;se
  ⇒ bs;bs;S;Le;se ⇒ bs;bs;s;Le;se ⇒ bs;bs;s;Se;se
  ⇒ bs;bs;s;se;se
∴ bs;bs;s;se;se ∈ L(G)

• Two choices at each derivation step: Which variable (nonterminal) is to be rewritten? Which rule with that variable as LHS is to be applied?

• All possible terminal strings obtainable in this way make up L(G)
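The idea that L(G) is "all terminal strings obtainable by derivations" can be sketched as a small search. The following is only an illustrative sketch, not part of the course material: the names `RULES` and `language_up_to` are made up here, symbols are assumed to be single characters, and the length-based pruning is sound only because no rule of this particular grammar shrinks a sentential form.

```python
from collections import deque

# The block grammar from the slides: P -> B, S -> s | B, B -> bLe, L -> S;L | S
RULES = {"P": ["B"], "S": ["s", "B"], "B": ["bLe"], "L": ["S;L", "S"]}

def language_up_to(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost variable, if any
        i = next((k for k, c in enumerate(form) if c in rules), None)
        if i is None:                      # no variables left: a terminal string
            found.add(form)
            continue
        for rhs in rules[form[i]]:         # try every rule for that variable
            new = form[:i] + rhs + form[i + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

print(sorted(language_up_to(RULES, "P", 5)))
```

Rewriting only the leftmost variable loses nothing, since every terminal string that has a derivation also has a leftmost one.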

Page 5: Automata, Grammars and Languages

Why CFGs?

• Most natural or artificial (e.g. programming) languages are not regular

• For the block grammar G: $L(G) \cap b^* s e^* = \{ b^n s e^n \mid n \ge 1 \}$

• We know that the latter language is not regular, so L(G) is not regular either (the regular languages are closed under intersection)

• Ex: C programs: main(){}, main(){{}}, main(){{{}}}, ...

$C \cap \texttt{main()}\{^*\}^* = \{ \texttt{main()}\{^n\}^n \mid n \ge 1 \}$

Page 6: Automata, Grammars and Languages

Derivation (Parse) Tree

[Figure: parse tree for the derivation above, in bracket notation:
P[ B[ b L[ S[s] ; L[ S[ B[ b L[ S[s] ; L[ S[s] ; L[ S[s] ] ] ] e ] ] ; L[ S[s] ] ] ] e ] ]]

yield/frontier/terminal string = bs;bs;s;se;se

Page 7: Automata, Grammars and Languages

Derivation (Parse) Tree

[Figure: the same parse tree, shown partially built]

Page 8: Automata, Grammars and Languages

Derivation (Parse) Tree (cont'd)

[Figure: the same parse tree with its 15 variable nodes numbered 1 to 15 in the order in which the derivation expands them]

Page 9: Automata, Grammars and Languages

Context-Free Grammar

• Defn 2.2: A context-free grammar G is a 4-tuple $G = (V, \Sigma, R, S)$, where
  • $V$ is a finite set, the variables (nonterminals)
  • $\Sigma$ is a finite set disjoint from V, the terminals
  • $R$ is a finite set of rules, of the form $A \to w$ with $A \in V$, $w \in (V \cup \Sigma)^*$ (technically, an ordered pair (A, w))
  • $S \in V$ is the start variable

• Ex: strings with balanced parentheses. Formally: $G_0 = (V, \Sigma, R, S)$ with $V = \{B\}$, $\Sigma = \{(, )\}$, $R = \{B \to \varepsilon,\ B \to BB,\ B \to (B)\}$, $S = B$

• Ex: informally, $G_0: B \to \varepsilon \mid BB \mid (B)$
  • Variables = upper case
  • Terminals = lower case

Page 10: Automata, Grammars and Languages

Yields & Derives Relations

• Defn. The relation yields (derives in 1 step), $\Rightarrow\ \subseteq (V \cup \Sigma)^* \times (V \cup \Sigma)^*$, is defined as follows: if $A \to w$ is a rule in R, then $(\forall u, v \in (V \cup \Sigma)^*)\ uAv \Rightarrow uwv$

• Defn: derives in k steps: $\Rightarrow^k$

• Defn: derives: $\Rightarrow^*$

• In other words: $u \Rightarrow^* v$ iff $u = v$ or $(\exists k \ge 1)(\exists u_1, \dots, u_{k-1})\ u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_{k-1} \Rightarrow v$

• Defn: A derivation (of k steps) from $u_0$ is any sequence of strings satisfying $u_0 \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_k$

Page 11: Automata, Grammars and Languages

Language Generated

• Defn. The language generated by G is the set of all terminal strings derived from S: $L(G) = \{ w \mid w \in \Sigma^* \wedge S \Rightarrow^* w \}$

• A partial derivation is one that starts with S and ends in a string that still contains variables from V

• Ex: $S \to aAS \mid a$, $A \to SbA \mid SS \mid ba$

Partial: $S \Rightarrow aAS \Rightarrow aSbAS$

Terminal (terminated): $S \Rightarrow aAS \Rightarrow aSbAS \Rightarrow aabAS \Rightarrow aabbaS \Rightarrow aabbaa$

Page 12: Automata, Grammars and Languages

Derivations and Parse Trees

• Ex: $G_0: B \to \varepsilon \mid BB \mid (B)$

leftmost: B ⇒ BB ⇒ (B)B ⇒ ((B))B ⇒ (())B ⇒ (())(B) ⇒ (())()

rightmost: B ⇒ BB ⇒ B(B) ⇒ B() ⇒ (B)() ⇒ ((B))() ⇒ (())()

[Figure: the parse tree for (())() drawn as it grows step by step under each derivation]

Notice: the completed (terminated) parse tree is the same for both derivations, though the sequence "grows" differently

Page 13: Automata, Grammars and Languages

Derivation ⇔ Parse Tree

• Proposition 1: For every (terminated or partial) derivation $D: S \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_n$ there is a unique parse tree T with frontier $u_n$, constructible from D.

• Proposition 2: For every parse tree T in G and any traversal order that is top-down (visits parents before children), there is a unique derivation of the frontier of T from S, and it is constructible from T.

• Corollary 3: For every parse tree T in G there is a unique leftmost derivation constructible from T.

Pf: Pre-order traverse T, expanding variables as their nodes are visited.
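The pre-order argument behind Corollary 3 can be sketched in a few lines. This is an illustrative sketch only; the tuple encoding of trees and the helper names `T`, `t`, `render`, and `leftmost_derivation` are assumptions made for this example.

```python
def T(sym, *kids):
    """Variable node; a node with no children was expanded by an ε-rule."""
    return (sym, list(kids))

def t(sym):
    """Terminal leaf (children marked as None)."""
    return (sym, None)

def render(form):
    return "".join(sym for sym, _ in form) or "ε"

def leftmost_derivation(tree):
    """Sentential forms of the leftmost derivation read off a parse tree."""
    form = [tree]                          # current sentential form, as subtrees
    steps = [render(form)]
    while True:
        # the leftmost variable not yet expanded (pre-order visits it next)
        idx = next((i for i, (_, kids) in enumerate(form) if kids is not None), None)
        if idx is None:
            return steps
        form[idx:idx + 1] = form[idx][1]   # apply the rule recorded at that node
        steps.append(render(form))

# Parse tree of (())() in G0: B -> ε | BB | (B)
def paren(inner):
    return T("B", t("("), inner, t(")"))

tree = T("B", paren(paren(T("B"))), paren(T("B")))
print(" ⇒ ".join(leftmost_derivation(tree)))
```

Picking the rightmost unexpanded variable instead would read off the rightmost derivation from the same tree.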

Page 14: Automata, Grammars and Languages

Ex: Leftmost Derivation

$E \to E + T \mid T$
$T \to T * F \mid F$
$F \to (E) \mid x$

[Figure: parse tree of an expression in this grammar]

Page 15: Automata, Grammars and Languages

Ex: Leftmost Derivation

[Figure: the same parse tree with its nodes numbered in preorder traversal order]

Page 16: Automata, Grammars and Languages

Syntactic Ambiguity

• Grammar: $E \to E + E \mid E * E \mid a$; terminal string = a + a * a

• 2 distinct parse trees for the same terminal string

[Figure: one tree with + at the root and a * a as its right subtree; one tree with * at the root and a + a as its left subtree]

• 2 distinct leftmost derivations for the same terminal string:

E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a

E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a

• Leftmost derivation ⇔ parse tree: 1-to-1

• A CFG is unambiguous iff every w ∈ L(G) has a unique parse tree (unique leftmost derivation)
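To make the ambiguity concrete, the number of parse trees of a string under $E \to E + E \mid E * E \mid a$ can be counted by splitting at every possible top-level operator. A minimal sketch; the function name `count_parse_trees` is invented for this example.

```python
from functools import lru_cache

def count_parse_trees(w: str) -> int:
    """Number of parse trees of w under E -> E+E | E*E | a."""
    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:        # trees deriving w[i:j], with j > i
        if j - i == 1:
            return 1 if w[i] == "a" else 0   # rule E -> a
        total = 0
        for k in range(i + 1, j - 1):        # w[k] as the operator at the root
            if w[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(w)) if w else 0

print(count_parse_trees("a+a*a"))   # 2: the grammar is ambiguous
```

A count above 1 for any string shows the grammar is ambiguous; a count of 0 means the string is not in L(G).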

Page 17: Automata, Grammars and Languages

Ex: Ambiguous Grammar: English

<Sent> → <NP><VP>
<NP> → <N> | <Adj><N>
<VP> → <V><Obj> | <V><AdvP>
<AdvP> → <Adv> | <AdvP>
<AdvP> → <Prep><Obj>
<Obj> → <Adj><N>
<N> → fruit | flies | …
…

Page 18: Automata, Grammars and Languages

"Fruit flies like a banana"

Parse tree 1: [<Sent> [<NP> [<Adj> fruit] [<N> flies]] [<VP> [<V> like] [<Obj> [<Adj> a] [<N> banana]]]]

Parse tree 2: [<Sent> [<NP> [<N> fruit]] [<VP> [<V> flies] [<AdvP> [<Prep> like] [<Obj> [<Adj> a] [<N> banana]]]]]

Page 19: Automata, Grammars and Languages

Right Linear Grammars & Regular Languages

• Defn: A CFG is right-linear iff each rule is of one of the forms A → wB or A → w, where A, B are variables and w ∈ Σ*

Chomsky (1958) called these "Type 3"

• Thm: L is a regular language iff L = L(G) for some right-linear grammar G. There are algorithms for converting from finite automata to right-linear grammars, and conversely.

[Figure: DFA M, NFA N, Reg. Expr E, and Right-linear Grammar G, connected by arrows; each arrow = a conversion algorithm]

Page 20: Automata, Grammars and Languages

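Right-Linear & Regular (cont'd)

• Pf: (⇒) Assume L = L(M) where $M = (Q, \Sigma, \delta, s, F)$ is a DFA. Construct $G = (Q, \Sigma, R, s)$ with R having the rule $p \to aq$ if $\delta(p, a) = q$ in M, and the rule $p \to \varepsilon$ if p is a final state.

Claim: $(p, a_1 \cdots a_n) \vdash_M^n (q, \varepsilon)$ iff $p \Rightarrow_G^n a_1 \cdots a_n q$. Pf: easy induction on n.

This direction of the proof follows, since $w \in L(M)$ iff $(s, w) \vdash_M^* (f, \varepsilon)$ for some $f \in F$, iff $s \Rightarrow_G^* wf \Rightarrow_G w$.

• Pf: (⇐) Assume L = L(G) where $G = (V, \Sigma, R, S)$ is right-linear. Construct the NFA $N = (V \cup \{f\}, \Sigma, \Delta, S, \{f\})$ where $f \notin V$ is a new symbol. N has the transition $A \xrightarrow{w} B$ if $A \to wB$ is in R, and the transition $A \xrightarrow{w} f$ if $A \to w$ is in R.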

Page 21: Automata, Grammars and Languages

Right-Linear & Regular (cont'd)

• Claim: $A_0 \Rightarrow_G w_1 A_1 \Rightarrow_G w_1 w_2 A_2 \Rightarrow_G \cdots \Rightarrow_G w_1 \cdots w_n A_n$ iff $(A_0, w_1 \cdots w_n) \vdash_N (A_1, w_2 \cdots w_n) \vdash_N \cdots \vdash_N (A_n, \varepsilon)$. Pf: easy induction on n.

This direction of the proof follows, since $S \Rightarrow_G^* wA \Rightarrow_G wx$ iff $(S, wx) \vdash_N^* (A, x) \vdash_N (f, \varepsilon)$. ∎

Page 22: Automata, Grammars and Languages

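Ex: Right-Linear ⇔ FA

• Ex: right-linear grammar G and its NFA N(G)

$G: S \to aA \mid b$
$\quad A \to aB \mid dS$
$\quad B \to dA$

N(G): states S, A, B, f (f accepting); transitions $S \xrightarrow{a} A$, $S \xrightarrow{b} f$, $A \xrightarrow{a} B$, $A \xrightarrow{d} S$, $B \xrightarrow{d} A$

• Ex: DFA M and its grammar G(M)

M: states $q_0, q_1, q_2$; transitions $q_0 \xrightarrow{a} q_0$, $q_0 \xrightarrow{b} q_1$, $q_1 \xrightarrow{a} q_0$, $q_1 \xrightarrow{b} q_2$, $q_2 \xrightarrow{a,b} q_2$

$G(M): q_0 \to a q_0 \mid b q_1 \mid \varepsilon$
$\quad q_1 \to a q_0 \mid b q_2 \mid \varepsilon$
$\quad q_2 \to a q_2 \mid b q_2$ ("useless" rules, can be eliminated)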
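The DFA-to-grammar direction is mechanical enough to sketch directly. The code below is a hedged illustration, not the course's own implementation: the dict encoding of δ, the function names `dfa_to_right_linear` and `generates`, and the choice of $q_0, q_1$ as accepting states (read off the ε-rules of the example) are assumptions.

```python
def dfa_to_right_linear(delta, finals):
    """delta: dict (state, symbol) -> state. Returns dict state -> list of right-hand sides.

    A right-hand side is a pair (a, q) for the rule p -> a q, or () for p -> ε.
    """
    rules = {}
    for (p, a), q in sorted(delta.items()):
        rules.setdefault(p, []).append((a, q))   # p -> a q  whenever δ(p, a) = q
    for p in sorted(finals):
        rules.setdefault(p, []).append(())       # p -> ε    whenever p is final
    return rules

def generates(rules, start, w):
    """True iff start =>* w using only rules of the two shapes above."""
    current = {start}                            # variables that can end the sentential form
    for ch in w:
        current = {rhs[1] for p in current for rhs in rules.get(p, [])
                   if rhs and rhs[0] == ch}
    return any(() in rules.get(p, []) for p in current)

# The example DFA M (assumed accepting states: q0 and q1)
delta = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q2",
         ("q2", "a"): "q2", ("q2", "b"): "q2"}
rules = dfa_to_right_linear(delta, {"q0", "q1"})
print(rules["q0"])
```

Because every sentential form of such a grammar is a terminal prefix followed by at most one variable, `generates` only needs to track that trailing variable.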

Page 23: Automata, Grammars and Languages

Pushdown Automaton

• Defn 2.12: A pushdown automaton M is a 6-tuple $M = (Q, \Sigma, \Gamma, \delta, s, F)$, where
  • $Q$ is a finite set, the states
  • $\Sigma$ is a finite set, the input alphabet
  • $\Gamma$ is a finite set, the stack alphabet
  • $\delta: Q \times \Sigma_\varepsilon \times \Gamma_\varepsilon \to \mathcal{P}(Q \times \Gamma_\varepsilon)$ is the transition function
  • $s \in Q$ is the start state
  • $F \subseteq Q$ is the set of accept (final) states

Here $\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}$ and $\Gamma_\varepsilon = \Gamma \cup \{\varepsilon\}$.

Page 24: Automata, Grammars and Languages

PushDown Automaton

[Figure: finite control in state p; an input tape holding a string in Σ*, with $a_1 a_2 \cdots a_{i-1}$ already seen, $a_i$ the current input symbol, and $a_{i+1} \cdots a_n$ still to come; a stack holding a string in Γ*, $A_1 A_2 A_3 \cdots$ from top to bottom (no end-marker supplied)]

configuration: (state, rest of input, stack), e.g. $(p,\ a_1 a_2 a_3 \cdots,\ A_1 A_2 A_3 \cdots)$ with $a_i \in \Sigma$, $A_i \in \Gamma$

Page 25: Automata, Grammars and Languages

PDA (cont'd)

Initially: the finite control is in the start state s, the input is w, and the stack is empty (ε)

configuration: $(s, w, \varepsilon)$

Page 26: Automata, Grammars and Languages

PDA (cont'd)

Transition: $(q, Y) \in \delta(p, a, X)$

configurations: $(p, ax, X\alpha) \vdash (q, x, Y\alpha)$

[Figure: finite control in state p with input ax, then in state q with input x]

Page 27: Automata, Grammars and Languages

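PDA (cont'd)

• Can have an ε-move (consume no input), $a = \varepsilon$: $(q, Y) \in \delta(p, \varepsilon, X)$ gives $(p, ax, X\alpha) \vdash (q, ax, Y\alpha)$

• Pop-move (erase top stack symbol), $Y = \varepsilon$: $(q, \varepsilon) \in \delta(p, a, X)$ gives $(p, ax, X\alpha) \vdash (q, x, \alpha)$

• Push-only move (ignore stack), $X = \varepsilon$: $(q, Y) \in \delta(p, a, \varepsilon)$ gives $(p, ax, X\alpha) \vdash (q, x, YX\alpha)$

• Any combination is possible, e.g. $a = X = Y = \varepsilon$: $(q, \varepsilon) \in \delta(p, \varepsilon, \varepsilon)$ gives $(p, ax, X\alpha) \vdash (q, ax, X\alpha)$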

Page 28: Automata, Grammars and Languages

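PDA (cont'd)

Finally: the finite control is in a state $f \in F$, the input is exhausted, and the stack holds some α

configuration: $(f, \varepsilon, \alpha)$

• Defn: M recognizes (accepts) w iff $(s, w, \varepsilon) \vdash_M^* (f, \varepsilon, \alpha)$ for some $f \in F$ and some $\alpha \in \Gamma^*$

• Defn: $L(M) = \{ w : M \text{ accepts } w \}$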

Page 29: Automata, Grammars and Languages

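Example: PDA

• Recognizer for $L = \{ w c w^R \mid w \in \{a, b\}^* \}$

$M = (Q, \Sigma, \Gamma, \delta, s, F)$ with $Q = \{s, p, q, f\}$, $\Sigma = \{a, b, c\}$, $\Gamma = \{A, B, \$\}$, $F = \{f\}$

Transitions (read "input, pop → push"):
  s → p: ε, ε → $
  p → p: a, ε → A and b, ε → B
  p → q: c, ε → ε
  q → q: a, A → ε and b, B → ε
  q → f: ε, $ → ε

$(s, abbcbba, \varepsilon) \vdash (p, abbcbba, \$) \vdash (p, bbcbba, A\$) \vdash \cdots \vdash (p, cbba, BBA\$) \vdash (q, bba, BBA\$) \vdash (q, ba, BA\$) \vdash \cdots \vdash (q, a, A\$) \vdash (q, \varepsilon, \$) \vdash (f, \varepsilon, \varepsilon)$: accepts

$(s, acb, \varepsilon) \vdash (p, acb, \$) \vdash (p, cb, A\$) \vdash (q, b, A\$)$: does not accept (blocked)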
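The two computations above can be replayed with a small nondeterministic simulator that searches configurations (state, rest of input, stack) breadth-first. This is a sketch under assumptions: the function name `pda_accepts`, the dict encoding of δ with `""` standing for ε, single-character symbols, and the `max_steps` safety bound (a guard against PDAs with unbounded ε-pushing) are all choices made for this example.

```python
from collections import deque

def pda_accepts(delta, start, finals, w, max_steps=10_000):
    """Breadth-first search over configurations; the stack top is stack[0]."""
    seen, queue = set(), deque([(start, w, "")])
    while queue and max_steps > 0:
        max_steps -= 1
        state, rest, stack = queue.popleft()
        if state in finals and rest == "":
            return True                              # reached (f, ε, α) with f in F
        for a in {rest[:1], ""}:                     # read one symbol, or ε-move
            for x in {stack[:1], ""}:                # pop the top symbol, or ignore stack
                for nxt, y in delta.get((state, a, x), ()):
                    conf = (nxt, rest[len(a):], y + stack[len(x):])
                    if conf not in seen:
                        seen.add(conf)
                        queue.append(conf)
    return False

# δ for the w c w^R recognizer: (state, input, pop) -> {(next state, push)}
DELTA = {
    ("s", "", ""): {("p", "$")},
    ("p", "a", ""): {("p", "A")},
    ("p", "b", ""): {("p", "B")},
    ("p", "c", ""): {("q", "")},
    ("q", "a", "A"): {("q", "")},
    ("q", "b", "B"): {("q", "")},
    ("q", "", "$"): {("f", "")},
}
print(pda_accepts(DELTA, "s", {"f"}, "abbcbba"), pda_accepts(DELTA, "s", {"f"}, "acb"))
```

Breadth-first search explores all nondeterministic choices together, so the same function also works for PDAs that have to "guess".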

Page 30: Automata, Grammars and Languages

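Example: PDA w/ nondeterminism

• Last example (palindromes with center-mark) was a deterministic PDA (DPDA)

• NPDA for $L = \{ w w^R \mid w \in \{a, b\}^* \} = \{ x \mid x \in \{a, b\}^* \text{ is an even-length palindrome} \}$

Transitions (read "input, pop → push"):
  s → p: ε, ε → $
  p → p: a, ε → A and b, ε → B
  p → q: ε, ε → ε (nondeterministic "guess" of the midpoint)
  q → q: a, A → ε and b, B → ε
  q → f: ε, $ → ε

$(s, aa, \varepsilon) \vdash (p, aa, \$) \vdash (p, a, A\$) \vdash (p, \varepsilon, AA\$) \vdash (q, \varepsilon, AA\$)$: does not accept (blocked)

$(s, aa, \varepsilon) \vdash (p, aa, \$) \vdash (p, a, A\$) \vdash (q, a, A\$) \vdash (q, \varepsilon, \$) \vdash (f, \varepsilon, \varepsilon)$: accepts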

Page 31: Automata, Grammars and Languages

Example: PDA

• Recall well-nested parentheses, e.g. (()) (()()), generated by $G_0: B \to \varepsilon \mid BB \mid (B)$

Transitions (read "input, pop → push"):
  s → p: ε, ε → $
  p → p: (, ε → A    // if ( then +1
  p → p: ), A → ε    // if ) then -1
  p → f: ε, $ → ε

$(p, w, \$) \vdash^* (p, \varepsilon, \$) \iff (\forall \text{ prefixes } x \text{ of } w)\ |x|_{(} \ge |x|_{)} \ \wedge\ |w|_{(} = |w|_{)}$

DPDA!

Page 32: Automata, Grammars and Languages

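Example: PDA

$L = \{ a^n b^n \mid n \ge 0 \} \cup \{ a^m b^{2m} \mid m \ge 0 \}$

[State diagram: from the start state s, two branches each begin with ε, ε → $. One branch pushes one A per a (a, ε → A) and then pops one A per b (b, A → ε). The other branch pushes two A's per a (a, ε → A followed by ε, ε → A, "push a 2nd A") and then pops one A per b (b, A → ε). Acceptance is via ε, $ → ε.]

• "guesses" which pattern
• "checks" whether the guess is correct
• accepts iff there is a correct guess that checks

$(s, \varepsilon, \varepsilon) \vdash^0 (s, \varepsilon, \varepsilon) \wedge s \in F \Rightarrow \varepsilon$ is recognized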

Page 33: Automata, Grammars and Languages

CFG ⇔ PDA

• Thm 2.20: A language is CF iff some PDA recognizes it. There are algorithms for converting a grammar to an equivalent automaton, and conversely.

• Lemma 2.21: There is an algorithm for constructing, from any CFG G, a PDA M such that L(G) = L(M).

Pf: In constructing a PDA, we can permit, without losing generality, "multi-push" moves such as $a, A \to X_1 X_2 \cdots X_t$ where $A, X_i \in \Gamma$. When t > 1 we may break a multi-push into a sequence of single-push moves by introducing new states $n_t, n_{t-1}, \dots, n_2$: first $a, A \to X_t$, then $\varepsilon, \varepsilon \to X_{t-1}$, and so on down to $\varepsilon, \varepsilon \to X_1$.

Henceforth we will allow multi-push moves in our PDAs.

Page 34: Automata, Grammars and Languages

CFG ⇒ PDA

• Idea: use nondeterminism. Given G, construct a PDA P that loads S on the stack and simulates a leftmost derivation on the stack:

  • When a variable symbol A comes to the stack top, "guess" a grammar rule A → α, pop A and push α

  • When a terminal character comes to the stack top, compare it to the next input symbol.

    If they match, pop the top and advance the input ("check off")

    If they fail to match, jam (not an accepting computation)

If the input holds a word in L(G) and P guesses the correct leftmost derivation (rules to apply), then all the input characters will be checked off against those at the top of the stack, and the stack will empty as the last input is checked off. Otherwise at some point the PDA will jam.

Page 35: Automata, Grammars and Languages

CFG ⇒ PDA (cont'd)

• Given $G = (V, \Sigma, R, S)$ construct $P = (Q, \Sigma, \Gamma, \delta, q_{start}, F)$:

  States: $Q = \{ q_{start}, q_{loop}, q_{accept} \}$
  Input alphabet: $\Sigma$
  Stack alphabet: $\Gamma = V \cup \Sigma \cup \{\$\}$
  Start state: $q_{start}$
  Accept states: $F = \{ q_{accept} \}$
  Transition function:
    Initialize stack: $\delta(q_{start}, \varepsilon, \varepsilon) = \{ (q_{loop}, S\$) \}$
    Simulate rules: $(\forall A \in V)\ \delta(q_{loop}, \varepsilon, A) = \{ (q_{loop}, \alpha) \mid A \to \alpha \in R \}$
    Check off terminals: $(\forall a \in \Sigma)\ \delta(q_{loop}, a, a) = \{ (q_{loop}, \varepsilon) \}$
    Detect null stack & accept: $\delta(q_{loop}, \varepsilon, \$) = \{ (q_{accept}, \varepsilon) \}$

[State diagram: $q_{start} \to q_{loop}$ on ε, ε → S$; loops on $q_{loop}$: ε, A → α for each rule A → α in R, and a, a → ε for each a in Σ; $q_{loop} \to q_{accept}$ on ε, $ → ε]
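The construction can be tried out on a small grammar. The sketch below is illustrative only and rests on several assumptions: symbols are single characters, `""` stands for ε, the names `cfg_to_pda` and `accepts` are invented here, the terminal alphabet is read off the rule bodies, and the search is cut off by a stack-height bound `max_stack` (default `len(w) + 2`) that happens to suffice for the example grammar but is not a general guarantee.

```python
from collections import deque

def cfg_to_pda(rules, start):
    """rules: dict variable -> list of right-hand sides (strings; "" is ε)."""
    terminals = {c for rhss in rules.values() for rhs in rhss for c in rhs} - set(rules)
    delta = {("q_start", "", ""): {("q_loop", start + "$")}}      # initialize stack
    for A, rhss in rules.items():
        delta[("q_loop", "", A)] = {("q_loop", rhs) for rhs in rhss}   # simulate rules
    for a in terminals:
        delta[("q_loop", a, a)] = {("q_loop", "")}                 # check off terminals
    delta[("q_loop", "", "$")] = {("q_accept", "")}                # detect null stack
    return delta

def accepts(delta, w, max_stack=None):
    """Breadth-first search over configurations, pruning stacks taller than max_stack."""
    max_stack = len(w) + 2 if max_stack is None else max_stack    # heuristic bound (assumption)
    start = ("q_start", w, "")
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state == "q_accept" and not rest:
            return True
        for a in {rest[:1], ""}:
            for x in {stack[:1], ""}:
                for nxt, y in delta.get((state, a, x), ()):
                    conf = (nxt, rest[len(a):], y + stack[len(x):])   # multi-push allowed
                    if len(conf[2]) <= max_stack and conf not in seen:
                        seen.add(conf)
                        queue.append(conf)
    return False

pda = cfg_to_pda({"S": ["", "SS", "sSd"]}, "S")
print(accepts(pda, "ssddsd"), accepts(pda, "sdd"))
```

The bound is needed because a rule such as S → SS lets the stack grow without consuming input; a production-quality recognizer would use a parsing algorithm instead of blind search.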

Page 36: Automata, Grammars and Languages

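CFG ⇒ PDA (cont'd)

• Ex: $G_0: S \to \varepsilon \mid SS \mid sSd$

[State diagram: $q_{start} \to q_{loop}$ on ε, ε → S$; loops on $q_{loop}$: ε, S → ε; ε, S → SS; ε, S → sSd; s, s → ε; d, d → ε; $q_{loop} \to q_{accept}$ on ε, $ → ε]

$S \Rightarrow_L^* xA\alpha \iff (q_{loop}, xu, S\$) \vdash^* (q_{loop}, u, A\alpha\$)$

$\therefore\ S \Rightarrow_L^* w \iff (q_{loop}, w, S\$) \vdash^* (q_{loop}, \varepsilon, \$) \vdash (q_{accept}, \varepsilon, \varepsilon)$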

Page 37: Automata, Grammars and Languages

CFG ⇒ PDA (cont'd)

G (leftmost derivation) alongside P (computation):

                  $(q_{start}, ssddsd, \varepsilon) \vdash$
S                 $(q_{loop}, ssddsd, S\$) \vdash$
⇒ SS              $(q_{loop}, ssddsd, SS\$) \vdash$
⇒ sSdS            $(q_{loop}, ssddsd, sSdS\$) \vdash$
                  $(q_{loop}, sddsd, SdS\$) \vdash$
⇒ ssSddS          $(q_{loop}, sddsd, sSddS\$) \vdash$
                  $(q_{loop}, ddsd, SddS\$) \vdash$
⇒ ssddS           $(q_{loop}, ddsd, ddS\$) \vdash$
                  $(q_{loop}, dsd, dS\$) \vdash$
                  $(q_{loop}, sd, S\$) \vdash$
⇒ ssddsSd         $(q_{loop}, sd, sSd\$) \vdash$

Page 38: Automata, Grammars and Languages

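CFG ⇒ PDA (cont'd)

G (leftmost derivation) alongside P (computation), continued:

                  $(q_{loop}, d, Sd\$) \vdash$
⇒ ssddsd          $(q_{loop}, d, d\$) \vdash$
                  $(q_{loop}, \varepsilon, \$) \vdash$
                  $(q_{accept}, \varepsilon, \varepsilon)$

CFG leftmost derivation: $S \Rightarrow_L^* ssddsd$, ∴ ssddsd ∈ L(G)

PDA computation: $(q_{start}, ssddsd, \varepsilon) \vdash^* (q_{accept}, \varepsilon, \varepsilon)$, ∴ ssddsd ∈ L(P)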

Page 39: Automata, Grammars and Languages

PDA ⇒ CFG

Lemma 2.27: There is an algorithm for constructing, from any PDA P, a CFG G such that L(G) = L(P).

Pf: Given a PDA $P = (Q, \Sigma, \Gamma, \delta, q_0, F)$ we can convert it into a PDA with the following simplified structure:

• it has only one accept state, $F = \{ q_{accept} \}$:
  • add ε-transitions $f \to q_{accept}$ (ε, ε → ε) from the multiple accept states

• it empties its stack just before entering the accept state:
  • loop on a state that just pops: $q_{pop} \to q_{pop}$ on ε, X → ε for each $X \in \Gamma$

• each PDA transition is either a "pure push" (a, ε → X) or a "pure pop" (a, X → ε)
  • introduce new intermediate states

Page 40: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

• a, X → Y becomes a, X → ε followed by ε, ε → Y
• a, ε → ε becomes a, ε → X followed by ε, X → ε

• Idea of proof: construct G with variables $A_{pq}$ for each p and q in the set of states Q. Arrange that if $A_{pq}$ generates the terminal string x, then PDA P started in state p with an empty stack on input string x has a computation that reaches state q with an empty stack. And conversely, if P started in state p with an empty stack has a computation on input string x that reaches state q with an empty stack, then $A_{pq} \Rightarrow_G^* x$.

How does P, when started on an empty stack in state p, operate on an input string x, ending with an empty stack in state q?
  • First move must be a push (a, ε → X)
  • Last move must be a pop (b, X → ε)

Page 41: Automata, Grammars and Languages

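PDA ⇒ CFG (cont'd)

Trace the computation of P on x starting in state p with empty stack, and ending in state q with empty stack: (1) stack never empties

$A_{pq} \Rightarrow_G a A_{rs} b$

[Fig. 1: stack height plotted against input. The first move reads a, pushes X, and goes from p to r; the last move reads b, pops X, and goes from s to q; in between the stack never drops below X. The middle segment of the input is y with $A_{rs} \Rightarrow_G^* y$; the whole input is x with $A_{pq} \Rightarrow_G^* x$.]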

Page 42: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Trace the computation of P on x starting in state p with empty stack, and ending in state q with empty stack: (2) stack empties somewhere

$A_{pq} \Rightarrow_G A_{pr} A_{rq}$

[Fig. 2: stack height plotted against input. The stack returns to empty at an intermediate state r; the input before that point is y with $A_{pr} \Rightarrow_G^* y$, and the input after it is z with $A_{rq} \Rightarrow_G^* z$.]

Page 43: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Construction. Given PDA $P = (Q, \Sigma, \Gamma, \delta, q_0, \{ q_{accept} \})$ construct $G = (V, \Sigma, R, A_{q_0, q_{accept}})$ with the following rules in R:

• $(\forall p \in Q)\ A_{pp} \to \varepsilon$

• $(\forall p, q, r \in Q)\ A_{pq} \to A_{pr} A_{rq}$

• $(\forall p, q, r, s \in Q)(\forall X \in \Gamma)(\forall a, b \in \Sigma_\varepsilon)$: if P has the transitions $p \to r$ on a, ε → X and $s \to q$ on b, X → ε, then $A_{pq} \to a A_{rs} b$
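The three rule schemas can be generated mechanically. The sketch below is only an illustration: the function name `pda_to_cfg_rules`, the 5-tuple encoding of transitions, and the representation of the variable $A_{pq}$ as the pair `(p, q)` are assumptions of this example, and the PDA is assumed to be already in the simplified pure-push/pure-pop form.

```python
from itertools import product

def pda_to_cfg_rules(states, transitions):
    """transitions: set of (p, a, pop, push, q); exactly one of pop/push is non-empty.

    Returns a set of rules (head, body); the variable A_pq is the pair (p, q),
    terminals are strings, and the empty body () stands for ε.
    """
    rules = set()
    for p in states:
        rules.add(((p, p), ()))                                   # A_pp -> ε
    for p, r, q in product(states, repeat=3):
        rules.add(((p, q), ((p, r), (r, q))))                     # A_pq -> A_pr A_rq
    pushes = [tr for tr in transitions if tr[3]]
    pops = [tr for tr in transitions if tr[2]]
    for (p, a, _, X, r), (s, b, Y, _, q) in product(pushes, pops):
        if X == Y:                                                # same symbol pushed and popped
            body = tuple(sym for sym in (a, (r, s), b) if sym != "")
            rules.add(((p, q), body))                             # A_pq -> a A_rs b
    return rules

# A small PDA for { #w# | w a balanced string of parentheses }, states s, q, f
TRANS = {("s", "#", "", "$", "q"), ("q", "(", "", "A", "q"),
         ("q", ")", "A", "", "q"), ("q", "#", "$", "", "f")}
RULES = pda_to_cfg_rules(["s", "q", "f"], TRANS)
print(len(RULES))
```

Many of the generated rules are useless (their variables derive no terminal string); removing them is a separate clean-up pass that this sketch does not attempt.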

Page 44: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Claim 2.30: If $A_{pq} \Rightarrow_G^* x$ then $(p, x, \varepsilon) \vdash_P^* (q, \varepsilon, \varepsilon)$.

Pf: by induction on the length k of a derivation in G.

Base: k = 1. The only derivations of length 1 are $A_{pp} \Rightarrow_G^1 \varepsilon$, and we have $(p, \varepsilon, \varepsilon) \vdash_P^* (p, \varepsilon, \varepsilon)$.

Step: Assume (IH) the claim is true for derivations of at most k steps. Want: the claim is true for derivations of k+1 steps. Suppose that $A_{pq} \Rightarrow_G^{k+1} x$. The first derivation step is either of the form $A_{pq} \Rightarrow_G a A_{rs} b$ or $A_{pq} \Rightarrow_G A_{pr} A_{rq}$.

Case $A_{pq} \Rightarrow_G a A_{rs} b$. Then $x = ayb$ with $A_{rs} \Rightarrow_G^k y$. So by IH, $(r, y, \varepsilon) \vdash_P^* (s, \varepsilon, \varepsilon)$. By construction, since $A_{pq} \to a A_{rs} b$ is a rule of G,

Page 45: Automata, Grammars and Languages

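PDA ⇒ CFG (cont'd)

(Case $A_{pq} \Rightarrow_G a A_{rs} b$, continued) there is an $X \in \Gamma$ with $(r, X) \in \delta(p, a, \varepsilon)$ & $(q, \varepsilon) \in \delta(s, b, X)$.

$\therefore\ (p, x, \varepsilon) \vdash_P (r, yb, X) \vdash_P^* (s, b, X) \vdash_P (q, \varepsilon, \varepsilon)$

Case $A_{pq} \Rightarrow_G A_{pr} A_{rq}$. Then $x = yz$ with $A_{pr} \Rightarrow_G^{\le k} y$ & $A_{rq} \Rightarrow_G^{\le k} z$.

So by IH, $(p, y, \varepsilon) \vdash_P^* (r, \varepsilon, \varepsilon)$ & $(r, z, \varepsilon) \vdash_P^* (q, \varepsilon, \varepsilon)$.

Putting these together: $(p, yz, \varepsilon) \vdash_P^* (r, z, \varepsilon) \vdash_P^* (q, \varepsilon, \varepsilon)$. ∎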

Page 46: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Claim 2.31: If $(p, x, \varepsilon) \vdash_P^* (q, \varepsilon, \varepsilon)$ then $A_{pq} \Rightarrow_G^* x$.

Pf: by induction on the length k of a computation in P.

Base: k = 0. The only computations of length 0 are $(p, x, \varepsilon) \vdash_P^0 (p, \varepsilon, \varepsilon)$ where x = ε. By construction $A_{pp} \Rightarrow_G^1 \varepsilon$.

Step: Assume (IH) the claim is true for computations $(p, x, \varepsilon) \vdash_P^k (q, \varepsilon, \varepsilon)$ of at most k steps. Want: the claim is true for computations of k+1 steps. Suppose that $(p, x, \varepsilon) \vdash_P^{k+1} (q, \varepsilon, \varepsilon)$. Two cases: either the stack does not empty in the midst of this computation (Fig. 1) or it becomes empty during the computation (Fig. 2). Call these Case 1 and Case 2.

Page 47: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Case 1: See Fig. 1. The symbol X pushed in the 1st move is the same as that popped in the last move. Let the 1st and last moves be governed by the push/pop transitions $(r, X) \in \delta(p, a, \varepsilon)$ & $(q, \varepsilon) \in \delta(s, b, X)$.

By construction, there is a rule in G: $A_{pq} \to a A_{rs} b$.

Let x = ayb. Since $(r, y, X) \vdash_P^{k-1} (s, \varepsilon, X)$ and the bottom X is never touched, we must have $(r, y, \varepsilon) \vdash_P^{k-1} (s, \varepsilon, \varepsilon)$. By IH, $A_{rs} \Rightarrow_G^* y$.

Then, using $A_{pq} \Rightarrow_G^1 a A_{rs} b$, we conclude $A_{pq} \Rightarrow_G^* ayb = x$.

Page 48: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Case 2: See Fig. 2. Let r be the intermediate state where the stack becomes empty. Then $(\exists y, z)\ x = yz$ with $(p, y, \varepsilon) \vdash_P^{\le k} (r, \varepsilon, \varepsilon)$ & $(r, z, \varepsilon) \vdash_P^{\le k} (q, \varepsilon, \varepsilon)$.

By the IH, $A_{pr} \Rightarrow_G^* y$ and $A_{rq} \Rightarrow_G^* z$.

Since by construction there is a rule in G of the form $A_{pq} \to A_{pr} A_{rq}$, then $A_{pq} \Rightarrow_G A_{pr} A_{rq} \Rightarrow_G^* yz = x$. ∎

Page 49: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Ex: $L(P) = \{ \#w\# \mid w \in \{(, )\}^* \text{ is a well-balanced string of parentheses} \}$

P: $Q = \{s, q, f\}$, $\Sigma = \{\#, (, )\}$, $\Gamma = \{A, \$\}$

Transitions (read "input, pop → push"):
  s → q: #, ε → $
  q → q: (, ε → A
  q → q: ), A → ε
  q → f: #, $ → ε

Rules of G:

(1) push-pop pairs (1st kind):
  pair s → q (#, ε → $) with q → f (#, $ → ε): $A_{sf} \to \# A_{qq} \#$
  pair q → q ((, ε → A) with q → q (), A → ε): $A_{qq} \to ( A_{qq} )$

Page 50: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Note: If $(p, -, -) \nvdash_P^* (p', -, -)$ (p′ unreachable from p) then $\{ x \mid A_{pp'} \Rightarrow_G^* x \} = \emptyset$ (abbreviated $A_{pp'} = \emptyset$). Such variables are useless; all rules involving them on left or right sides can be eliminated as useless productions. For this grammar, $A_{fq} = \emptyset$, $A_{qs} = \emptyset$, $A_{fs} = \emptyset$.

(2) Rules of the 2nd kind (with useless rules removed, only 10 of 27 survive), in the order s, q, f:

$A_{ss} \to A_{ss} A_{ss}$
$A_{sq} \to A_{ss} A_{sq}$
$A_{sq} \to A_{sq} A_{qq}$
$A_{sf} \to A_{ss} A_{sf}$
$A_{sf} \to A_{sq} A_{qf}$
$A_{sf} \to A_{sf} A_{ff}$
$A_{qq} \to A_{qq} A_{qq}$
$A_{qf} \to A_{qq} A_{qf}$
$A_{qf} \to A_{qf} A_{ff}$
$A_{ff} \to A_{ff} A_{ff}$

Page 51: Automata, Grammars and Languages

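PDA ⇒ CFG (cont'd)

(3) Rules of the 3rd kind:

$A_{ss} \to \varepsilon$
$A_{qq} \to \varepsilon$
$A_{ff} \to \varepsilon$

Combining all rules with the same LHS:

$A_{ss} \to A_{ss} A_{ss} \mid \varepsilon$
$A_{sq} \to A_{ss} A_{sq} \mid A_{sq} A_{qq}$
$A_{sf} \to \# A_{qq} \# \mid A_{ss} A_{sf} \mid A_{sq} A_{qf} \mid A_{sf} A_{ff}$
$A_{qq} \to ( A_{qq} ) \mid A_{qq} A_{qq} \mid \varepsilon$
$A_{qf} \to A_{qq} A_{qf} \mid A_{qf} A_{ff}$
$A_{ff} \to A_{ff} A_{ff} \mid \varepsilon$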

Page 52: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Simplify: easy to see that $A_{ss} = \varepsilon$ and $A_{ff} = \varepsilon$ (each generates only ε).

Substituting this into the rules, and eliminating useless rules like $X \to X$:

$A_{sq} \to A_{sq} A_{qq}$
$A_{sf} \to \# A_{qq} \# \mid A_{sq} A_{qf}$
$A_{qq} \to ( A_{qq} ) \mid A_{qq} A_{qq} \mid \varepsilon$
$A_{qf} \to A_{qq} A_{qf}$

Page 53: Automata, Grammars and Languages

PDA ⇒ CFG (cont'd)

Another kind of useless rule: $A_{sq}, A_{qf}$ generate no terminal strings. Eliminate these variables and any rules mentioning them. The final simplified grammar is:

$A_{sf} \to \# A_{qq} \#$
$A_{qq} \to ( A_{qq} ) \mid A_{qq} A_{qq} \mid \varepsilon$

Note: we chose to use endmarkers # for clarity, but these could have been ε (input symbols can be anything in $\Sigma_\varepsilon$), leading to the familiar grammar

$A_{sf} \to A_{qq}$
$A_{qq} \to ( A_{qq} ) \mid A_{qq} A_{qq} \mid \varepsilon$

Page 54: Automata, Grammars and Languages

Closure Properties

Regular Ops. The CFLs are closed under $\cup$, $\cdot$, $^*$. Pf: Homework

Intersection. The CFLs are not closed under intersection.

Example: Consider the two CFLs $L_1 = \{ a^n b^n \mid n \ge 0 \} \cdot c^*$ and $L_2 = a^* \cdot \{ b^n c^n \mid n \ge 0 \}$.

Then $L_1 \cap L_2 = \{ a^n b^n c^n \mid n \ge 0 \}$. We will later see (CF Pumping Lemma) that this last is not a CFL. ∎

However, if $L_1$ is regular and $L_2$ is CF, then $L_1 \cap L_2$ is CF.

Page 55: Automata, Grammars and Languages

Closure Properties (cont'd)

• Thm: The class of CFLs is closed under intersection with regular languages.

Pf: Assume $L = L(P_1)$, $P_1 = (Q_1, \Sigma_1, \Gamma_1, \delta_1, s_1, F_1)$ and $R = L(M_2)$, $M_2 = (Q_2, \Sigma_2, \delta_2, s_2, F_2)$.

Construction. Construct a "cross-product pda" M as follows:

$M = (Q_1 \times Q_2,\ \Sigma_1 \cup \Sigma_2,\ \Gamma_1,\ \delta,\ [s_1, s_2],\ F_1 \times F_2)$

where the transition function δ is defined by: $([p_1, p_2], Y) \in \delta([q_1, q_2], a, X)$ provided $(p_1, Y) \in \delta_1(q_1, a, X)$ and $p_2 = \delta_2(q_2, a)$ (for $a = \varepsilon$, read $\delta_2(q_2, \varepsilon) = q_2$).

Machine M simulates the two given machines "in parallel", keeping each machine state in one component of the compound state [ , ].

Page 56: Automata, Grammars and Languages

What is Not Context-Free?

• PDAs have a limited computing ability. They cannot, for example, recognize repeated strings like w#w or strings that "count" in more than 2 places, such as $a^n b^n c^n$.

• We will show that some languages are not CF using a CF Pumping Lemma, which gives a property that all CFLs must have. Then, to show that a language L is not CF, we argue that it lacks this pumping property.

• Closure properties of CFLs can sometimes be used to simplify non-CFLs and make a pumping argument easier.

Page 57: Automata, Grammars and Languages

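CF Pumping Lemma

• Thm [Pumping Lemma for CFLs]. Suppose that L is an infinite CF language. Then

$(\exists p)(\forall w)[\, w \in L \wedge |w| \ge p \Rightarrow (\exists u, v, x, y, z)(\, w = uvxyz \wedge vy \ne \varepsilon \wedge |vxy| \le p \wedge (\forall i \ge 0)\ u v^i x y^i z \in L \,)\,]$

• For comparison, here is the Regular P.L.:

$(\exists p)(\forall w)[\, w \in L \wedge |w| \ge p \Rightarrow (\exists x, y, z)(\, w = xyz \wedge y \ne \varepsilon \wedge |xy| \le p \wedge (\forall i \ge 0)\ x y^i z \in L \,)\,]$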

Page 58: Automata, Grammars and Languages

CF Pumping Lemma (cont'd)

• Pf: Let $L - \{\varepsilon\} = L(G)$ where G is a CFG in Chomsky Normal Form (Text, Theorem 2.9), i.e. a CFG in which all rules are of the (schematic) forms A → BC or A → a (a ≠ ε). If $w \in L(G)$ is "sufficiently long", then any derivation tree T for w must contain a "long" path. More precisely:

• Claim 1: If the derivation tree T for w has no path longer than h, then $|w| \le 2^{h-1}$.

Pf: Induction on h. Base: h = 1. The only possible tree is S with the single child a, and $|a| = 1 = 2^0$. Step: Assume h > 1. Let T have all paths of length ≤ h and be of the form (in CNF) S with two subtrees $T_1$, $T_2$.

Page 59: Automata, Grammars and Languages

CF Pumping Lemma (cont'd)

[Figure: T = root S with subtrees $T_1$ and $T_2$, whose frontiers are s and t, so w = st]

• Then $T_1, T_2$ have all paths of length ≤ h−1. By IH, $|s| \le 2^{h-2}$ and $|t| \le 2^{h-2}$, which implies $|w| \le 2^{h-1}$.

Conversely, if a generated string is at least $2^h$ long, then its parse tree must be at least h+1 high.

G has |V| variables. Choose $p = 2^{|V|+1}$. If $w \in L$ and $|w| \ge p$, then by Claim 1 any parse tree T for w has a path of length at least |V|+2. Such a path has at least |V|+3 nodes. ∴ some variable appears twice on the path (note the leaf node is a terminal).

Page 60: Automata, Grammars and Languages

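CF Pumping Lemma (cont'd)

• Picture:

$|w| \ge p = 2^{|V|+1}$ ⇒ height ≥ |V|+2 ⇒ nodes on the path ≥ |V|+3 ⇒ variables on the path ≥ |V|+2 ⇒ some variable R repeats

[Figure: parse tree T with frontier w = uvxyz and a path on which the variable R occurs twice. The subtree $T_1$ rooted at the upper R has frontier vxy; the subtree $T_2$ rooted at the lower R has frontier x.]

Choose the repeated R among the bottom |V|+1 variables on the path. Then $T_1$ has height ≤ |V|+2, so by Claim 1 its frontier has length ≤ $2^{|V|+1} = p$.

$\therefore\ |vxy| \le p$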

Page 61: Automata, Grammars and Languages

CF Pumping Lemma (cont'd)

(1) Center portion is not too long: $|vxy| \le p$

(2) Pumped portion is not empty: v, y cannot both = ε.

[Figure: the upper R is expanded by a rule R → BC; the subtree $T_2$ (lower R, frontier x) lies either under B or under C. The other child contributes a nonempty string to v or to y, because in CNF no variable generates ε.]

Page 62: Automata, Grammars and Languages

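CF Pumping Lemma (cont'd)

(3) Pumped strings are in L: the following are all parse trees

[Figure: the tree with frontier u v x y z; the tree obtained by grafting a copy of the upper-R subtree in place of the lower R, with frontier u vv x yy z; the tree with one more copy, frontier u vvv x yyy z; and the tree with the lower-R subtree in place of the upper R, with frontier u x z]

$\therefore\ (\forall i \ge 0)\ u v^i x y^i z \in L$ ∎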

Page 63: Automata, Grammars and Languages

CF Pumping: Applications

• Ex: $L = \{ a^n b^n c^n \mid n \ge 0 \}$ is not a CFL.

Pf: Suppose it is CF. Then by the Pumping Lemma, $(\exists p)(\forall w \in L, |w| \ge p)(\exists u, v, x, y, z)\ uvxyz = w$ & $vy \ne \varepsilon$ & $(\forall i \ge 0)\ u v^i x y^i z \in L$.

Pick p as the constant guaranteed, choose n ≥ p/3 and $w = a^n b^n c^n = uvxyz$, $vy \ne \varepsilon$.

Where is vxy? Cases: Assume first that $v \ne \varepsilon$.

[Figure: six cases for the position of v within $a \cdots a\, b \cdots b\, c \cdots c$]

Page 64: Automata, Grammars and Languages

C SC 473 Automata, Grammars & Languages 66

CF Pumping: Applications• In cases 1-3 has an imbalance. In case 4 it has

a b before a. In case 5 it has a c before b. In case 6 it has an a after a c. In any case, there is a contradiction to the pumped

word being in L.

The case where is symmetric. Contradiction.
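For small p, the case analysis for $a^n b^n c^n$ can be cross-checked by brute force: try every split w = uvxyz with vy ≠ ε and |vxy| ≤ p and see whether any of them survives pumping. This is only a sanity check under assumptions, not a proof: the names `pumpable`, `in_anbncn`, and `in_anbn` are invented here, and only the exponents i = 0..3 are tested.

```python
def in_anbncn(s: str) -> bool:
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

def in_anbn(s: str) -> bool:
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def pumpable(w: str, p: int, member, max_i: int = 3) -> bool:
    """True iff some split w = uvxyz with vy != ε and |vxy| <= p
    keeps u v^i x y^i z in the language for every i in 0..max_i."""
    n = len(w)
    for a in range(n + 1):                          # u = w[:a]
        hi = min(a + p, n)                          # vxy must end by position hi
        for b in range(a, hi + 1):                  # v = w[a:b]
            for c in range(b, hi + 1):              # x = w[b:c]
                for d in range(c, hi + 1):          # y = w[c:d], z = w[d:]
                    u, v, x, y, z = w[:a], w[a:b], w[b:c], w[c:d], w[d:]
                    if v + y and all(member(u + v * i + x + y * i + z)
                                     for i in range(max_i + 1)):
                        return True
    return False

for p in range(1, 5):
    w = "a" * p + "b" * p + "c" * p
    print(p, pumpable(w, p, in_anbncn))             # no split works
```

By contrast, `pumpable("aabb", 2, in_anbn)` finds the split u = a, v = a, x = ε, y = b, z = b, as expected for a context-free language.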

Cor. The CFLs are not closed under complementation.

Pf: $L = \{ a^p b^q c^r \mid p \ne q \vee q \ne r \vee p \ne r \}$ is a CFL. But $\overline{L} \cap a^* b^* c^* = \{ a^i b^i c^i \mid i \ge 0 \}$ is not a CFL. Therefore $\overline{L}$ cannot be CF.

Ex: $L = \{ a^i \mid i \text{ prime} \}$ is not CF. Proof similar to the regular case.

Ex: $L = \{ a^{i^2} \mid i \ge 0 \}$ is not CF.

Page 65: Automata, Grammars and Languages

Pumping: Applications (cont'd)

Ex: $L = \{ w \in \{a, b, c\}^* \mid \#_a(w) = \#_b(w) = \#_c(w) \}$ is not CF.

Pf: Its intersection with $a^* b^* c^*$ is $\{ a^n b^n c^n \mid n \ge 0 \}$, which is not a CFL. Since the CFLs are closed under intersection with regular languages, L cannot be CF.

Ex: $L = \{ a^i b^j c^i d^j \mid i \ge 1, j \ge 1 \}$ is not CF.

Pf: By pumping on the word $w = a^n b^n c^n d^n$. Similar to Text, Example 2.38.

Ex: $L = \{ ww \mid w \in \{0, 1\}^* \}$ is not CF.

Pf: $L \cap 1 0^+ 1 0^+ 1 1 0^+ 1 0^+ 1 = \{ 1 0^n 1 0^m 1 1 0^n 1 0^m 1 \mid m, n \ge 1 \}$. Pump on the latter language in a way similar to the previous example to show it is not CF.