CSC 3130: Automata theory and formal languages. Andrej Bogdanov, http://www.cse.cuhk.edu.hk/~andrejb/csc3130. The Chinese University of Hong Kong. LR(k) grammars. Fall 2008.
Page 1: CSC 3130: Automata theory and formal languages

CSC 3130: Automata theory and formal languages

Andrej Bogdanov

http://www.cse.cuhk.edu.hk/~andrejb/csc3130

The Chinese University of Hong Kong

LR(k) grammars

Fall 2008

Page 2: CSC 3130: Automata theory and formal languages

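LR(0) example from last time

Grammar: A → aAb | ab

DFA states (sets of valid items):

1: A → •aAb, A → •ab
2: A → a•Ab, A → a•b, A → •aAb, A → •ab
3: A → ab•
4: A → aA•b
5: A → aAb•

Transitions: 1 --a--> 2, 2 --a--> 2, 2 --b--> 3, 2 --A--> 4, 4 --b--> 5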

Page 3: CSC 3130: Automata theory and formal languages

LR(0) parsing example revisited

Grammar: A → aAb | ab
Derivation: A ⇒ aAb ⇒ aabb

Stack        Input    Action
1            aabb     S
1a2          abb      S
1a2a2        bb       S
1a2a2b3      b        R
1a2A4        b        S
1a2A4b5      ε        R
1A           ε

[Figure: the DFA with states 1 to 5 from the previous slide, and the parse tree of aabb built up as the input is read]
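The trace above can be replayed mechanically. The following is a minimal sketch, assuming the five-state DFA for A → aAb | ab is hard-coded; the names `GOTO`, `REDUCE` and `lr0_parse` are made up for this illustration and are not from the lecture.

```python
# Hypothetical encoding of the DFA for A -> aAb | ab (states 1 to 5).
GOTO = {(1, "a"): 2, (2, "a"): 2, (2, "b"): 3, (2, "A"): 4, (4, "b"): 5}
# Reduce states: the only valid item is complete. Value: (left side, length of right side).
REDUCE = {3: ("A", 2), 5: ("A", 3)}

def lr0_parse(word):
    """Return a list of (stack, remaining input, action) steps, or None if word is rejected."""
    stack = [1]          # states and grammar symbols alternate, starting with state 1
    pos = 0
    trace = []
    while True:
        state = stack[-1]
        snapshot = "".join(map(str, stack))
        if state in REDUCE:
            lhs, length = REDUCE[state]
            trace.append((snapshot, word[pos:], "R"))
            del stack[-2 * length:]              # pop the right side (symbols and states)
            stack.append(lhs)
            nxt = GOTO.get((stack[-2], lhs))
            if nxt is None:
                # No transition on A from state 1: accept if all input was consumed.
                ok = len(stack) == 2 and pos == len(word)
                trace.append(("".join(map(str, stack)), word[pos:], "accept" if ok else "error"))
                return trace if ok else None
            stack.append(nxt)
        else:
            if pos == len(word) or (state, word[pos]) not in GOTO:
                return None                      # no shift possible
            trace.append((snapshot, word[pos:], "S"))
            stack += [word[pos], GOTO[(state, word[pos])]]
            pos += 1

for stack, rest, action in lr0_parse("aabb"):
    print(f"{stack:10} {rest:6} {action}")
```

The printed stacks are the same as in the table: 1, 1a2, 1a2a2, 1a2a2b3, 1a2A4, 1a2A4b5, 1A.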

Page 4: CSC 3130: Automata theory and formal languages

Meaning of LR(0) items

A → α•Xβ

• The dot • marks the focus; the part to the right of the dot is the undiscovered part

• NFA transitions to:
– X → •γ: shift focus to subtree rooted at X (if X is nonterminal)
– A → αX•β: move past subtree rooted at X

Page 5: CSC 3130: Automata theory and formal languages

Outline of LR(0) parsing algorithm

• Algorithm can perform two actions:
– shift (S): no complete item is valid
– reduce (R): there is one valid item, and it is complete

• What if:
– some valid items complete, some not: S/R conflict
– more than one valid complete item: R/R conflict
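The four cases can be written as one small decision function. This is a sketch under an assumed representation: an item is a tuple `(lhs, rhs, dot)`, and `classify` is a name invented here.

```python
def classify(items):
    """items: list of (lhs, rhs, dot) tuples, the LR(0) items valid at this point."""
    complete = [it for it in items if it[2] == len(it[1])]   # dot at the very end
    if not complete:
        return ["shift"]
    if len(items) == 1:
        return ["reduce"]
    verdict = []
    if len(complete) < len(items):
        verdict.append("S/R conflict")      # some valid items complete, some not
    if len(complete) > 1:
        verdict.append("R/R conflict")      # more than one valid complete item
    return verdict

# Item sets for the grammar A -> aAb | ab
print(classify([("A", "aAb", 1), ("A", "ab", 1), ("A", "aAb", 0), ("A", "ab", 0)]))
print(classify([("A", "ab", 2)]))
# A made-up item set that mixes a complete and an incomplete item
print(classify([("A", "a", 1), ("A", "aA", 1)]))
```

A grammar is LR(0) exactly when no reachable item set makes this function report a conflict.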

Page 6: CSC 3130: Automata theory and formal languages

Definition of LR(0) grammar

• A grammar is LR(0) if S/R, R/R conflicts never occur
– LR means parsing happens left to right and produces a rightmost derivation (in reverse)

• LR(0) grammars are unambiguous and have a fast parsing algorithm

• Unfortunately, they are not “expressive” enough to describe programming languages

Page 7: CSC 3130: Automata theory and formal languages

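Hierarchy of context-free grammars

[Figure: nested classes of grammars, from largest to smallest]

• context-free grammars: parse using CYK algorithm (slow)
• LR(∞) grammars
• LR(1) grammars
• LR(0) grammars: parse using LR(0) algorithm

Languages labelled in the figure: java, perl, python, …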

Page 8: CSC 3130: Automata theory and formal languages

A grammar that is not LR(0)

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a

Page 9: CSC 3130: Automata theory and formal languages

A grammar that is not LR(0)

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a •

[Figure: the partial parse trees that are consistent with the input so far, one for each possibility below]

possibilities: shift (3), reduce (4), reduce (5), shift (6)

valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a

S/R, R/R conflicts!
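The list of valid items after reading a can be recomputed with the usual closure and goto operations on item sets. This is a minimal sketch, assuming items are tuples `(lhs, rhs, dot)` and single characters stand for grammar symbols; the helper names are illustrative only.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}

def closure(items):
    """Add C -> •γ for every item whose dot stands before a nonterminal C."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for gamma in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], gamma, 0)
                if new not in items:
                    items.add(new)
                    work.append(new)
    return items

def goto(items, symbol):
    """Move the dot past symbol wherever possible, then take the closure."""
    moved = {(l, r, d + 1) for (l, r, d) in items if d < len(r) and r[d] == symbol}
    return closure(moved)

def show(item):
    lhs, rhs, dot = item
    return f"{lhs} -> {rhs[:dot]}•{rhs[dot:]}"

start = closure({("S", rhs, 0) for rhs in GRAMMAR["S"]})
after_a = goto(start, "a")
print(sorted(show(it) for it in after_a))

complete = [it for it in after_a if it[2] == len(it[1])]
print(len(complete), "complete items out of", len(after_a))
```

Two of the six items are complete (A → a• and B → a•), which is the R/R conflict; the remaining incomplete items give the S/R conflict.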

Page 10: CSC 3130: Automata theory and formal languages

Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a • [next symbol hidden: peek inside!]

valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a

[Figure: the same candidate parse trees as on the previous slide]

Page 11: CSC 3130: Automata theory and formal languages

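Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a •, next symbol (peek inside!): a

valid LR(0) items: A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a

parse tree must look like this:

[Figure: tree with root S and child A, where A expands to aA and the inner A starts with the next a]

action: shift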

Page 12: CSC 3130: Automata theory and formal languages

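Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a a •, next symbol (peek inside!): a

valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a

parse tree must look like this:

[Figure: tree with root S and child A, where A expands to aA twice and the innermost A starts with the next a]

action: shift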

Page 13: CSC 3130: Automata theory and formal languages

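Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a a a • (no more input)

valid LR(0) items: A → a•A, A → a•, A → •aA, A → •a

parse tree must look like this:

[Figure: tree with root S and child A, where A expands to aA twice and the innermost A expands to a]

action: reduce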

Page 14: CSC 3130: Automata theory and formal languages

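LR(0) items vs. LR(1) items

A → aAb | ab

LR(0): A → a•Ab

LR(1): [A → a•Ab, b]

[Figure: the same position in a parse tree shown twice, once described by the LR(0) item and once by the LR(1) item, which also records the symbol b]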

Page 15: CSC 3130: Automata theory and formal languages

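LR(1) items

• LR(1) items are of the form

[A → α•β, x] or [A → α•β, ε]

to represent this state in the parsing

[Figure: subtree rooted at A with α to the left of the dot and β to the right, followed in the input by the lookahead x]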

Page 16: CSC 3130: Automata theory and formal languages

Outline of LR(1) parsing algorithm

• Step 1: Build NFA that describes valid item updates

• Step 2: Convert NFA to DFA
– As in LR(0), DFA will have shift and reduce states

• Step 3: Run DFA on input, using stack to remember sequence of states
– Use lookahead to eliminate wrong reduce items

Page 17: CSC 3130: Automata theory and formal languages

Recall NFA transitions for LR(0)

• States of NFA will be items (plus a start state q0)

• For every item S → •α we have a transition

q0 --ε--> S → •α

• For every item A → α•Xβ we have a transition

A → α•Xβ --X--> A → αX•β

• For every item A → α•Cβ and production C → γ we have a transition

A → α•Cβ --ε--> C → •γ

Page 18: CSC 3130: Automata theory and formal languages

NFA transitions for LR(1)

• For every item [S → •α, ε] we have a transition

q0 --ε--> [S → •α, ε]

• For every item [A → α•Xβ, x] we have a transition

[A → α•Xβ, x] --X--> [A → αX•β, x]

• For every item [A → α•Cβ, x] and production C → γ we have a transition

[A → α•Cβ, x] --ε--> [C → •γ, y]

for every y in FIRST(βx)
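The third rule is the one that creates new lookaheads, so it may help to see it as a function from one item to its ε-successors. This is a sketch with several assumptions: items are tuples `(lhs, rhs, dot, lookahead)`, the character `$` stands in for the empty (end of input) lookahead, the grammar S → A | Bc, A → aA | a, B → a | ab is used as the example, and there are no ε-productions, so FIRST(βx) is decided by the first symbol of β, or by x when β is empty.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}
END = "$"   # assumption: marker for the empty lookahead

def first_of_symbol(sym, seen=()):
    """Terminals that can begin a string derived from sym (no ε-productions assumed)."""
    if sym not in GRAMMAR:
        return {sym}
    result = set()
    for rhs in GRAMMAR[sym]:
        if rhs[0] not in seen:                 # do not loop on left recursion
            result |= first_of_symbol(rhs[0], seen + (sym,))
    return result

def epsilon_targets(item):
    """All items [C -> •γ, y] reachable by one ε-transition from [A -> α•Cβ, x]."""
    lhs, rhs, dot, x = item
    if dot == len(rhs) or rhs[dot] not in GRAMMAR:
        return set()                           # dot is at the end or before a terminal
    c, beta = rhs[dot], rhs[dot + 1:]
    lookaheads = first_of_symbol(beta[0]) if beta else {x}   # FIRST(βx)
    return {(c, gamma, 0, y) for gamma in GRAMMAR[c] for y in lookaheads}

print(epsilon_targets(("S", "Bc", 0, END)))   # B-items with lookahead c
print(epsilon_targets(("S", "A", 0, END)))    # A-items inherit the lookahead $
```

In [S → •Bc, $] the symbol after B is c, so the new B-items carry lookahead c. In [S → •A, $] nothing follows A, so the A-items inherit the lookahead of the parent item.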

Page 19: CSC 3130: Automata theory and formal languages

FIRST sets

FIRST(α) is the set of terminals that occur on the left in some derivation starting from α

• Example

S → A (1) | cB (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

FIRST(a) = {a}
FIRST(A) = {a}
FIRST(S) = {a, c}
FIRST(bAc) = {b}
FIRST(BA) = {a}
FIRST(ε) = ∅
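The example values can be checked with a short fixed-point computation. This is a sketch that assumes the grammar has no ε-productions (true for the example), so FIRST of a string is determined by its first symbol; `first_sets` and `first_of` are names chosen here.

```python
GRAMMAR = {"S": ["A", "cB"], "A": ["aA", "a"], "B": ["a", "ab"]}

def first_sets(grammar):
    """FIRST of every nonterminal, computed by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                head = rhs[0]                       # assumes rhs is never empty
                add = first[head] if head in grammar else {head}
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first

def first_of(string, first):
    """FIRST of a string of grammar symbols; the empty string gives the empty set."""
    if not string:
        return set()
    head = string[0]
    return set(first[head]) if head in first else {head}

FIRST = first_sets(GRAMMAR)
for s in ["a", "A", "S", "bAc", "BA", ""]:
    print(repr(s), sorted(first_of(s, FIRST)))
```

With ε-productions the computation would also have to track which nonterminals can derive the empty string; that case is left out of this sketch.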

Page 20: CSC 3130: Automata theory and formal languages

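Explaining the transitions

[A → α•Xβ, x] --X--> [A → αX•β, x]

[Figure: subtree rooted at A with children α, X, β, followed by x; the dot moves from before X to after X]

[A → α•Cβ, x] --ε--> [C → •γ, y], y ∈ FIRST(βx)

[Figure: subtree rooted at A with children α, C, β, followed by x; the dot moves to the start of the subtree rooted at C, and y is the terminal that follows that subtree]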

Page 21: CSC 3130: Automata theory and formal languages

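Example

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

Part of the NFA:

q0 --ε--> [S → •A, ε]
q0 --ε--> [S → •Bc, ε]
[S → •A, ε] --A--> [S → A•, ε]
[S → •A, ε] --ε--> [A → •aA, ε]
[S → •A, ε] --ε--> [A → •a, ε]
[S → •Bc, ε] --B--> [S → B•c, ε]
[S → •Bc, ε] --ε--> [B → •a, c]
[S → •Bc, ε] --ε--> [B → •ab, c]
. . .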

Page 22: CSC 3130: Automata theory and formal languages

Convert NFA to DFA

• Each DFA state is a subset of LR(1) items, e.g.

[A → a•A, ε] [A → a•, ε] [B → a•, c] [B → a•b, c] [A → •aA, ε] [A → •a, ε]

• States can contain S/R, R/R conflicts

• But lookahead can always resolve such conflicts
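How the lookahead settles the choice inside such a state can be sketched as a lookup over the items. Assumptions in this sketch: items are tuples `(lhs, rhs, dot, lookahead)`, `$` stands in for the empty lookahead, and the function name `action` is invented for the illustration.

```python
END = "$"   # assumption: marker for the empty (end of input) lookahead

# The example DFA state, reached after reading a
STATE = {
    ("A", "aA", 1, END), ("A", "a", 1, END),
    ("B", "a", 1, "c"), ("B", "ab", 1, "c"),
    ("A", "aA", 0, END), ("A", "a", 0, END),
}

def action(state, lookahead):
    """Pick shift or reduce using one symbol of lookahead."""
    # Items that can shift: the dot stands directly before the lookahead symbol.
    shifts = [it for it in state if it[2] < len(it[1]) and it[1][it[2]] == lookahead]
    # Items that can reduce: complete, and their recorded lookahead matches.
    reduces = [it for it in state if it[2] == len(it[1]) and it[3] == lookahead]
    if shifts and not reduces:
        return ("shift", None)
    if len(reduces) == 1 and not shifts:
        lhs, rhs, _, _ = reduces[0]
        return ("reduce", f"{lhs} -> {rhs}")
    if not shifts and not reduces:
        return ("error", None)
    return ("conflict", None)       # would mean the grammar is not LR(1)

for look in ["a", "b", "c", END]:
    print(look, action(STATE, look))
```

For this state every lookahead leads to a single action: shift on a and b, reduce B → a on c, reduce A → a at the end of the input.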

Page 23: CSC 3130: Automata theory and formal languages

Example

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

stack   input   action   valid items
ε       abc     S        [S → •A, ε] [S → •Bc, ε] [A → •aA, ε] [A → •a, ε] [B → •a, c] [B → •ab, c]
a       bc      S        [A → a•A, ε] [A → a•, ε] [B → a•, c] [B → a•b, c] [A → •aA, ε] [A → •a, ε] (look ahead!)
ab      c       R        [B → ab•, c]
B       c       S        [S → B•c, ε]
Bc      ε       R        [S → Bc•, ε]
S       ε
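The whole run can be reproduced with a compact parser that builds DFA states on demand. This is a sketch under stated assumptions, not the lecture's implementation: items are tuples `(lhs, rhs, dot, lookahead)`, `$` stands in for the empty lookahead, the grammar has no ε-productions, the start symbol S never appears on a right-hand side, and the stack holds item sets rather than symbols.

```python
GRAMMAR = {"S": ["A", "Bc"], "A": ["aA", "a"], "B": ["a", "ab"]}
END = "$"   # assumption: marker for the empty (end of input) lookahead

def first(sym, seen=()):
    """Terminals that can begin a string derived from sym."""
    if sym not in GRAMMAR:
        return {sym}
    out = set()
    for rhs in GRAMMAR[sym]:
        if rhs[0] not in seen:
            out |= first(rhs[0], seen + (sym,))
    return out

def closure(items):
    """Follow all ε-transitions: add [C -> •γ, y] for y in FIRST(βx)."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, x = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            beta = rhs[dot + 1:]
            for y in (first(beta[0]) if beta else {x}):
                for gamma in GRAMMAR[rhs[dot]]:
                    new = (rhs[dot], gamma, 0, y)
                    if new not in items:
                        items.add(new)
                        work.append(new)
    return frozenset(items)

def goto(state, sym):
    return closure({(l, r, d + 1, x) for (l, r, d, x) in state if d < len(r) and r[d] == sym})

def parse(word):
    """Return the list of actions ('S' or 'R') if word is accepted, else None."""
    states = [closure({("S", rhs, 0, END) for rhs in GRAMMAR["S"]})]
    pos, actions = 0, []
    while True:
        look = word[pos] if pos < len(word) else END
        state = states[-1]
        reduces = [(l, r) for (l, r, d, x) in state if d == len(r) and x == look]
        can_shift = any(d < len(r) and r[d] == look for (l, r, d, x) in state)
        if len(reduces) + can_shift != 1:
            return None                    # error, or a conflict the lookahead cannot resolve
        if can_shift:
            states.append(goto(state, look))
            pos += 1
            actions.append("S")
        else:
            lhs, rhs = reduces[0]
            actions.append("R")
            del states[-len(rhs):]         # pop one state per right-side symbol
            if lhs == "S" and len(states) == 1 and look == END:
                return actions             # reduced to the start symbol: accept
            states.append(goto(states[-1], lhs))

print(parse("abc"))
```

On abc the actions come out as S, S, R, S, R, in the same order as in the table. The second S is the step where the lookahead b rules out both reductions.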

Page 24: CSC 3130: Automata theory and formal languages

LR(k) grammars

• A context-free grammar is LR(1) if all S/R, R/R conflicts can be resolved with one symbol of lookahead

• More generally, LR(k) grammars can resolve all conflicts with k lookahead symbols
– Items have the form [A → α•β, x1...xk]

• LR(1) grammars describe the syntax of most programming languages