Compilation 0368-3133 Lecture 4 Syntax Analysis Noam Rinetzky 1 Zijian Xu, Hong Chen, Song-Chun Zhu and Jiebo Luo, "A hierarchical compositional model for face representation and sketching," IEEE Trans. Pattern Analysis and Machine Intelligence(PAMI)'08. * * * *

Jan 28, 2016


Betty Kelley

Compilation 0368-3133

Lecture 4

Syntax Analysis

Noam Rinetzky

Zijian Xu, Hong Chen, Song-Chun Zhu and Jiebo Luo, "A hierarchical compositional model for face representation and sketching," IEEE Trans. Pattern Analysis and Machine Intelligence(PAMI)'08.

Where are we?

Source text (txt)
→ Process text input → characters
→ Lexical Analysis → tokens
→ Syntax Analysis → AST
→ Sem. Analysis → Annotated AST
→ Intermediate code generation → IR
→ Intermediate code optimization → IR
→ Code generation → Symbolic Instructions (SI)
→ Target code optimization → SI
→ Machine code generation → MI
→ Write executable output
→ Executable code (exe)

Marked on the slide: Lexical Analysis, Syntax Analysis

From scanning to parsing

characters (program text): ((23 + 7) * x)

Lexical Analyzer
– Regular language: Id ⟶ ‘a’ | ... | ‘z’

token stream: LP LP Num OP Num RP OP Id RP

Parser
– Context free grammar: Exp ⟶ ... | Exp + Exp | Id
– Output: Abstract Syntax Tree if the syntax is valid, otherwise a syntax error

Abstract Syntax Tree: Op(*)( Op(+)( Num(23), Num(7) ), Id(x) )

Broad kinds of parsers

• Top-Down parsers
– Construct the parse tree in a top-down manner
– Find the leftmost derivation
• Bottom-Up parsers
– Construct the parse tree in a bottom-up manner
– Find the rightmost derivation in reverse order
• Parsers for arbitrary grammars
– Earley’s method, CYK method
– Usually not used in practice (though this might change)

Context free grammars (CFGs)

G = (V,T,P,S)

• V – non terminals (syntactic variables)
• T – terminals (tokens)
• P – derivation rules
– Each rule of the form V ⟶ (T ∪ V)*
• S – start symbol

Derivations

• Show that a sentence ω is in the language of a grammar G by repeatedly applying production rules
– Sentence: αNβ
– Rule: N ⟶ µ
– Derived sentence: αNβ ⇒ αµβ
• µ1 ⇒* µ2 if µ1 ⇒ … ⇒ µ2

Leftmost Derivation

Input: x := z ; y := x + z

S ⟶ S ; S
S ⟶ id := E
E ⟶ id | E + E | E * E | ( E )

Sentential form                    Rule applied
S
S ; S                              S ⟶ S ; S
id := E ; S                        S ⟶ id := E
id := id ; S                       E ⟶ id
id := id ; id := E                 S ⟶ id := E
id := id ; id := E + E             E ⟶ E + E
id := id ; id := id + E            E ⟶ id
id := id ; id := id + id           E ⟶ id
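The leftmost derivation above can be replayed mechanically. The following is a minimal Python sketch, not part of the lecture: the names `leftmost_step`, `derive` and `RULES` are made up for illustration, and sentential forms are represented as lists of symbol strings.

```python
# Nonterminals of the slide's grammar; every other symbol is a terminal.
NONTERMINALS = {"S", "E"}

def leftmost_step(form, lhs, rhs):
    """Replace the leftmost nonterminal of `form` (which must be `lhs`) by `rhs`."""
    for i, sym in enumerate(form):
        if sym in NONTERMINALS:
            if sym != lhs:
                raise ValueError(f"leftmost nonterminal is {sym}, not {lhs}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")

# The rule sequence from the slide, in order of application.
RULES = [
    ("S", ["S", ";", "S"]),
    ("S", ["id", ":=", "E"]),
    ("E", ["id"]),
    ("S", ["id", ":=", "E"]),
    ("E", ["E", "+", "E"]),
    ("E", ["id"]),
    ("E", ["id"]),
]

def derive(start, rules):
    """Return every sentential form of the derivation, starting with [start]."""
    form = [start]
    steps = [form]
    for lhs, rhs in rules:
        form = leftmost_step(form, lhs, rhs)
        steps.append(form)
    return steps

steps = derive("S", RULES)
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form; the last one contains only terminals, which is what makes the input a sentence of the grammar.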

Rightmost Derivation

Token stream: <id,"x"> ASS <id,"z"> ; <id,"y"> ASS <id,"x"> PLUS <id,"z">

S ⟶ S ; S
S ⟶ id := E | …
E ⟶ id | E + E | E * E | …

Sentential form                    Rule applied
S
S ; S                              S ⟶ S ; S
S ; id := E                        S ⟶ id := E
S ; id := E + E                    E ⟶ E + E
S ; id := E + id                   E ⟶ id
S ; id := id + id                  E ⟶ id
id := E ; id := id + id            S ⟶ id := E
id := id ; id := id + id           E ⟶ id

Parse tree

Input: x := z ; y := x + z

S
⇒ S ; S
⇒ id := E ; S
⇒ id := id ; S
⇒ id := id ; id := E
⇒ id := id ; id := E + E
⇒ id := id ; id := E + id
⇒ id := id ; id := id + id

S
  S
    id
    :=
    E
      id
  ;
  S
    id
    :=
    E
      E
        id
      +
      E
        id

Ambiguity

Input: x := y+z*w

S ⟶ S ; S
S ⟶ id := E | …
E ⟶ id | E + E | E * E | …

Two different parse trees exist for the same input:

Tree 1: S( id := E( E(id) + E( E(id) * E(id) ) ) )
Tree 2: S( id := E( E( E(id) + E(id) ) * E(id) ) )

Top-down parsing

• Begin with the start symbol
• Apply production rules
• Until the desired word is derived
• Can be implemented using recursion

Recursive descent parsing

• Define a function for every nonterminal
• Every function works as follows
– Find an applicable production rule
– A terminal function checks for a match with the next input token
– A nonterminal function calls (recursively) other functions
• If there are several applicable productions …


Top-down parsing

Recursive descent parsing with lookahead

• Define a function for every nonterminal
• Every function works as follows
– Find an applicable production rule
– A terminal function checks for a match with the next input token
– A nonterminal function calls (recursively) other functions
• If there are several applicable productions, decide based on the next unmatched token

Predictive parsing

• Recursive descent
• LL(k) grammars

A predictive (recursive descent) parser

E ⟶ LIT | ( E OP E ) | not E
LIT ⟶ true | false
OP ⟶ and | or | xor

Reminder: variable current holds the current input token
Note: TRUE = token for “true” etc.

match(token t) {
  if (current == t) current = next_token();
  else error();
}

E() {
  if (current ∈ {TRUE, FALSE}) LIT();
  else if (current == LPAREN) { match(LPAREN); E(); OP(); E(); match(RPAREN); }
  else if (current == NOT) { match(NOT); E(); }
  else error();
}

LIT() {
  if (current == TRUE) match(TRUE);
  else if (current == FALSE) match(FALSE);
  else error();
}

OP() {
  if (current == AND) match(AND);
  else if (current == OR) match(OR);
  else if (current == XOR) match(XOR);
  else error();
}
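For readers who want to run the parser, here is a rough Python transcription of the pseudocode. It is only a sketch under some assumptions that are not in the slides: tokens are plain strings separated by whitespace, "$" is appended as an end marker, and the names `Parser`, `ParseError` and `accepts` are invented for this example.

```python
class ParseError(Exception):
    """Raised where the slide's pseudocode calls error()."""

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input (assumption)
        self.pos = 0

    @property
    def current(self):
        # The current (lookahead) token, like the slide's variable `current`.
        return self.tokens[self.pos]

    def match(self, t):
        if self.current != t:
            raise ParseError(f"expected {t!r}, found {self.current!r}")
        self.pos += 1

    def E(self):
        if self.current in ("true", "false"):      # E -> LIT
            self.LIT()
        elif self.current == "(":                  # E -> ( E OP E )
            self.match("(")
            self.E()
            self.OP()
            self.E()
            self.match(")")
        elif self.current == "not":                # E -> not E
            self.match("not")
            self.E()
        else:
            raise ParseError(f"unexpected token {self.current!r}")

    def LIT(self):
        if self.current in ("true", "false"):
            self.match(self.current)
        else:
            raise ParseError(f"expected a literal, found {self.current!r}")

    def OP(self):
        if self.current in ("and", "or", "xor"):
            self.match(self.current)
        else:
            raise ParseError(f"expected an operator, found {self.current!r}")

def accepts(text):
    """True if the whitespace-separated token string is a sentence of E."""
    parser = Parser(text.split())
    try:
        parser.E()
        parser.match("$")      # the whole input must be consumed
        return True
    except ParseError:
        return False

print(accepts("( true and not false )"))
print(accepts("( true and )"))
```

Every branch of `E` is chosen by looking only at `current`, which is what makes the parser predictive.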

What we want: Lookaheads!

E ⟶ LIT | ( E OP E ) | not E
LIT ⟶ true | false
OP ⟶ and | or | xor

Note: TRUE = token for “true” etc.

In E(), every branch is selected by a test on the current token (the lookahead):
– current ∈ {TRUE, FALSE} selects E ⟶ LIT
– current == LPAREN selects E ⟶ ( E OP E )
– current == NOT selects E ⟶ not E

Why we want it: Prediction Table!

• Given
– Non terminal X
– Derivation rules X ⟶ α1 | … | αk
– Terminal t
• T[X,t] = αi if we should apply rule X ⟶ αi when the next token is t

T is the prediction table

How we get it?

• First FIRST
• Then FOLLOW
– (then FIRST again …)

FIRST Sets

• FIRST(µ): the set of first tokens in words in L(µ)

E ⟶ LIT | ( E OP E ) | not E

• FIRST( LIT ) = { TRUE, FALSE }
• FIRST( ( E OP E ) ) = { LPAREN }
• FIRST( not E ) = { NOT }

• For X ⟶ α1 | … | αk:
FIRST(X) = FIRST(α1) ∪ … ∪ FIRST(αk) ∪ { ℇ } if αi ⇒* ℇ for some αi

FIRST Sets

• FIRST(X) = { t | X ⇒* t β } ∪ { ℇ | X ⇒* ℇ }
– All terminals t that can appear first in some derivation from X
– Plus ℇ if it can be derived from X
• If for every α and β such that X ⟶ … α | … | β … we have FIRST(α) ∩ FIRST(β) = {}, then we can always predict (choose) which rule to apply based on the next token

Example: X ⟶ … α | … | β … and input = a…
If a ∈ FIRST(α), use X ⟶ α

Computing FIRST sets

• FIRST(t) = { t } when t is a terminal (viewed as a one-symbol sentential form)
• ℇ ∈ FIRST(X) if
– X ⟶ ℇ, or
– X ⟶ A1…Ak and ℇ ∈ FIRST(Ai) for every i = 1..k

Computing FIRST sets (take I)

• Assume no null productions X ⟶ … | ℇ | …
• Observation (constraints):
If X ⟶ … | tα | … | Nβ then { t } ⊆ FIRST(X) and FIRST(N) ⊆ FIRST(X)
• Compute FIRST by solving the constraint system
– If we know the (minimal) solution to this constraint system, then we have FIRST
– If we know FIRST, then we have the solution to this constraint system

Fixed-point algorithm for computing FIRST sets

• Assume no null productions Xi ⟶ … | ℇ | …
• Say we have m non-terminals X1..Xm

Initialization
  FIRST(Xi) = { t | Xi ⟶ tβ for some β }, i = 1..m

Body
  do
    for every production Xi ⟶ Xk β
      FIRST(Xi) = FIRST(Xi) ∪ FIRST(Xk)
  until FIRST(Xi) does not change for any Xi

FIRST sets constraints example

STMT ⟶ if EXPR then STMT | while EXPR do STMT | EXPR ;
EXPR ⟶ TERM -> id | zero? TERM | not EXPR | ++ id | -- id
TERM ⟶ id | constant

(F = FIRST)

Initialization:
  F(STMT) = {if, while}
  F(EXPR) = {zero?, not, ++, --}
  F(TERM) = {id, constant}

do
  F’(STMT) = F(STMT); F’(EXPR) = F(EXPR); F’(TERM) = F(TERM);
  F(STMT) = F(STMT) ∪ F(EXPR);
  F(EXPR) = F(EXPR) ∪ F(TERM);
until (F’(STMT) == F(STMT) && F’(EXPR) == F(EXPR) && F’(TERM) == F(TERM));

FIRST sets computation example

STMT ⟶ if EXPR then STMT | while EXPR do STMT | EXPR ;
EXPR ⟶ TERM -> id | zero? TERM | not EXPR | ++ id | -- id
TERM ⟶ id | constant

1. Initialization
   F(TERM) = id, constant
   F(EXPR) = zero?, not, ++, --
   F(STMT) = if, while

2. F(STMT) = F(STMT) ∪ F(EXPR)
   F(TERM) = id, constant
   F(EXPR) = zero?, not, ++, --
   F(STMT) = if, while, zero?, not, ++, --

3. F(EXPR) = F(EXPR) ∪ F(TERM)
   F(TERM) = id, constant
   F(EXPR) = zero?, not, ++, --, id, constant
   F(STMT) = if, while, zero?, not, ++, --

4. F(STMT) = F(STMT) ∪ F(EXPR)
   F(TERM) = id, constant
   F(EXPR) = zero?, not, ++, --, id, constant
   F(STMT) = if, while, zero?, not, ++, --, id, constant

5. We reached a fixed-point (no set changes in the next iteration)
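The fixed-point iteration for this example grammar can be sketched in Python as follows. This is an illustrative sketch rather than the course's code: the dictionary encoding of the grammar and the name `first_sets` are assumptions, and, as on the slides, it only handles grammars without null productions.

```python
# The example grammar: nonterminal -> list of alternatives (lists of symbols).
GRAMMAR = {
    "STMT": [["if", "EXPR", "then", "STMT"],
             ["while", "EXPR", "do", "STMT"],
             ["EXPR", ";"]],
    "EXPR": [["TERM", "->", "id"],
             ["zero?", "TERM"],
             ["not", "EXPR"],
             ["++", "id"],
             ["--", "id"]],
    "TERM": [["id"], ["constant"]],
}

def first_sets(grammar):
    """FIRST sets for a grammar WITHOUT null productions."""
    # Initialization: terminals that start some alternative of X.
    first = {x: {alt[0] for alt in alts if alt[0] not in grammar}
             for x, alts in grammar.items()}
    changed = True
    while changed:                      # iterate until nothing changes
        changed = False
        for x, alts in grammar.items():
            for alt in alts:
                head = alt[0]
                # Constraint: X -> N beta  implies  FIRST(N) is a subset of FIRST(X)
                if head in grammar and not first[head] <= first[x]:
                    first[x] |= first[head]
                    changed = True
    return first

FIRST = first_sets(GRAMMAR)
for name in ("TERM", "EXPR", "STMT"):
    print(name, sorted(FIRST[name]))
```

The loop visits the productions in a different order than the slides, so the intermediate sets may differ, but the fixed point it reaches is the same.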

Fixed-point algorithm for computing FIRST sets

• What to do with null productions?

X ⟶ Y a | Z b
Y ⟶ ℇ
Z ⟶ ℇ

• Say input = “a”, which rule to use?
• a ∉ FIRST(Y), a ∉ FIRST(Z)
• Use what comes after Y/Z

Computing FIRST sets (take II)

• Observation (constraint):
If X ⟶ A1 .. Ak N α | … and ℇ ∈ FIRST(A1), …, ℇ ∈ FIRST(Ak)
then FIRST(N) \ { ℇ } ⊆ FIRST(X)

• Use what comes after A1..Ak to predict which production rule of X to use
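A minimal Python sketch of the extended computation, applied to the grammar X ⟶ Y a | Z b, Y ⟶ ℇ, Z ⟶ ℇ from the previous slide. The encoding (an empty list for a null production, the string "ε" for ℇ) and the function names are assumptions made for this example.

```python
EPS = "ε"   # stands for the empty word

# An empty alternative [] encodes a null production.
GRAMMAR = {
    "X": [["Y", "a"], ["Z", "b"]],
    "Y": [[]],
    "Z": [[]],
}

def first_of_sequence(seq, first, grammar):
    """FIRST of a sequence of symbols, given the current FIRST sets."""
    result = set()
    for sym in seq:
        sym_first = first[sym] if sym in grammar else {sym}   # FIRST(t) = {t}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result          # this symbol cannot vanish, stop here
    result.add(EPS)                # all symbols can derive ε (or seq is empty)
    return result

def first_sets(grammar):
    first = {x: set() for x in grammar}
    changed = True
    while changed:
        changed = False
        for x, alts in grammar.items():
            for alt in alts:
                new = first_of_sequence(alt, first, grammar)
                if not new <= first[x]:
                    first[x] |= new
                    changed = True
    return first

FIRST = first_sets(GRAMMAR)
print(sorted(FIRST["X"]))
print(first_of_sequence(["Y", "a"], FIRST, GRAMMAR))
print(first_of_sequence(["Z", "b"], FIRST, GRAMMAR))
```

Because Y and Z can both vanish, the token after them decides: the alternatives Y a and Z b get the disjoint sets {a} and {b}.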

FOLLOW sets (p. 189)

• FOLLOW(N) = the set of tokens that can immediately follow the non-terminal N in some sentential form
– If S ⇒* αNtβ then t ∈ FOLLOW(N)
• FOLLOW(t) = the set … terminal t … form
– If αNtβ ⇒* α’tqβ’ then q ∈ FOLLOW(t)

FOLLOW sets: Constraints

• $ ∈ FOLLOW(S), where $ is the end of input and S is the start symbol
• If X ⟶ α N β then FIRST(β) – { ℇ } ⊆ FOLLOW(N)
• If X ⟶ α N β and ℇ ∈ FIRST(β) then FOLLOW(X) ⊆ FOLLOW(N)

Compute FIRST and FOLLOW by solving the extended constraint system

Example: FOLLOW sets

• E ⟶ T X
• X ⟶ + E | ℇ
• T ⟶ ( E ) | int Y
• Y ⟶ * T | ℇ

Terminal    FOLLOW
+           int, (
(           int, (
*           int, (
)           +, ), $
int         *, ), +, $

Non. Term.  FOLLOW
E           ), $
T           +, ), $
X           $, )
Y           +, ), $
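The non-terminal FOLLOW sets of this example can be reproduced by solving the constraint system with a fixed-point loop. The Python below is a sketch under assumptions (the grammar encoding, the "ε" and "$" markers and all function names are invented for illustration); it computes FIRST first because the FOLLOW constraints refer to it.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],             # [] encodes X -> ε
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], []],
}
START = "E"

def first_of_sequence(seq, first):
    result = set()
    for sym in seq:
        sym_first = first.get(sym, {sym})   # terminal: FIRST(t) = {t}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}

def first_sets(grammar):
    first = {x: set() for x in grammar}
    changed = True
    while changed:
        changed = False
        for x, alts in grammar.items():
            for alt in alts:
                new = first_of_sequence(alt, first)
                if not new <= first[x]:
                    first[x] |= new
                    changed = True
    return first

def follow_sets(grammar, start, first):
    follow = {x: set() for x in grammar}
    follow[start].add(END)                        # $ ∈ FOLLOW(S)
    changed = True
    while changed:
        changed = False
        for x, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue                  # only nonterminals here
                    rest = first_of_sequence(alt[i + 1:], first)
                    new = rest - {EPS}            # FIRST(β) - {ε} ⊆ FOLLOW(N)
                    if EPS in rest:
                        new |= follow[x]          # FOLLOW(X) ⊆ FOLLOW(N)
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = first_sets(GRAMMAR)
FOLLOW = follow_sets(GRAMMAR, START, FIRST)
for name in GRAMMAR:
    print(name, sorted(FOLLOW[name]))
```

This sketch covers only the FOLLOW sets of non-terminals; the slide's FOLLOW sets for terminals would need the same constraints applied at terminal positions as well.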

Prediction Table

• For every production A ⟶ α:
– T[A,t] = α if t ∈ FIRST(α)
– T[A,t] = α if ℇ ∈ FIRST(α) and t ∈ FOLLOW(A)
– t can also be $
• T is not well defined ⇒ the grammar is not LL(1)
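As an illustration, the two table rules can be applied to the grammar E ⟶ T X, X ⟶ + E | ℇ, T ⟶ ( E ) | int Y, Y ⟶ * T | ℇ. In this Python sketch the FIRST and FOLLOW sets of the non-terminals are simply written down as inputs (an assumption of the example, as are all names); a cell that receives two productions is recorded as a conflict.

```python
EPS, END = "ε", "$"

PRODUCTIONS = [
    ("E", ["T", "X"]),
    ("X", ["+", "E"]),
    ("X", []),                  # X -> ε
    ("T", ["(", "E", ")"]),
    ("T", ["int", "Y"]),
    ("Y", ["*", "T"]),
    ("Y", []),                  # Y -> ε
]
# FIRST / FOLLOW of the nonterminals, taken as given for this sketch.
FIRST = {"E": {"(", "int"}, "T": {"(", "int"}, "X": {"+", EPS}, "Y": {"*", EPS}}
FOLLOW = {"E": {")", END}, "X": {")", END},
          "T": {"+", ")", END}, "Y": {"+", ")", END}}

def first_of_sequence(seq):
    result = set()
    for sym in seq:
        sym_first = FIRST.get(sym, {sym})   # terminal: FIRST(t) = {t}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}

def build_table(productions):
    """Return (table, conflicts); table maps (nonterminal, token) to a right-hand side."""
    table, conflicts = {}, []
    for lhs, rhs in productions:
        first = first_of_sequence(rhs)
        lookaheads = first - {EPS}          # rule 1: t ∈ FIRST(α)
        if EPS in first:
            lookaheads |= FOLLOW[lhs]       # rule 2: ε ∈ FIRST(α), t ∈ FOLLOW(A)
        for t in lookaheads:
            if (lhs, t) in table:
                conflicts.append((lhs, t))  # T is not well defined here
            table[(lhs, t)] = rhs
    return table, conflicts

TABLE, CONFLICTS = build_table(PRODUCTIONS)
for (nonterminal, token), rhs in sorted(TABLE.items()):
    print(nonterminal, token, "->", " ".join(rhs) or EPS)
```

An empty conflict list means every cell holds at most one production, so the table is well defined for this grammar.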

LL(k) grammars

• A grammar is in class LL(1) (the case k = 1) iff for every two productions A ⟶ α and A ⟶ β
– FIRST(α) ∩ FIRST(β) = {}
• In particular, α ⇒* ℇ and β ⇒* ℇ together is not possible
– If β ⇒* ℇ then FIRST(α) ∩ FOLLOW(A) = {}


Problem: Non LL(k) grammars

LL(k) grammars

• An LL(k) grammar G can be parsed via:
– Top-down derivation
– Scanning the input from left to right (L)
– Producing the leftmost derivation (L)
– With lookahead of k tokens (k)
• If G is LL(k) then
– G is not ambiguous
– G is not left-recursive
• A language is said to be LL(k) when it has an LL(k) grammar

Non LL grammar: Common prefix

term ⟶ ID | indexed_elem
indexed_elem ⟶ ID [ expr ]

• FIRST(term) = { ID }
• FIRST(indexed_elem) = { ID }
• FIRST/FIRST conflict

Solution: left factoring

• Rewrite the grammar to be in LL(1)

Before:
term ⟶ ID | indexed_elem
indexed_elem ⟶ ID [ expr ]

After:
term ⟶ ID after_ID
after_ID ⟶ [ expr ] | ℇ

Intuition: just like factoring x*y + x*z into x*(y+z)

Left factoring – another example

Before:
S ⟶ if E then S else S | if E then S | T

After:
S ⟶ if E then S S’ | T
S’ ⟶ else S | ℇ

Non LL grammar: Problematic null productions

S ⟶ X a b
X ⟶ a | ℇ

• FIRST(S) = { a }, FOLLOW(S) = { $ }
• FIRST(X) = { a, ℇ }, FOLLOW(X) = { a }
• FIRST/FOLLOW conflict
– T[X,a] = a, since a ∈ FIRST(a)
– T[X,a] = ℇ, since ℇ ∈ FIRST(ℇ) and a ∈ FOLLOW(X)
– (in general the lookahead t can also be $)
• T is not well defined ⇒ the grammar is not LL(1)

Solution: substitution

S ⟶ A a b
A ⟶ a | ℇ

Substitute A in S:
S ⟶ a a b | a b

Left factoring:
S ⟶ a after_A
after_A ⟶ a b | b

Non LL grammar: Left-recursion

E ⟶ E - term | term

• Left recursion cannot be handled with a bounded lookahead
• What can we do?

Solution: Left recursion removal (p. 130)

G1: N ⟶ Nα | β
G2: N ⟶ βN’
    N’ ⟶ αN’ | ℇ

• L(G1) = β, βα, βαα, βααα, …
• L(G2) = same

Example:
E ⟶ E - term | term
becomes
E ⟶ term TE
TE ⟶ - term TE | ℇ

Can be done algorithmically.
Problem: grammar becomes mangled beyond recognition
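The G1 to G2 rewrite is mechanical enough to sketch in a few lines of Python. This is only an illustration of the immediate (direct) case shown on the slide, not a full algorithm for indirect left recursion; the function name and the convention of naming the new non-terminal by appending an apostrophe are assumptions.

```python
def remove_immediate_left_recursion(n, alternatives):
    """Rewrite  N -> N α | β  as  N -> β N',  N' -> α N' | ε.

    `alternatives` is a list of right-hand sides (lists of symbols);
    an empty list stands for ε. Only direct left recursion is handled.
    """
    # α parts: alternatives that start with N itself, with the leading N dropped
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == n]
    # β parts: all the other alternatives
    others = [alt for alt in alternatives if not alt or alt[0] != n]
    if not recursive:
        return {n: alternatives}            # nothing to do
    if not others:
        raise ValueError(f"{n} has only left-recursive alternatives")
    n_prime = n + "'"                       # fresh nonterminal (assumed unused)
    return {
        n: [beta + [n_prime] for beta in others],
        n_prime: [alpha + [n_prime] for alpha in recursive] + [[]],
    }

G2 = remove_immediate_left_recursion("E", [["E", "-", "term"], ["term"]])
for lhs, alts in G2.items():
    print(lhs, "->", " | ".join(" ".join(alt) or "ε" for alt in alts))
```

For the slide's example this yields the same shape as G2, with E' playing the role of TE.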

LL(k) Parsers

• Recursive Descent
– Manual construction
– Uses recursion
• Wanted
– A parser that can be generated automatically
– Does not use recursion

LL(k) parsing via PDA

A pushdown automaton uses
• Prediction stack
• Input stream
• Transition table
– nonterminals x tokens -> production alternative
– Entries indexed by nonterminal N and token t
– An entry contains the alternative of N that must be predicted when the current input starts with t

LL(k) parsing via PDA: Moves

• Prediction: top(prediction stack) = N
– Pop N
– If table[N, current] = α, push α to the prediction stack, otherwise syntax error
• Match: top(prediction stack) = t
– If (t == current) pop the prediction stack, otherwise syntax error

LL(k) parsing via PDA: Termination

• Parsing terminates when the prediction stack is empty
– If the input is empty at that point, success, otherwise syntax error

Example transition table

(1) E → LIT
(2) E → ( E OP E )
(3) E → not E
(4) LIT → true
(5) LIT → false
(6) OP → and
(7) OP → or
(8) OP → xor

Rows: nonterminals. Columns: input tokens. Entries: which rule should be used.

        (    )    not   true   false   and   or   xor   $
E       2         3     1      1
LIT                     4      5
OP                                     6     7    8

Model of non-recursive predictive parser

Figure: an input buffer (a + b $) and a stack (X Y Z $, top first) feed the predictive parsing program, which consults the parsing table and produces the output.

Running parser example

A ⟶ aAb | c
Input: aacbb$

        a            b     c
A       A ⟶ aAb            A ⟶ c

Input suffix    Stack content    Move
aacbb$          A$               predict(A,a) = A ⟶ aAb
aacbb$          aAb$             match(a,a)
acbb$           Ab$              predict(A,a) = A ⟶ aAb
acbb$           aAbb$            match(a,a)
cbb$            Abb$             predict(A,c) = A ⟶ c
cbb$            cbb$             match(c,c)
bb$             bb$              match(b,b)
b$              b$               match(b,b)
$               $                match($,$) – success
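The run above can be reproduced with a small table-driven parser. The Python below is a sketch under assumptions: single-character tokens, the dictionary `TABLE` standing in for the transition table, "$" pushed at the bottom of the stack as on the slide, and a built-in `SyntaxError` raised where the slides say "syntax error".

```python
END = "$"
NONTERMINALS = {"A"}
# Transition table for A -> aAb | c: (nonterminal, token) -> right-hand side.
TABLE = {("A", "a"): ["a", "A", "b"], ("A", "c"): ["c"]}

def parse(tokens, start="A"):
    """Return the list of moves, or raise SyntaxError on illegal input."""
    input_ = list(tokens) + [END]
    stack = [END, start]                 # the top of the stack is the list's end
    pos, moves = 0, []
    while stack:
        top, current = stack.pop(), input_[pos]
        if top in NONTERMINALS:          # prediction move
            rhs = TABLE.get((top, current))
            if rhs is None:
                raise SyntaxError(f"predict({top},{current}) = ERROR")
            moves.append(f"predict({top},{current})")
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == current:             # match move
            moves.append(f"match({top},{current})")
            pos += 1
        else:
            raise SyntaxError(f"expected {top}, found {current}")
    if pos != len(input_):
        raise SyntaxError("input remains after the stack became empty")
    return moves

for move in parse("aacbb"):
    print(move)
```

The parser needs no recursion: the prediction stack plays the role of the call stack of the recursive descent version.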

Errors

Handling Syntax Errors

• Report and locate the error
• Diagnose the error
• Correct the error
• Recover from the error in order to discover more errors
– without reporting too many “strange” errors

Error Diagnosis

• Line number
– may be far from the actual error
• The current token
• The expected tokens
• Parser configuration

Error Recovery

• Becomes less important in interactive environments
• Example heuristics:
– Search for a semicolon and ignore the statement
– Try to “replace” tokens for common errors
– Refrain from reporting 3 subsequent errors
• Globally optimal solutions
– For every input w, find a valid program w’ with a “minimal distance” from w

Illegal input example

A ⟶ aAb | c
Input: abcbb$

        a            b     c
A       A ⟶ aAb            A ⟶ c

Input suffix    Stack content    Move
abcbb$          A$               predict(A,a) = A ⟶ aAb
abcbb$          aAb$             match(a,a)
bcbb$           Ab$              predict(A,b) = ERROR

Error handling in LL parsers

S ⟶ a c | b S
Input: c$

        a            b            c
S       S ⟶ a c      S ⟶ b S

Input suffix    Stack content    Move
c$              S$               predict(S,c) = ERROR

• Now what?
– Predict b S anyway: “missing token b inserted in line XXX”

Error handling in LL parsers

S ⟶ a c | b S
Input: c$, with the “missing” token b inserted

        a            b            c
S       S ⟶ a c      S ⟶ b S

Input suffix    Stack content    Move
bc$             S$               predict(S,b) = S ⟶ b S
bc$             bS$              match(b,b)
c$              S$               Looks familiar?

• Result: infinite loop


Error handling and recovery

• x = a * (p+q * ( -b * (r-s);

• Where should we report the error?

• The valid prefix property

The Valid Prefix Property

• For every prefix of tokens t1, t2, …, ti that the parser identifies as legal:
– there exist tokens ti+1, ti+2, …, tn such that t1, t2, …, tn is a syntactically valid program
• If every token is considered as a single character:
– For every prefix word u that the parser identifies as legal, there exists w such that u.w is a valid program


Recovery is tricky

• Heuristics for dropping tokens, skipping to semicolon, etc.


Building the Parse Tree

Adding semantic actions

• Can add an action to perform on each production rule
• Can build the parse tree
– Every function returns an object of type Node
– Every Node maintains a list of children
– Function calls can add new children

Building the parse tree

Node E() {
  result = new Node();
  result.name = “E”;
  if (current ∈ {TRUE, FALSE}) {            // E ⟶ LIT
    result.addChild(LIT());
  } else if (current == LPAREN) {           // E ⟶ ( E OP E )
    result.addChild(match(LPAREN));
    result.addChild(E());
    result.addChild(OP());
    result.addChild(E());
    result.addChild(match(RPAREN));
  } else if (current == NOT) {              // E ⟶ not E
    result.addChild(match(NOT));
    result.addChild(E());
  } else error();
  return result;
}

Parser for Fully Parenthesized Expressions

static int Parse_Expression(Expression **expr_p) {
    Expression *expr = *expr_p = new_expression();

    /* try to parse a digit */
    if (Token.class == DIGIT) {
        expr->type = 'D'; expr->value = Token.repr - '0';
        get_next_token();
        return 1;
    }

    /* try to parse a parenthesized expression */
    if (Token.class == '(') {
        expr->type = 'P'; get_next_token();
        if (!Parse_Expression(&expr->left)) Error("missing expression");
        if (!Parse_Operator(&expr->oper)) Error("missing operator");
        /* right operand: not on the slide, assumes Expression has a `right` field */
        if (!Parse_Expression(&expr->right)) Error("missing expression");
        if (Token.class != ')') Error("missing )");
        get_next_token();
        return 1;
    }

    return 0;
}


Bottom-up parsing

Intuition: Bottom-Up Parsing

• Begin with the user's program
• Guess parse (sub)trees
• Check if the root is the start symbol

Bottom-up parsing

Unambiguous grammar:
E ⟶ E * T
E ⟶ T
T ⟶ T + F
T ⟶ F
F ⟶ id
F ⟶ num
F ⟶ ( E )

Input: 1 + 2 * 3

Figure: the parse tree for the input is built bottom-up over successive slides, starting with F nodes above the numbers and then T nodes above them.

Top-Down vs Bottom-Up

A ⟶ aAb | c
Input: aacbb$

• Top-down (predict, match/scan, complete)
• Bottom-up (shift, reduce)

Figure: partial parse trees over the input, which is split into an “already read…” part and a “to be read…” part. The top-down tree grows from the root A towards the input; the bottom-up tree grows from the tokens already read towards the root A.

Bottom-up parsing: LR(k) Grammars

• A grammar is in the class LR(k) when it can be parsed via:
– Bottom-up derivation
– Scanning the input from left to right (L)
– Producing the rightmost derivation (R), in reverse order
– With lookahead of k tokens (k)


• A language is said to be LR(k) if it has an LR(k) grammar

• The simplest case is LR(0), which we will discuss

Page 84

Terminology: Reductions & Handles

• The opposite of derivation is called reduction
  – Let A → α be a production rule
  – Derivation: βAµ ⇒ βαµ
  – Reduction: βαµ is reduced to βAµ

• A handle is the reduced substring
  – α is the handle for βαµ

Page 85

Goal: Reduce the Input to the Start Symbol

Grammar:
E → E * B | E + B | B
B → 0 | 1

Go over the input so far, and upon seeing a right-hand side of a rule, "invoke" the rule and replace the right-hand side with the left-hand side (reduce).

Example:
0 + 0 * 1
B + 0 * 1
E + 0 * 1
E + B * 1
E * 1
E * B
E

[Figure: the resulting parse tree, with E → E * B at the root and E → E + B below it]
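The rule of thumb on this slide can be tried out directly. The following is a minimal sketch, not the parser developed in the lecture: it assumes that greedily reducing the longest matching right-hand side is good enough, which happens to hold for this small grammar but not in general. The names `RULES` and `reduce_to_start` are made up for the illustration.

```python
# Grammar from the slide: E -> E * B | E + B | B ; B -> 0 | 1
# Longer right-hand sides come first so that "E + B" is preferred over "B".
RULES = [
    ("E", ["E", "*", "B"]),
    ("E", ["E", "+", "B"]),
    ("E", ["B"]),
    ("B", ["0"]),
    ("B", ["1"]),
]


def reduce_to_start(tokens):
    """Return every sentential form seen while greedily reducing `tokens`."""
    seen, rest = [], list(tokens)   # symbols read so far, symbols still to read
    forms = [" ".join(rest)]
    while True:
        for lhs, rhs in RULES:
            if seen[-len(rhs):] == rhs:          # a right-hand side ends here
                seen[-len(rhs):] = [lhs]         # replace it by the left-hand side
                forms.append(" ".join(seen + rest))
                break
        else:                                    # nothing to reduce
            if not rest:
                return forms
            seen.append(rest.pop(0))             # read one more token


for form in reduce_to_start("0 + 0 * 1".split()):
    print(form)
```

For the input 0 + 0 * 1 the printed forms are the seven lines of the example, ending in the start symbol E.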

Page 86

Use Shift & Reduce

In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.

Page 87

Use Shift & Reduce

In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.

Example: "0+0*1"

Grammar:
E → E * B | E + B | B
B → 0 | 1

Stack   Input     Action
        0+0*1$    shift
0       +0*1$     reduce
B       +0*1$     reduce
E       +0*1$     shift
E+      0*1$      shift
E+0     *1$       reduce
E+B     *1$       reduce
E       *1$       shift
E*      1$        shift
E*1     $         reduce
E*B     $         reduce
E       $         accept

[Figure: the parse tree for 0+0*1 built by these reductions]
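The stack/input/action trace can be reproduced with a few lines of code. This is only a sketch under two assumptions that are not part of the slides: every token is a single character, and "reduce whenever the top of the stack matches a right-hand side, longest first" is a valid strategy (true for this grammar, not for grammars in general). `shift_reduce_trace` is a hypothetical helper name.

```python
# Grammar from the slide, longest right-hand sides first.
RULES = [
    ("E", ["E", "*", "B"]),
    ("E", ["E", "+", "B"]),
    ("E", ["B"]),
    ("B", ["0"]),
    ("B", ["1"]),
]


def shift_reduce_trace(text):
    """Return (stack, input, action) rows for a string of one-character tokens."""
    stack, rest, rows = [], list(text) + ["$"], []
    while True:
        # Look for a rule whose right-hand side sits on top of the stack.
        rule = next(((l, r) for l, r in RULES if stack[-len(r):] == r), None)
        if rule:
            action = "reduce"
        elif rest == ["$"]:
            action = "accept" if stack == ["E"] else "error"
        else:
            action = "shift"
        rows.append(("".join(stack), "".join(rest), action))
        if action == "reduce":
            lhs, rhs = rule
            stack[-len(rhs):] = [lhs]
        elif action == "shift":
            stack.append(rest.pop(0))
        else:
            return rows


for stack, rest, action in shift_reduce_trace("0+0*1"):
    print(f"{stack:<6}{rest:<8}{action}")
```

The rows come out in the same order as in the table on this slide. What the sketch lacks is any principled way to choose between shifting and reducing, which is what the states and tables on the following slides provide.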

Page 88

How does the parser know what to do?

[Figure: a parser with a stack, an action table and a goto table. Its input is the token stream LP LP Num OP Num RP OP Id RP for the text ((23 + 7) * x); its output is a tree with the nodes Op(*), Op(+), Num(23), Num(7) and Id(b).]

Page 89

How does the parser know what to do?

• A state will keep the info gathered on handle(s)
  – A state in the "control" of the PDA
  – Also (part of) the stack alphabet

• A table will tell it "what to do" based on current state and next token
  – The transition function of the PDA

• A stack will record the "nesting level"
  – Prefixes of handles

[Callout: Set of LR(0) items]

Page 90

LR item

N → α ● β

[Figure: the item drawn above the input, with α marked "already matched" and β marked "to be matched"]

Hypothesis about αβ being a possible handle: so far we've matched α, expecting to see β

Page 91

Example: LR(0) Items

• All items can be obtained by placing a dot at every position for every production:

Grammar:
(1) S → E $
(2) E → T
(3) E → E + T
(4) T → id
(5) T → ( E )

LR(0) items:
1: S → ● E $
2: S → E ● $
3: S → E $ ●
4: E → ● T
5: E → T ●
6: E → ● E + T
7: E → E ● + T
8: E → E + ● T
9: E → E + T ●
10: T → ● i
11: T → i ●
12: T → ● ( E )
13: T → ( ● E )
14: T → ( E ● )
15: T → ( E ) ●
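Placing a dot at every position of every production is mechanical, so it can be sketched in a few lines. The representation below (an item as a `(lhs, rhs, dot)` triple) and the names `GRAMMAR`, `lr0_items` and `show` are choices made for this illustration only; the terminal is written `i`, as in the item list.

```python
# The five productions of the example grammar, with "i" standing for id.
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]


def lr0_items(grammar):
    """One (lhs, rhs, dot) triple per dot position 0..len(rhs) of each production."""
    return [(lhs, rhs, dot)
            for lhs, rhs in grammar
            for dot in range(len(rhs) + 1)]


def show(item):
    """Render an item with '.' marking the dot position."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


for number, item in enumerate(lr0_items(GRAMMAR), start=1):
    print(f"{number}: {show(item)}")
```

A production with n symbols on its right-hand side contributes n + 1 items, which gives 3 + 2 + 4 + 2 + 4 = 15 items for this grammar.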

Page 92

LR(0) items

N → α ● β    Shift item

N → αβ ●     Reduce item

Page 93

States and LR(0) Items

Grammar:
E → E * B | E + B | B
B → 0 | 1

• The state will "remember" the potential derivation rules given the part that was already identified

• For example, if we have already identified E then the state will remember the two alternatives: (1) E → E * B, (2) E → E + B

• Actually, we will also remember where we are in each of them: (1) E → E ● * B, (2) E → E ● + B

• A derivation rule with a location marker is called an LR(0) item

• The state is actually a set of LR(0) items. E.g., q13 = { E → E ● * B , E → E ● + B }
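The step from "we have identified E" to the state { E → E ● * B, E → E ● + B } amounts to moving the dot over E in every item that expects an E. A small sketch, assuming items are `(lhs, rhs, dot)` triples (a representation picked for this example) and using the hypothetical helper `advance`:

```python
def advance(items, symbol):
    """Move the dot over `symbol` in every item that expects it next."""
    return {(lhs, rhs, dot + 1)
            for lhs, rhs, dot in items
            if dot < len(rhs) and rhs[dot] == symbol}


# All items of E -> E * B | E + B | B and B -> 0 | 1 with the dot at the start.
start_items = {
    ("E", ("E", "*", "B"), 0),
    ("E", ("E", "+", "B"), 0),
    ("E", ("B",), 0),
    ("B", ("0",), 0),
    ("B", ("1",), 0),
}

# The state reached after identifying E (called q13 on the slide).
after_E = advance(start_items, "E")
print(sorted(after_E))
```

Items that do not have E right after the dot, such as E → ● B, simply drop out of the new state.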

Page 94

Intuition

• Gather input token by token until we find a right-hand side of a rule and then replace it with the non-terminal on the left-hand side
  – Going over a token and remembering it in the stack is a shift
    • Each shift moves to a state that remembers what we've seen so far
  – A reduce replaces a string in the stack with the non-terminal that derives it

Page 95

Model of an LR parser

[Figure: an LR parser driven by an action table and a goto table. It reads the input id + id + id $, produces an output, and keeps a stack of alternating states and symbols (terminals and non-terminals), here 0 T 2 + 7 id 5.]

Page 96

LR parser stack

• Sequence made of state, symbol pairs
• For instance, a possible stack for the grammar

S → E $
E → T
E → E + T
T → id
T → ( E )

could be: 0 T 2 + 7 id 5

[Arrow along the stack: "Stack grows this way"]

Page 97

Form of LR parsing table

[Layout: one row per state (0, 1, ...). The columns for terminals hold the shift/reduce actions; the columns for non-terminals hold the goto part.]

Entries:
sn – shift state n
rk – reduce by rule k
gm – goto state m
acc – accept
(empty) – error

Page 98

LR parser table example

(The action part covers the terminal columns id, +, (, ) and $; the goto part covers the non-terminal columns E and T. A "-" marks an empty entry.)

STATE   id    +     (     )     $     E     T
0       s5    -     s7    -     -     g1    g6
1       -     s3    -     -     acc   -     -
2       -     -     -     -     -     -     -
3       s5    -     s7    -     -     -     g4
4       r3    r3    r3    r3    r3    -     -
5       r4    r4    r4    r4    r4    -     -
6       r2    r2    r2    r2    r2    -     -
7       s5    -     s7    -     -     g8    g6
8       -     s3    -     s9    -     -     -
9       r5    r5    r5    r5    r5    -     -
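To see how such a table drives a parser, here is a minimal sketch of a table-driven loop. It assumes the column reading used for this table (id, +, (, ), $ for actions and E, T for gotos), rule numbers (2) E → T, (3) E → E + T, (4) T → id, (5) T → ( E ), and an input that always ends in `$`. As a simplification, the stack holds states only rather than state/symbol pairs. The names `ACTION`, `GOTO`, `RULES` and `parse` are illustrative.

```python
TERMINALS = ["id", "+", "(", ")", "$"]

# rule number -> (left-hand side, length of right-hand side)
RULES = {2: ("E", 1), 3: ("E", 3), 4: ("T", 1), 5: ("T", 3)}

ACTION = {
    0: {"id": "s5", "(": "s7"},
    1: {"+": "s3", "$": "acc"},
    2: {},
    3: {"id": "s5", "(": "s7"},
    4: dict.fromkeys(TERMINALS, "r3"),
    5: dict.fromkeys(TERMINALS, "r4"),
    6: dict.fromkeys(TERMINALS, "r2"),
    7: {"id": "s5", "(": "s7"},
    8: {"+": "s3", ")": "s9"},
    9: dict.fromkeys(TERMINALS, "r5"),
}
GOTO = {0: {"E": 1, "T": 6}, 3: {"T": 4}, 7: {"E": 8, "T": 6}}


def parse(tokens):
    """Return the rule numbers used, in order; `tokens` must end with "$"."""
    stack, pos, used = [0], 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:                          # empty entry: error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            return used
        n = int(act[1:])
        if act[0] == "s":                        # sn: shift and go to state n
            stack.append(n)
            pos += 1
        else:                                    # rk: reduce by rule k
            lhs, size = RULES[n]
            del stack[-size:]                    # one state per right-hand-side symbol
            stack.append(GOTO[stack[-1]][lhs])   # gm: goto entry of the exposed state
            used.append(n)


print(parse(["id", "+", "id", "$"]))
```

The returned list is the sequence of reductions, which read backwards is a rightmost derivation of the input.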

Page 99

Shift move

• If action[q, a] = sn

[Figure: the LR parsing program with its action and goto tables; state q is on top of the stack and a is the next token of the input … a … $]

Page 100

Result of shift

• If action[q, a] = sn

[Figure: the same parser after the move; a and then the state n have been pushed on top of q]

Page 101

Reduce move

• If action[qn, a] = rk
• Production: (k) A → β
• If β = σ1 … σn, the top of the stack looks like q σ1 q1 … σn qn (qn on top)
• goto[q, A] = qm

[Figure: the LR parsing program with its action and goto tables; the top 2*|β| stack entries, above the state q, are marked for removal]

Page 102

Result of reduce move

• If action[qn, a] = rk
• Production: (k) A → β
• If β = σ1 … σn, the top of the stack looks like q σ1 q1 … σn qn (qn on top)
• goto[q, A] = qm

[Figure: the same parser after the move; the 2*|β| entries have been popped, and A followed by qm has been pushed on top of q]
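The shift and reduce moves can be written as two small stack operations. This sketch assumes the stack is a Python list that alternates states and symbols and starts with a state; `shift_move`, `reduce_move` and the one-entry goto table in the example are made up for the illustration.

```python
def shift_move(stack, token, n):
    """action[q, a] = sn: push the token a, then the new state n."""
    stack += [token, n]


def reduce_move(stack, lhs, rhs_len, goto):
    """action[qn, a] = rk for rule (k) A -> beta.

    Pops 2*|beta| entries (one symbol and one state per symbol of beta),
    then pushes A and the state goto[q][A] for the exposed state q.
    """
    if rhs_len:                      # guard: stack[-0:] would be the whole list
        del stack[-2 * rhs_len:]
    q = stack[-1]
    stack += [lhs, goto[q][lhs]]


# Example: shift id in state 0, then reduce by T -> id with goto[0, T] = 6.
stack = [0]
shift_move(stack, "id", 5)
print(stack)
reduce_move(stack, "T", 1, {0: {"T": 6}})
print(stack)
```

The input token a is not consumed by a reduce move; only a shift advances the input.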

Page 103

Accept move

If action[q, a] = accept, parsing is completed

[Figure: the LR parsing program with its action and goto tables; state q is on top of the stack and a is the next input token]

Page 104

Error move

If action[q, a] = error (usually empty), parsing discovered a syntactic error

[Figure: the LR parsing program with its action and goto tables; state q is on top of the stack and a is the next input token]

Page 105

Example

Z → E $
E → T | E + T
T → i | ( E )

Page 106

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Input: i + i $

Items: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

Why do we need these additional LR items?
Where do they come from?
What do they mean?

Page 107

ε-closure

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

• Given a set S of LR(0) items
• If P → α ● N β is in S
• then for each rule N → γ in the grammar, S must also contain N → ● γ

ε-closure({ Z → ● E $ }) = { Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E ) }
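The closure rule is a fixed-point computation: keep adding items until nothing new appears. Below is a minimal worklist sketch, assuming items are `(lhs, rhs, dot)` triples and the grammar is a dictionary from non-terminal to its right-hand sides; both representations and the names `GRAMMAR` and `closure` are choices made for this example.

```python
GRAMMAR = {
    "Z": [("E", "$")],
    "E": [("T",), ("E", "+", "T")],
    "T": [("i",), ("(", "E", ")")],
}


def closure(items):
    """Add N -> . gamma for every item with the dot right before non-terminal N."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:   # dot is before a non-terminal
            for production in GRAMMAR[rhs[dot]]:
                new_item = (rhs[dot], production, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return result


for item in sorted(closure({("Z", ("E", "$"), 0)})):
    print(item)
```

Starting from Z → ● E $, the dot before E pulls in both E rules, and the dot before T in E → ● T pulls in both T rules, which gives the five items listed on the slide.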

Page 108

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Input: i + i $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

• Items denote possible future handles
• Remember position from which we're trying to reduce

Page 109

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Input: i + i $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

Match items with current token

After i: T → i ● (Reduce item!)

Page 110

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: T (over i); remaining input: + i $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After T: E → T ● (Reduce item!)

Page 111

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: E (over T over i); remaining input: + i $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

The reduce item E → T ● has been used to reduce T to E

Page 112

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: E (over T over i); remaining input: + i $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After E: E → E ● + T, Z → E ● $

Page 113

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: E (over T over i); remaining input: + i $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After E: E → E ● + T, Z → E ● $

After E +: E → E + ● T, T → ● i, T → ● ( E )

Page 114

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: E + T (E over T over i, and T over the second i); remaining input: $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After E: E → E ● + T, Z → E ● $

After E +: E → E + ● T, T → ● i, T → ● ( E )

Page 115

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: E + T; remaining input: $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After E: E → E ● + T, Z → E ● $

After E +: E → E + ● T, T → ● i, T → ● ( E )

After E + T: E → E + T ● (Reduce item!)

Page 116

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: E (over E + T); remaining input: $

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After E: Z → E ● $, E → E ● + T

Page 117

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: E $ (E over E + T)

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After E: Z → E ● $, E → E ● + T

After E $: Z → E $ ● (Reduce item!)

Page 118

Example: parsing with LR items

Grammar: Z → E $ ; E → T | E + T ; T → i | ( E )

Parsed so far: Z (over E $, with E over E + T)

Initial item set: Z → ● E $, E → ● T, E → ● E + T, T → ● i, T → ● ( E )

After E: Z → E ● $, E → E ● + T

After E $: Z → E $ ● (Reduce item!), which has been used to reduce E $ to Z
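The item sets that appear during this example are the states of an automaton: each one is obtained from a previous one by moving the dot over a symbol and closing the result. A compact sketch of that construction follows, with items as `(lhs, rhs, dot)` triples; the function names `closure`, `goto` and `build_states` are chosen for this illustration and are not taken from the slides.

```python
GRAMMAR = {
    "Z": [("E", "$")],
    "E": [("T",), ("E", "+", "T")],
    "T": [("i",), ("(", "E", ")")],
}


def closure(items):
    """Repeatedly add N -> . gamma for items with the dot before non-terminal N."""
    items = set(items)
    while True:
        new = {(rhs[dot], production, 0)
               for _, rhs, dot in items
               if dot < len(rhs) and rhs[dot] in GRAMMAR
               for production in GRAMMAR[rhs[dot]]}
        if new <= items:
            return frozenset(items)
        items |= new


def goto(state, symbol):
    """Move the dot over `symbol`, then close the resulting item set."""
    return closure({(lhs, rhs, dot + 1)
                    for lhs, rhs, dot in state
                    if dot < len(rhs) and rhs[dot] == symbol})


def build_states():
    """Collect every item set reachable from the initial one."""
    start = closure({("Z", ("E", "$"), 0)})
    states, todo = {start}, [start]
    while todo:
        state = todo.pop()
        for symbol in {rhs[dot] for _, rhs, dot in state if dot < len(rhs)}:
            target = goto(state, symbol)
            if target not in states:
                states.add(target)
                todo.append(target)
    return states


print(len(build_states()))
```

For this grammar the construction reaches ten distinct item sets, and `goto(goto(start, "E"), "+")` is the set { E → E + ● T, T → ● i, T → ● ( E ) } seen in the walkthrough.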

Page 119

GOTO/ACTION tables

(The columns i, +, (, ), $, E and T form the GOTO table; the last column is the ACTION table. A "-" marks an empty entry; empty – error move.)

State   i     +     (     )     $     E     T     action
q0      q5    -     q7    -     -     q1    q6    shift
q1      -     q3    -     -     q2    -     -     shift
q2      -     -     -     -     -     -     -     Z → E $
q3      q5    -     q7    -     -     -     q4    shift
q4      -     -     -     -     -     -     -     E → E + T
q5      -     -     -     -     -     -     -     T → i
q6      -     -     -     -     -     -     -     E → T
q7      q5    -     q7    -     -     q8    q6    shift
q8      -     q3    -     q9    -     -     -     shift
q9      -     -     -     -     -     -     -     T → ( E )

Page 120

LR(0) parser tables

• Two types of rows:
  – Shift row – tells which state to GOTO for current token
  – Reduce row – tells which rule to reduce (independent of current token)
    • GOTO entries are blank

Page 121

LR parser data structures

• Input – remainder of text to be processed
• Stack – sequence of pairs N, qi
  – N – symbol (terminal or non-terminal)
  – qi – state at which decisions are made
• Initial stack contains q0

Example:
input: + i $
stack: q0 i q5

Page 122

LR(0) pushdown automaton

• Two moves: shift and reduce
• Shift move
  – Remove first token from input
  – Push it on the stack
  – Compute next state based on GOTO table
  – Push new state on the stack
  – If new state is error – report error

Before the shift:
input: i + i $
stack: q0

After the shift:
input: + i $
stack: q0 i q5

Table row used:
State   i     +     (     )     $     E     T     action
q0      q5    -     q7    -     -     q1    q6    shift

[Arrow along the stack: "Stack grows this way"]

Page 123

LR(0) pushdown automaton

• Reduce move
  – Using a rule N → α
  – Symbols in α and their following states are removed from stack
  – New state computed based on GOTO table (using top of stack, before pushing N)
  – N is pushed on the stack
  – New state pushed on top of N

Before the reduce:
input: + i $
stack: q0 i q5

Reduce T → i

After the reduce:
input: + i $
stack: q0 T q6

Table row used:
State   i     +     (     )     $     E     T     action
q0      q5    -     q7    -     -     q1    q6    shift

[Arrow along the stack: "Stack grows this way"]
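The two moves can be put together into a small LR(0) pushdown automaton that runs on the GOTO/ACTION tables of page 119. This is a sketch under stated assumptions: the stack is a list of (symbol, state) pairs, reduce rows are keyed by state only (no lookahead), and reducing by Z → E $ is treated as acceptance, a convention chosen here rather than one stated on these slides. `GOTO`, `REDUCE` and `parse` are illustrative names.

```python
# Shift rows of the GOTO table: state -> {symbol: next state}.
GOTO = {
    "q0": {"i": "q5", "(": "q7", "E": "q1", "T": "q6"},
    "q1": {"+": "q3", "$": "q2"},
    "q3": {"i": "q5", "(": "q7", "T": "q4"},
    "q7": {"i": "q5", "(": "q7", "E": "q8", "T": "q6"},
    "q8": {"+": "q3", ")": "q9"},
}
# Reduce rows of the ACTION table: state -> (left-hand side, right-hand side).
REDUCE = {
    "q2": ("Z", ("E", "$")),
    "q4": ("E", ("E", "+", "T")),
    "q5": ("T", ("i",)),
    "q6": ("E", ("T",)),
    "q9": ("T", ("(", "E", ")")),
}


def parse(tokens):
    """Return the list of moves made on `tokens` (which should end with "$")."""
    stack, rest, moves = [(None, "q0")], list(tokens), []
    while True:
        state = stack[-1][1]
        if state in REDUCE:                      # reduce row: ignore the input
            lhs, rhs = REDUCE[state]
            del stack[-len(rhs):]                # drop the symbols of the rule
            if lhs == "Z":                       # assumption: this means accept
                moves.append("accept")
                return moves
            stack.append((lhs, GOTO[stack[-1][1]][lhs]))
            moves.append(f"reduce {lhs} -> {' '.join(rhs)}")
        else:                                    # shift row
            token = rest.pop(0) if rest else None
            target = GOTO[state].get(token)
            if target is None:                   # empty entry: error move
                raise SyntaxError(f"unexpected {token!r} in state {state}")
            stack.append((token, target))
            moves.append(f"shift {token}")


for move in parse(["i", "+", "i", "$"]):
    print(move)
```

The first two moves on i + i $ are the ones shown on pages 122 and 123: shifting i turns the stack q0 into q0 i q5, and reducing by T → i turns it into q0 T q6.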

Page 124