03 1 Parsing Intro - UAM: arantxa.ii.uam.es/~modonnel/Compilers/03_1_Parsing_Intro.pdf

May 18, 2018


Compilers

Mick O’Donnell: [email protected]

Topic 3: Syntactic analysis (LR)

3.1 Introduction to parsing


• Main function of the parser:
  • Produce a parse tree, which the Code Generator then uses to produce target code.
• Secondary function of the parser:
  • Syntactic error detection: report to the user where any errors in the source code are.
• The parser needs to be designed to serve both of these functions.
• The design of the parser could be simpler if only compilation were needed:
  • If debugging were not an issue, the parser could stop at the first instance of malformed input.
  • However, to optimise the Code/Compile/Debug cycle, the compiler should not stop at the first detected syntax error, but rather produce a listing of all errors.

Topics in Parsing

1. Notion of grammar

• Terminals

• Nonterminals

• Start Symbol

2. Grammar Rules

• A -> B C

• A -> B | C

3. Applying a grammar (building a parse tree)

• Grammar: E -> E + E | E * E | -E | (E) | id

• Example: a + b

• Example: a * (b + c)

• A good parse tree has the start symbol at the top
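
The notions listed above (terminals, nonterminals, start symbol, and rules with alternatives) can be sketched as plain data. This is a minimal illustration in Python and is not part of the original slides: the function name `expand_rules`, the space-separated rule syntax, and the convention that the first rule defines the start symbol are all assumptions made for the example.

```python
def expand_rules(text):
    """Split 'A -> B | C' style rules into one (lhs, rhs) production per alternative."""
    productions = []
    for line in text.strip().splitlines():
        lhs, alternatives = line.split("->", 1)
        for alternative in alternatives.split("|"):
            productions.append((lhs.strip(), alternative.split()))
    return productions


# The expression grammar from the slides, with symbols separated by spaces.
RULES = "E -> E + E | E * E | - E | ( E ) | id"

productions = expand_rules(RULES)
nonterminals = {lhs for lhs, _ in productions}
# Any symbol that never appears on a left-hand side is treated as a terminal.
terminals = {sym for _, rhs in productions for sym in rhs} - nonterminals
start_symbol = productions[0][0]  # assumption: the first rule defines the start symbol

for lhs, rhs in productions:
    print(lhs, "->", " ".join(rhs))
```

Writing `A -> B | C` is only shorthand for the two rules `A -> B` and `A -> C`, which is why the single rule line expands into five separate productions here.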

3. Applying a grammar (cont.)

• Grammar:
  • E -> E + E
  • E -> E * E
  • E -> -E
  • E -> (E)
  • E -> id

• Example 1: a + b

[Figure sequence: the parse tree for a + b is built step by step. Each identifier is first covered by its own E node (E -> id), one at a time; the two E nodes and the + are then joined under a single E node (E -> E + E).]

• All input tokens incorporated
• Top symbol is the start symbol
• Thus, a good parse
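
The two conditions for a good parse (all input tokens incorporated, start symbol at the top) can be checked mechanically. The sketch below is an illustration only: the tree encoding as `(label, children)` pairs, the helper names, and the tokenisation of `a + b` as `id + id` are assumptions, not something the slides prescribe. It also checks that every internal node corresponds to a rule of the grammar.

```python
# The slide grammar: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["-", "E"], ["(", "E", ")"], ["id"]],
}


def leaves(tree):
    """Return the leaf labels of a (label, children) tree, left to right."""
    label, children = tree
    if not children:
        return [label]
    return [leaf for child in children for leaf in leaves(child)]


def follows_rules(tree):
    """True if every internal node matches some rule of GRAMMAR."""
    label, children = tree
    if not children:
        return label not in GRAMMAR  # a leaf must be a terminal
    child_labels = [child[0] for child in children]
    return child_labels in GRAMMAR.get(label, []) and all(
        follows_rules(child) for child in children
    )


def is_good_parse(tree, tokens, start="E"):
    """Start symbol on top, all input tokens incorporated, only grammar rules used."""
    return tree[0] == start and leaves(tree) == tokens and follows_rules(tree)


ID = ("id", [])
tree = ("E", [("E", [ID]), ("+", []), ("E", [ID])])  # the tree for a + b
print(is_good_parse(tree, ["id", "+", "id"]))
```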

4. Order of application is important

• Parse example: a + b * c
  • Left-right application
  • Right to left application

Grammar:
E -> E + E
E -> E * E
E -> -E
E -> (E)
E -> id

[Figure sequence, left-right application: a is covered by an E, then b; a + b is joined under an E; c is covered by an E; finally the E for a + b, the *, and the E for c are joined under the top E. The resulting tree groups the input as (a + b) * c.]

[Figure sequence, right to left application: c is covered by an E, then b; b * c is joined under an E; a is covered by an E; finally the E for a, the +, and the E for b * c are joined under the top E. The resulting tree groups the input as a + (b * c).]

• L-R and R-L application give different trees
• Semantics may differ
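
To see why the semantics may differ, the two trees for a + b * c can be evaluated directly. This is a small sketch under stated assumptions: trees are encoded as nested tuples `(operator, left, right)`, and the variable values are arbitrary numbers chosen for the example.

```python
# Trees as nested tuples: (operator, left, right); leaves are variable names.
LEFT_TO_RIGHT = ("*", ("+", "a", "b"), "c")  # groups the input as (a + b) * c
RIGHT_TO_LEFT = ("+", "a", ("*", "b", "c"))  # groups the input as a + (b * c)


def evaluate(tree, env):
    """Evaluate a tree bottom-up, looking variable values up in env."""
    if isinstance(tree, str):
        return env[tree]
    op, left, right = tree
    lhs, rhs = evaluate(left, env), evaluate(right, env)
    return lhs + rhs if op == "+" else lhs * rhs


env = {"a": 1, "b": 2, "c": 3}  # example values, chosen arbitrarily
print(evaluate(LEFT_TO_RIGHT, env))  # 9
print(evaluate(RIGHT_TO_LEFT, env))  # 7
```

Both trees are legal under the grammar, yet they compute different results, so the parser must be made to choose one of them consistently.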

5. Rewrite rules and Derivations (IMPORTANT)

• Concept of derivation: a sequence of rewrites from the start symbol to the surface structure
• A parse tree is a graphical representation of a derivation sequence

6. Handling Ambiguous Parses (e.g., a + b * c)

• Two approaches:
  • Add disambiguating rules that throw away undesirable parse trees, leaving just one
  • Rewrite the grammar to be unambiguous

6. Handling Ambiguous Parses (cont.)

• The dangling else problem:

stmt -> if expr then stmt
stmt -> if expr then stmt else stmt

But consider the code:

if x==1 then if y==2 then print 1 else print 2

The else can be attached to either if:

if x==1 then (if y==2 then print 1 else print 2)

OR

if x==1 then (if y==2 then print 1) else print 2

6. Handling Ambiguous Parses (cont.)

• Rewriting the grammar to remove ambiguity:

Original (ambiguous) rules:

stmt -> if expr then stmt
stmt -> if expr then stmt else stmt

Rewritten rules:

stmt -> matched_stmt | unmatched_stmt
matched_stmt -> if expr then matched_stmt else matched_stmt
              | other_stmt
unmatched_stmt -> if expr then stmt
                | if expr then matched_stmt else unmatched_stmt
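
The effect of the rewrite can be checked by counting parse trees. The sketch below is illustrative only and rests on several assumptions: every expr is abbreviated to the single token `e`, every other_stmt to the single token `s`, and in the rewritten grammar other_stmt is reached only through matched_stmt (so `stmt -> matched_stmt | unmatched_stmt`). The function names are invented for the example.

```python
from functools import lru_cache

IF_PREFIX = ("if", "e", "then")


def count_ambiguous(tokens):
    """Parse trees under: stmt -> if e then stmt | if e then stmt else stmt | s."""
    toks = tuple(tokens)

    @lru_cache(maxsize=None)
    def stmt(i, j):
        n = 1 if (j - i == 1 and toks[i] == "s") else 0
        if j - i >= 4 and toks[i:i + 3] == IF_PREFIX:
            n += stmt(i + 3, j)  # if e then stmt
            for k in range(i + 4, j - 1):  # try every 'else' as the split point
                if toks[k] == "else":
                    n += stmt(i + 3, k) * stmt(k + 1, j)
        return n

    return stmt(0, len(toks))


def count_rewritten(tokens):
    """Parse trees under the matched/unmatched grammar."""
    toks = tuple(tokens)

    @lru_cache(maxsize=None)
    def matched(i, j):
        n = 1 if (j - i == 1 and toks[i] == "s") else 0
        if j - i >= 4 and toks[i:i + 3] == IF_PREFIX:
            for k in range(i + 4, j - 1):
                if toks[k] == "else":
                    n += matched(i + 3, k) * matched(k + 1, j)
        return n

    @lru_cache(maxsize=None)
    def unmatched(i, j):
        n = 0
        if j - i >= 4 and toks[i:i + 3] == IF_PREFIX:
            n += stmt(i + 3, j)  # if e then stmt
            for k in range(i + 4, j - 1):
                if toks[k] == "else":
                    n += matched(i + 3, k) * unmatched(k + 1, j)
        return n

    def stmt(i, j):
        return matched(i, j) + unmatched(i, j)

    return stmt(0, len(toks))


dangling = "if e then if e then s else s".split()
print(count_ambiguous(dangling))  # 2
print(count_rewritten(dangling))  # 1
```

Under the rewritten rules the statement between a then and its else must be a matched_stmt, so the else can only attach to the nearest if.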

7. Top-down vs. Bottom-up analysis

• So far, we have been building trees by bottom-up application of rules
  • Bottom-up parsing is a parsing method that works by identifying terminal symbols first, and combining them successively to produce nonterminals
  • We build structure from the bottom up.
• Other approaches build structure from the top down:
  • We start with the START symbol
  • We apply expansions of nonterminal symbols

7. Top-down vs. Bottom-up analysis (cont.)

• Top-down analysis starts with the START symbol and expands it

Input: a + b * c

[Figure sequence: the tree is grown downwards from a single E node, one rule application per step.]

1. Apply: E -> E + E (expands the top E)
2. Apply: E -> id (the left E covers a)
3. Apply: E -> E * E (expands the right E)
4. Apply: E -> id (covers b)
5. Apply: E -> id (covers c)

10. Types of parsers

• Top Down:
  • LL parsers
• Bottom Up:
  • LR parsers
    • LR(0)
    • SLR(1) (Simple LR)
    • LR(1) (Canonical LR)
    • LALR (LookAhead LR)
    • LR(k)
  • Operator Precedence Parsers

Derivation Sequences

Derivations

• A ‘derivation’ displays the sequence of substitutions from the START symbol to the input.
• A ‘leftmost derivation’ is one in which, working from top to bottom, the leftmost nonterminal is always the one expanded.

E

E * E

( E ) * E

( E + E ) * E

( id + E ) * E

( id + id ) * E

( id + id ) * id

Rightmost Derivations

• A ‘rightmost derivation’ is one in which, working from top to bottom, the rightmost nonterminal is always the one expanded.

E

E * E

E * id

( E ) * id

( E + E ) * id

( E + id ) * id

( id + id ) * id
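
Both derivation orders can be reproduced by a few lines of code. The following is a sketch rather than a parser: the sequence of rules to apply is supplied by hand, and the function only decides which nonterminal each rule is applied to. The name `derive` and the list-of-pairs rule encoding are assumptions made for the example.

```python
NONTERMINALS = {"E"}


def derive(start, rules, leftmost=True):
    """Apply each (lhs, rhs) rule to the leftmost (or rightmost) nonterminal."""
    form = [start]
    steps = [" ".join(form)]
    for lhs, rhs in rules:
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        if not positions:
            raise ValueError("no nonterminal left to expand")
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != lhs:
            raise ValueError(f"expected {lhs} at position {pos}, found {form[pos]}")
        form[pos:pos + 1] = rhs  # rewrite the chosen nonterminal in place
        steps.append(" ".join(form))
    return steps


MUL = ("E", ["E", "*", "E"])
ADD = ("E", ["E", "+", "E"])
PAREN = ("E", ["(", "E", ")"])
IDENT = ("E", ["id"])

leftmost_steps = derive("E", [MUL, PAREN, ADD, IDENT, IDENT, IDENT], leftmost=True)
rightmost_steps = derive("E", [MUL, IDENT, PAREN, ADD, IDENT, IDENT], leftmost=False)

print("\n".join(leftmost_steps))
print()
print("\n".join(rightmost_steps))
```

Both runs end in the same sentence, ( id + id ) * id, and describe the same parse tree; only the order in which the nonterminals are expanded differs.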

LR analysis

LR Parsers and Derivations

• The main family of Bottom-Up parsers
• An LR parser is one which:
  • Processes the input from Left to right
  • Produces a Rightmost derivation
• Now, this last point seems to mean that it attempts to apply rules on the right side of the input first.
• But in fact, this is not so: rules are applied to the left side of the structure first.
• But when we look at the derivation tree produced, it represents a rightmost derivation when viewed top-down.

Derivation Sequences

• NOTE: Leftmost application of rules in bottom-up (BU) processing produces a rightmost derivation

E

E * E

E * id

( E ) * id

( E + E ) * id

( E + id ) * id

( id + id ) * id
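
This note can be checked with a short experiment: run a bottom-up pass that always reduces as far left as possible, record the sentential form after each reduction, and read the list backwards. The code is a hedged sketch, not the method of any particular LR parser: it uses a simple stack with a greedy "reduce whenever possible" strategy, and the function name `reduction_forms` is invented for the example.

```python
GRAMMAR = [
    ("E", ["E", "+", "E"]),
    ("E", ["E", "*", "E"]),
    ("E", ["-", "E"]),
    ("E", ["(", "E", ")"]),
    ("E", ["id"]),
]


def reduction_forms(tokens):
    """Sentential forms (stack + unread input) seen during a greedy bottom-up pass."""
    stack, pos = [], 0
    forms = [" ".join(tokens)]
    while True:
        for lhs, rhs in GRAMMAR:
            if stack[-len(rhs):] == rhs:
                stack[-len(rhs):] = [lhs]  # replace the matched RHS by its LHS
                forms.append(" ".join(stack + tokens[pos:]))
                break
        else:
            if pos == len(tokens):
                return forms  # nothing left to read or reduce
            stack.append(tokens[pos])  # read the next input token
            pos += 1


forms = reduction_forms("( id + id ) * id".split())
for form in reversed(forms):
    print(form)
```

Read in reverse, the recorded forms are the rightmost derivation shown on the slide: at every step it is the rightmost nonterminal that has been expanded.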

Topic 3: Syntactic analysis

3.2 Shift-Reduce Parsers

Shift-Reduce Parsing

• A bottom-up parsing technique

• Used in LR parsers

• Two basic concepts:

• Shift

• Reduce

Reducing

• The main concept here is that we have a stack (‘pila’ in Spanish) which holds the tokens we have recognised so far.
• Where the tokens on the top of the stack match the RHS of a rule, we can replace those tokens with the LHS of the rule (this operation is called ‘reduce’).

Stack: ( E + E   -- reduce by E -> E + E -->   Stack: ( E

Shifting

• When we cannot reduce the stack, we shift in another input token:

Input: ( id + id ) * id

Stack: ( E   -- shift -->   Stack: ( E )

Shift-Reduce Cycle

• We start with an empty stack
• We shift in the first token
• We then start a cycle:
  1. If the top of the stack matches a RHS:
     • Reduce
  2. Else:
     • Shift
  3. Goto (1)
• We terminate when the input is exhausted and no further reduction is possible.
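
The cycle described above can be sketched in a few lines of Python. This is a deliberately naive illustration, not a real LR parser: it reduces whenever any rule matches the top of the stack and uses no lookahead or precedence, so on an input such as id + id * id it would group the addition first. The function name `shift_reduce` and the trace format are assumptions made for the example.

```python
GRAMMAR = [
    ("E", ["E", "+", "E"]),
    ("E", ["E", "*", "E"]),
    ("E", ["-", "E"]),
    ("E", ["(", "E", ")"]),
    ("E", ["id"]),
]


def shift_reduce(tokens, grammar=GRAMMAR, start="E"):
    """Greedy shift-reduce recogniser; returns (accepted, list of actions taken)."""
    stack, pos, trace = [], 0, []
    while True:
        for lhs, rhs in grammar:
            if stack[-len(rhs):] == rhs:  # top of stack matches a RHS
                stack[-len(rhs):] = [lhs]  # reduce: replace it with the LHS
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:  # no rule matched
            if pos < len(tokens):
                stack.append(tokens[pos])  # shift the next token
                trace.append(f"shift {tokens[pos]}")
                pos += 1
            elif stack == [start]:
                trace.append("accept")
                return True, trace
            else:
                trace.append("error")
                return False, trace


ok, trace = shift_reduce("( id + id ) * id".split())
print(ok)
print("\n".join(trace))
```

The input pointer `pos` only moves on a shift; a reduce changes the stack and leaves the pointer where it is.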

Input: ( id + id ) * id

• Input: the list of input tokens, with a pointer at the next input token
• Stack: the stack of matched terminals or reductions of terminals
• Shift: move the next token to the stack and advance the pointer
• Query "Can reduce?": if the top elements in the stack match the RHS of a rule, replace them with its LHS. A reduce does not change the input pointer (e.g. id is replaced by E).

Grammar:
E -> E + E
E -> E * E
E -> -E
E -> (E)
E -> id

Step  Stack       Remaining input     Can reduce?  Action
1     (empty)     ( id + id ) * id    No           Shift
2     (           id + id ) * id      No           Shift
3     ( id        + id ) * id         Yes          Reduce by E -> id
4     ( E         + id ) * id         No           Shift
5     ( E +       id ) * id           No           Shift
6     ( E + id    ) * id              Yes          Reduce by E -> id
7     ( E + E     ) * id              Yes          Reduce by E -> E + E
8     ( E         ) * id              No           Shift
9     ( E )       * id                Yes          Reduce by E -> (E)
10    E           * id                No           Shift
11    E *         id                  No           Shift
12    E * id      (empty)             Yes          Reduce by E -> id
13    E * E       (empty)             Yes          Reduce by E -> E * E
14    E           (empty)             No           Accept

At this point, the input is exhausted and the stack contains the START symbol: a successful parse.

• Query: Input exhausted? Response: Yes.
• Query: Stack == Start Symbol? Response: Yes.
• Action: Accept

Shift-Reduce Parser: Summary

The four actions of a Shift-Reduce parser are:

• Shift: move the next input token onto the stack and advance the input pointer
• Reduce: replace the symbols on top of the stack that match a rule's RHS with the rule's LHS
• Accept: the parser announces successful completion of parsing
• Error: the parser discovers that a syntax error has occurred