1
Week 3
• Questions / Concerns
• What’s due:
  • Lab1b due Friday at midnight
  • Lab1b check-off next week (schedule will be announced on Monday)
  • Homework #2 due next Monday (Draw a parse tree)
  • Homework #3 due next Wednesday (Define grammar for your language)
  • Homework #4 due next Thursday (Grammar modifications)
• Top down parser
• Grammar modifications
2
Structure of Compilers
[Diagram: compiler pipeline]
skeletal source program -> preprocessor -> Modified Source Program -> Lexical Analyzer (scanner) -> Tokens -> Syntax Analysis (Parser) -> Syntactic Structure -> Semantic Analysis -> Intermediate Representation -> Optimizer -> Code Generator -> Target machine code
Symbol Table
3
Parser
• Choose a type of parser
  • Top-Down parser
  • Bottom-Up parser
• Choose a parsing technique
  • Recursive Descent
  • Table driven parser (LL(1) or LR(1))
• Generate a grammar for your language
• Modify the grammar to fit the particular parsing technique
  • Remove lambda productions
  • Remove unit productions
  • Remove left recursion
  • Left factor the grammar
4
Parser
• Parser is just a matching tool
• It matches list of tokens with grammar rules to determine if they are legal constructs/statements or not.
  • Yes/No machine
• Context-Free
  • It doesn’t care about context (types), it just cares about syntax
  • If it looks like an assignment statement, then it is an assignment
• Start with start symbol of the grammar.
• Grab an input token and select a production rule.
  • Use “stack” to store the production rule.
• Try to parse that rule by matching input tokens.
• Keep going until all of the input tokens have been processed.
• If the rule is not the right one, put all the tokens back and try a different rule. (backtracking)
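The steps above can be sketched roughly as follows. This is only an illustration, not the course's reference parser: the toy grammar (including the bodies of Funcbody, Namelist and LocalOptional) and the function name `accepts` are assumptions made for the example. Each pending entry saves the input position, which is how the tokens get "put back" when a rule fails. The sketch would not terminate on a left recursive grammar.

```python
# Hypothetical toy grammar: dict keys are non-terminals, anything else is a token.
# An empty list stands for a lambda production.
GRAMMAR = {
    "Stat": [["local", "function", "Name", "Funcbody"],
             ["local", "Namelist", "LocalOptional"]],
    "Funcbody": [["(", ")", "end"]],
    "Namelist": [["Name"]],
    "LocalOptional": [["=", "Number"], []],
}

def accepts(tokens, grammar, start):
    """Yes/No machine: True if the whole token list can be derived from start."""
    # Each entry: (stack of symbols still to match, position in the input).
    pending = [([start], 0)]
    while pending:
        stack, pos = pending.pop()
        if not stack:
            if pos == len(tokens):
                return True          # everything matched, all input used
            continue                 # leftover input: try another saved choice
        top, rest = stack[0], stack[1:]
        if top in grammar:
            # Save one entry per production; reversed so the first rule is tried first.
            for rule in reversed(grammar[top]):
                pending.append((rule + rest, pos))
        elif pos < len(tokens) and tokens[pos] == top:
            pending.append((rest, pos + 1))   # token matched, move on
        # Otherwise this choice is dead; the loop falls back to an earlier
        # entry with its saved position (backtracking).
    return False
```

For example, `accepts("local Name = Number".split(), GRAMMAR, "Stat")` first tries the `local function ...` rule, fails on the second token, and then falls back to the second rule.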
11
Top-down Parser
• Ideal grammar:
  • Unique rule for each type of token.
  • One-token look ahead
12
One token look ahead
Stat -> local function Name Funcbody | local Namelist LocalOptional
• Based on the single token “local”, we should be able to pick one unique rule so we don’t have to backtrack.
• If we combine these two rules into one by factoring out the common parts, the need for backtracking goes away.
13
One token look ahead
Stat -> local function Name Funcbody | local Namelist LocalOptional
• Left factor the grammar:
Stat -> local Morelocal
Morelocal -> function Name Funcbody | Namelist LocalOptional
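A minimal sketch of the left factoring step, assuming every alternative of the rule shares the same leading symbols (a full left-factoring pass would also have to group alternatives by prefix). The helper names `common_prefix` and `left_factor` are made up for this example.

```python
def common_prefix(alternatives):
    """Longest run of leading symbols shared by every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(name, alternatives, new_name):
    """Factor the common prefix out of `name` into a new non-terminal."""
    prefix = common_prefix(alternatives)
    if len(alternatives) < 2 or not prefix:
        return {name: alternatives}        # nothing to factor
    return {
        name: [prefix + [new_name]],
        # Each alternative keeps only what followed the shared prefix.
        new_name: [alt[len(prefix):] for alt in alternatives],
    }

stat = [["local", "function", "Name", "Funcbody"],
        ["local", "Namelist", "LocalOptional"]]
factored = left_factor("Stat", stat, "Morelocal")
```

Here `factored` holds the same two rules as the slide: `Stat -> local Morelocal` and `Morelocal -> function Name Funcbody | Namelist LocalOptional`.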
14
Top-down Parser
• Ideal grammar:
  • Unique rule for each type of token.
  • One-token look ahead
  • Minimize unit productions
    • Unit productions don’t parse tokens immediately. They require another production.
    • It’s hard to tell which tokens match the unit productions, thus more chances for backtracking.
15
Minimize Unit Productions
S -> aaSc
S -> B
B -> bbbB
B -> λ
[Figure: parse tree in which S derives B through the unit production, and B derives b b b B]
16
Minimize Unit Productions
Exp -> nil |
false |
true |
Number |
String |
`...´ |
Functioncall |
Prefixexp |
Tableconstructor |
Exp Binop Exp |
Unop Exp
17
Remove Unit Productions
S -> aaSc
S -> B
B -> bbbB
B -> λ
After removing the unit production S -> B:
S -> aaSc
S -> bbbB
S -> λ
B -> bbbB
B -> λ
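One way to mechanize this removal, as a hedged sketch: for each non-terminal, follow unit productions to every non-terminal they reach, then copy over the non-unit rules. The dict-of-lists grammar encoding (an empty list stands for a lambda production) and the function name `remove_unit_productions` are assumptions for this example.

```python
def remove_unit_productions(grammar):
    """grammar: dict non-terminal -> list of right-hand sides (lists of symbols)."""
    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in grammar

    result = {}
    for head in grammar:
        # Non-terminals reachable from head using unit productions only.
        reachable, todo = {head}, [head]
        while todo:
            current = todo.pop()
            for rhs in grammar[current]:
                if is_unit(rhs) and rhs[0] not in reachable:
                    reachable.add(rhs[0])
                    todo.append(rhs[0])
        # head gets the non-unit rules of itself and of everything reachable.
        rules = []
        for nt in [head] + sorted(reachable - {head}):
            for rhs in grammar[nt]:
                if not is_unit(rhs) and rhs not in rules:
                    rules.append(rhs)
        result[head] = rules
    return result

before = {
    "S": [["a", "a", "S", "c"], ["B"]],
    "B": [["b", "b", "b", "B"], []],
}
after = remove_unit_productions(before)
```

With the slide's grammar as input, `after["S"]` contains `aaSc`, `bbbB` and the lambda rule, and `B` is unchanged.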
18
Minimize Unit Productions
Exp -> nil |
false |
true |
Number |
String |
`...´ |
Functioncall |
Prefixexp |
Tableconstructor |
Exp Binop Exp |
Unop Exp
Replacing Tableconstructor with its right-hand side gives:
Exp -> nil |
false |
true |
Number |
String |
`...´ |
Functioncall |
Prefixexp |
{ Fieldlistoptional } |
Exp Binop Exp |
Unop Exp
19
Minimize Unit Productions
Exp -> nil |
false |
true |
Number |
String |
`...´ |
Functioncall |
Prefixexp |
Tableconstructor |
Exp Binop Exp |
Unop Exp
Replacing Functioncall with its right-hand sides as well gives:
Exp -> nil |
false |
true |
Number |
String |
`...´ |
Prefixexp Args |
Prefixexp `:´ Name Args |
Prefixexp |
{ Fieldlistoptional } |
Exp Binop Exp |
Unop Exp
20
Minimize Unit Productions
Exp -> nil |
false |
true |
Number |
String |
`...´ |
Functioncall |
Prefixexp |
Tableconstructor |
Exp Binop Exp |
Unop Exp
Replacing Tableconstructor and Functioncall with their right-hand sides gives:
Exp -> nil |
false |
true |
Number |
String |
`...´ |
Prefixexp Args |
Prefixexp `:´ Name Args |
Prefixexp |
{ Fieldlistoptional } |
Exp Binop Exp |
Unop Exp
More left factoring needed: the three alternatives that begin with Prefixexp share a common prefix.
21
Top-down Parser
• Ideal grammar:
  • Unique rule for each type of token.
  • One-token look ahead
  • Minimize unit productions
    • Unit productions don’t parse tokens immediately. They require another production.
    • It’s hard to tell which tokens match the unit productions, thus more chances for backtracking.
  • Lambda productions are okay but we have to process them accordingly.
    • Removing lambdas always adds more rules.
    • It’s not possible to remove all lambda productions and still yield unique token-rule matching.
  • Remove left recursion in the grammar.
22
Grammar (left recursive vs. right recursive)
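Right Recursion
A -> aA
A -> λ
Left Recursion
A -> Aa
A -> λ
[Figure: two parse trees, three levels deep. Right recursion: each A expands to a A, so the tree grows down its right side. Left recursion: each A expands to A a, so the tree grows down its left side.]
Only non-recursive rule is λ
Same grammar?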
23
Grammar (left recursive vs. right recursive)
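Right Recursion
A -> aA
A -> λ
Left Recursion
A -> Aa
A -> λ
[Figure: the same two parse trees. Right recursion grows down the right side (A -> a A); left recursion grows down the left side (A -> A a).]
Which one works for top down?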
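A small sketch of why the answer matters for a recursive descent parser. Both functions below are illustrative only (the names are made up); they try the recursive rule first and fall back to lambda. The right recursive version consumes a token before it recurses. The left recursive version recurses before consuming anything, so it never makes progress and Python eventually raises RecursionError.

```python
def parse_right(tokens, pos=0):
    """A -> a A | lambda. Returns the position after the matched tokens."""
    if pos < len(tokens) and tokens[pos] == "a":
        return parse_right(tokens, pos + 1)   # one token consumed per call
    return pos                                # lambda: match nothing

def parse_left(tokens, pos=0):
    """A -> A a | lambda, written the naive way."""
    pos = parse_left(tokens, pos)   # same tokens, same position: no progress
    if pos < len(tokens) and tokens[pos] == "a":
        return pos + 1
    return pos

def hits_recursion_limit(parser, tokens):
    """True if the parser recursed until Python gave up."""
    try:
        parser(tokens)
        return False
    except RecursionError:
        return True
```

Calling `parse_right(list("aaa"))` walks through all three tokens, while `hits_recursion_limit(parse_left, list("aaa"))` reports that the left recursive version never got started.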
24
Grammar (left recursive vs. right recursive)
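Right Recursion
A -> aA
A -> b
Left Recursion
A -> Aa
A -> b
[Figure: two parse trees. Right recursion: A -> a A repeated three times, ending in b at the bottom right. Left recursion: A -> A a repeated three times, ending in b at the bottom left.]
Non-recursive rules are not only λ
Same grammar?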
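To check the "same grammar?" question by brute force, the sketch below lists every short string each grammar can derive. The function `generate` and the grammar encoding are assumptions for this example (an empty list stands for lambda), and the length cutoff only guarantees termination for simple grammars like these. With λ as the only non-recursive rule, both versions produce the same strings; with b, the right recursive grammar puts b last and the left recursive one puts b first.

```python
def generate(grammar, start, max_len):
    """All terminal strings with at most max_len symbols derivable from start."""
    results, seen, todo = set(), set(), [(start,)]
    while todo:
        form = todo.pop()
        if form in seen:
            continue
        seen.add(form)
        # Prune sentential forms that already hold too many terminals.
        if sum(1 for s in form if s not in grammar) > max_len:
            continue
        # Expand the leftmost non-terminal, if there is one.
        index = next((i for i, s in enumerate(form) if s in grammar), None)
        if index is None:
            results.add("".join(form))
            continue
        for rhs in grammar[form[index]]:
            todo.append(form[:index] + tuple(rhs) + form[index + 1:])
    return sorted(results, key=lambda s: (len(s), s))

right_lambda = {"A": [["a", "A"], []]}
left_lambda = {"A": [["A", "a"], []]}
right_b = {"A": [["a", "A"], ["b"]]}
left_b = {"A": [["A", "a"], ["b"]]}

same_with_lambda = generate(right_lambda, "A", 2) == generate(left_lambda, "A", 2)
same_with_b = generate(right_b, "A", 3) == generate(left_b, "A", 3)
```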
25
Remove Left Recursion in the Grammar
• Example:
A -> Aa
A -> b
A -> A… Left Recursive rule
• Step 1: Make all left recursive rules right recursive, but give them a new non-terminal
A -> Aa becomes X -> aX
• Step 2: Add a lambda production to the new non-terminal
X -> λ
• Step 3: Identify all non-recursive rules.
A -> b
• Step 4: Append the new non-terminal to the end of all non-recursive rules
A -> bX
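The four steps can be sketched as a small function. This handles only immediate left recursion on a single non-terminal (a rule that starts with its own left-hand side); the function name and the list-based grammar encoding, where an empty list stands for lambda, are assumptions for this example.

```python
def remove_left_recursion(head, rules, new_name):
    """Rewrite head's rules so that none of them starts with head."""
    # Tails of the left recursive rules: for A -> A a the tail is [a].
    recursive = [rhs[1:] for rhs in rules if rhs and rhs[0] == head]
    # Step 3: the non-recursive rules.
    non_recursive = [rhs for rhs in rules if not rhs or rhs[0] != head]
    if not recursive:
        return {head: rules}               # nothing to do
    return {
        # Step 4: append the new non-terminal to every non-recursive rule.
        head: [rhs + [new_name] for rhs in non_recursive],
        # Step 1: right recursive versions under the new non-terminal,
        # Step 2: plus a lambda production.
        new_name: [tail + [new_name] for tail in recursive] + [[]],
    }

example = remove_left_recursion("A", [["A", "a"], ["b"]], "X")
```

Here `example` is `A -> bX` together with `X -> aX | λ`.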
26
Grammar (left recursive vs. right recursive)
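After removing left recursion
A -> bX
X -> aX | λ
Original (left recursive)
A -> Aa
A -> b
[Figure: two parse trees. New grammar: A -> b X, then X -> a X repeatedly. Original grammar: A -> A a repeatedly, ending in b at the bottom left.]
Non-recursive rules are not only λ
Same grammar?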
27
Remove Left Recursion
S -> Sab
S -> c
S -> d
After removing left recursion:
X -> abX
X -> λ
S -> cX
S -> dX
28
Remove Left Recursion
PARAMLIST -> IDLIST : TYPE |
PARAMLIST ; IDLIST : TYPE
After removing left recursion:
PARAMLIST2 -> ; IDLIST : TYPE PARAMLIST2
PARAMLIST2 -> λ
PARAMLIST -> IDLIST : TYPE PARAMLIST2
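Once the left recursion is gone, the rules map directly onto recursive descent functions with one token of look ahead. The sketch below is illustrative only: it treats IDLIST and TYPE as single tokens, which is an assumption (in a real grammar they would be non-terminals with their own rules), and the function names are made up.

```python
def expect_group(tokens, pos):
    """Match IDLIST : TYPE starting at pos; return the position after it."""
    for expected in ("IDLIST", ":", "TYPE"):
        if pos >= len(tokens) or tokens[pos] != expected:
            raise SyntaxError(f"expected {expected} at position {pos}")
        pos += 1
    return pos

def parse_paramlist2(tokens, pos):
    """PARAMLIST2 -> ; IDLIST : TYPE PARAMLIST2 | lambda"""
    if pos < len(tokens) and tokens[pos] == ";":   # one token picks the rule
        pos = expect_group(tokens, pos + 1)
        return parse_paramlist2(tokens, pos)
    return pos                                     # lambda: match nothing

def parse_paramlist(tokens):
    """PARAMLIST -> IDLIST : TYPE PARAMLIST2. Returns True or raises SyntaxError."""
    pos = parse_paramlist2(tokens, expect_group(tokens, 0))
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return True
```

No backtracking is needed: a `;` selects the recursive rule, and anything else selects lambda.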