Post on 21-Apr-2020
CS453 Lecture Lexical Analysis with JLex 1
Plan for Lexical Analysis with JLex and One Pass Code Gen
Overview of the MeggyJava Assignments
PA2: Lexer/scanner in MJPA2.jar
Expressing tokens with regular expressions
– regular expression syntax for JLex
– using JLex with JavaCup
How do lexer generators work?
– Convert regular expressions to NFA
– Converting an NFA to DFA
– Implementing the DFA
PA2: Syntax-directed code generation (MJ.jar)
Structure of the MeggyJava Compiler
[Figure: compiler pipeline. Analysis: character stream → lexical analysis → tokens ("words") → syntactic analysis → AST ("sentences") → semantic analysis → AST and symbol table. Synthesis: code gen → Atmel assembly code.]
PA1: Write test cases in MeggyJava, and AVR warmup
PA2: MeggyJava scanner and setPixel
PA3: add exps and control flow (AST)
PA4: add methods (symbol table)
PA5: add variables and objects
PA6: add arrays and register allocation
PA2 Scanner/Lexer
Look at the assignment writeup and point out the tar ball.
Look at the input files.
Look at the output files.
Look at MJPA2Driver.java.
Look at mj.lex.
Look at the Makefile.
Specifying Tokens with JLex
JLex example input file:
package mjparser;
import java_cup.runtime.Symbol;
%%
%line
%char
%cup
%public
%eofval{
  return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
%eofval}
LETTER=[A-Za-z]
DIGIT=[0-9]
UNDERSCORE="_"
LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
ID={LETTER}({LETT_DIG_UND})*
...
%%
"&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
"+" {return new Symbol(sym.PLUS, ...); }
"if" {return new Symbol(sym.IF,...); }
{ID} {return new Symbol(sym.ID, new ...
{EOL} { /* reset yychar */ … }
{WS} { /* ignore */ }
Nondeterministic Finite Acceptor (NFA)
Alphabet = {a}
[Figure: NFA with states q0, q1, q2, q3; every transition is labeled a. From q0 there are two choices on input a.]
Input: a a
First Choice
[Figure sequence: the NFA reads the first a, then the second a, and stops in a final state.]
All input is consumed: "accept"
Input: a a
Second Choice
[Figure sequence: the NFA reads the first a and reaches a state with no outgoing transition.]
No transition: the automaton hangs
Input cannot be consumed: should we reject aa?
When To Accept a String
An NFA accepts a string when there is a computation of the NFA that accepts the string: all the input is consumed AND the automaton is in a final state.
Example
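aa is accepted by the NFA because this computation accepts aa:
[Figure: the first choice consumes both a's and ends in "accept".]
The second choice ends in "reject??", but this only tells us that choice didn't work.
[Figure: the second choice hangs before the input is consumed.]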
Rejection example
Input: a
First Choice
[Figure sequence: the NFA reads a and stops in a non final state.] "reject??"
Second Choice
[Figure sequence: the NFA reads a and again stops in a non final state.] "reject??"
An NFA rejects a string when there is NO computation of the NFA that accepts the string. For every computation, either:
• All the input is consumed and the automaton is in a non final state
OR
• The input cannot be consumed
Example
a is rejected by the NFA:
[Figures: both computations on input a end in "reject??".]
All possible computations lead to rejection
[Figure: the same NFA with states q0, q1, q2, q3 and transitions labeled a.]
Language accepted: L = {aa}
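The accept and reject rules above can be checked mechanically by tracking every state the NFA could be in after each input character. The following is a minimal sketch, assuming the figure's edges are q0 -a-> q1, q1 -a-> q2, and q0 -a-> q3 with q2 as the only final state (inferred from the walkthrough, since the figure itself did not survive). The class name NfaDemo and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class NfaDemo {
    // transitions.get(state).get(symbol) = set of possible next states
    private final Map<Integer, Map<Character, Set<Integer>>> transitions = new HashMap<>();
    private final Set<Integer> finalStates = new HashSet<>();

    void addTransition(int from, char symbol, int to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>())
                   .computeIfAbsent(symbol, k -> new HashSet<>())
                   .add(to);
    }

    void addFinal(int state) {
        finalStates.add(state);
    }

    /** True if at least one computation consumes all input and ends in a final state. */
    boolean accepts(String input) {
        Set<Integer> current = new HashSet<>();
        current.add(0); // q0 is the start state
        for (char c : input.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (int state : current) {
                Map<Character, Set<Integer>> out = transitions.get(state);
                if (out != null && out.containsKey(c)) {
                    next.addAll(out.get(c));
                }
            }
            current = next; // computations that hang simply drop out of the set
        }
        for (int state : current) {
            if (finalStates.contains(state)) {
                return true;
            }
        }
        return false;
    }

    /** The example NFA, with the edges assumed as described in the lead-in. */
    static NfaDemo example() {
        NfaDemo nfa = new NfaDemo();
        nfa.addTransition(0, 'a', 1);
        nfa.addTransition(1, 'a', 2);
        nfa.addTransition(0, 'a', 3);
        nfa.addFinal(2);
        return nfa;
    }

    public static void main(String[] args) {
        NfaDemo nfa = example();
        System.out.println(nfa.accepts("aa")); // true
        System.out.println(nfa.accepts("a"));  // false
    }
}
```

Following all choices at once avoids the "reject??" question: a single failed choice never decides the answer, only the final set of reachable states does.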
Specifying Tokens with JLex
JLex example input file:
package mjparser;
import java_cup.runtime.Symbol;
%%
%line
%char
%cup
%public
%eofval{
  return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
%eofval}
LETTER=[A-Za-z]
DIGIT=[0-9]
UNDERSCORE="_"
EOL=(\n|\r|\r\n)
LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
ID={LETTER}({LETT_DIG_UND})*
%%
"&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
"+" {return new Symbol(sym.PLUS, ...); }
"if" {return new Symbol(sym.IF,...); }
{ID} {return new Symbol(sym.ID, new ...
{EOL} { /* reset yychar */ … }
{WS} { /* ignore */ }
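To see how the rules in a spec like this interact, the hand-written sketch below imitates the way a JLex scanner resolves overlapping rules: it takes the longest match, and on a tie it takes the rule listed first. This is not JLex output. The class MiniScanner is hypothetical, it uses java.util.regex rather than a generated DFA, and the WS pattern is an assumption because the spec above does not show the WS definition.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MiniScanner {
    // Rules in priority order, mirroring the order of the rules in the spec.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("AND", Pattern.compile("&&"));
        RULES.put("PLUS", Pattern.compile("\\+"));
        RULES.put("IF", Pattern.compile("if"));
        RULES.put("ID", Pattern.compile("[A-Za-z][A-Za-z0-9_]*"));
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+")); // assumed definition
    }

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strict '>' keeps the earlier rule when two matches tie in length.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("Illegal character at offset " + pos);
            }
            if (!bestName.equals("WS")) { // whitespace is matched but not reported
                tokens.add(bestName + "(" + input.substring(pos, bestEnd) + ")");
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [IF(if), ID(iffy), AND(&&), ID(x_1), PLUS(+), ID(y)]
        System.out.println(scan("if iffy && x_1 + y"));
    }
}
```

The input "if" matches both the "if" rule and {ID} with the same length, so the earlier rule wins and the token is IF. The input "iffy" matches {ID} with a longer lexeme, so it becomes an ID.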
Example NFA for Multiple Tokens
DFA from IF and ID NFAs (Do in class)
DFA from IF and ID NFAs (Answer)
Implementing DFAs?
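One common way to implement a DFA is a transition table indexed by state and character class, plus a table recording which token each accepting state reports. The sketch below is one possible DFA for the "if" and {ID} rules, not necessarily the in-class answer, which is not reproduced in this transcript. The state numbering, the class IfIdDfa, and the "KIND:lexeme" result format are all assumptions made for illustration.

```java
final class IfIdDfa {
    static final int DEAD = -1;

    // Character classes: 0 = 'i', 1 = 'f', 2 = any other letter, 3 = digit or '_'
    static int charClass(char c) {
        if (c == 'i') return 0;
        if (c == 'f') return 1;
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return 2;
        if ((c >= '0' && c <= '9') || c == '_') return 3;
        return -1; // not part of any IF or ID lexeme
    }

    // NEXT[state][charClass] = next state
    static final int[][] NEXT = {
        /* 0: start          */ {1, 3, 3, DEAD},
        /* 1: saw "i"        */ {3, 2, 3, 3},
        /* 2: saw "if"       */ {3, 3, 3, 3},
        /* 3: any other id   */ {3, 3, 3, 3},
    };

    // Token reported by each state, or null if the state is not accepting.
    static final String[] ACCEPT = {null, "ID", "IF", "ID"};

    /** Returns "KIND:lexeme" for the longest token starting at pos, or null if none. */
    static String longestMatch(String input, int pos) {
        int state = 0;
        String lastKind = null;
        int lastEnd = pos;
        for (int i = pos; i < input.length(); i++) {
            int cls = charClass(input.charAt(i));
            if (cls < 0) break;
            state = NEXT[state][cls];
            if (state == DEAD) break;
            if (ACCEPT[state] != null) { // remember the most recent accepting point
                lastKind = ACCEPT[state];
                lastEnd = i + 1;
            }
        }
        return lastKind == null ? null : lastKind + ":" + input.substring(pos, lastEnd);
    }

    public static void main(String[] args) {
        System.out.println(longestMatch("if", 0));   // IF:if
        System.out.println(longestMatch("iffy", 0)); // ID:iffy
    }
}
```

State 2 is reached only by the exact prefix "if", so it reports IF. Any further letter, digit, or underscore moves to state 3, which reports ID, and that is how the longer lexeme "iffy" ends up as an identifier.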
PA2 Syntax Directed Code Generation
Look at the assignment writeup and point out usage of MJ.jar.
Input files are MeggyJava files that fit the PA2 grammar.
Look at current output file. Will be a .s file that can go through the simulator.
Look at MJDriver.java.
Look at mj.cup.
Look at the Makefile.
Recall Doing Syntax-Directed Interpretation
String: 42 + 7 * 6
Grammar:
(1) exp --> exp * exp
(2) exp --> exp + exp
(3) exp --> NUM
Semantic Rules for Expression Example
Code Generation versus Interpretation
When interpreting . . .
– Each action in the .cup file associates a value with the non terminal on the left hand side of the production.
– Each non terminal on the right hand side has a value associated with it.
– This approach will also be useful when we are building the Abstract Syntax Tree (AST) in PA3.
When doing one pass compilation . . .
– Actions output the target code (in this case AVR assembly)
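The contrast can be sketched by running the same parse with two different sets of actions. In the sketch below, a small hand-written parser stands in for the CUP-generated one and gives * higher precedence than +, which is an assumption since the grammar above is ambiguous as written. The Actions interface, the class names, and the byte-sized stack-style AVR instructions are illustrative only and are not the output of MJ.jar.

```java
import java.util.ArrayList;
import java.util.List;

final class OnePassDemo {
    /** One semantic action per production; T is the value attached to exp. */
    interface Actions<T> {
        T num(int value);   // exp --> NUM
        T plus(T l, T r);   // exp --> exp + exp
        T times(T l, T r);  // exp --> exp * exp
    }

    /** Interpretation: each action computes the value of the left hand side. */
    static final class Interpreter implements Actions<Integer> {
        public Integer num(int value) { return value; }
        public Integer plus(Integer l, Integer r) { return l + r; }
        public Integer times(Integer l, Integer r) { return l * r; }
    }

    /** One pass compilation: each action emits target code and carries no value. */
    static final class CodeGen implements Actions<Void> {
        final List<String> out = new ArrayList<>();
        public Void num(int value) {
            out.add("ldi r24, " + value);
            out.add("push r24");
            return null;
        }
        public Void plus(Void l, Void r) { binary("add r24, r18"); return null; }
        public Void times(Void l, Void r) { binary("mul r24, r18", "mov r24, r0"); return null; }
        private void binary(String... ops) {
            out.add("pop r18"); // right operand
            out.add("pop r24"); // left operand
            for (String op : ops) out.add(op);
            out.add("push r24");
        }
    }

    /** Stand-in for the generated parser; tokens must be separated by spaces. */
    static final class Parser<T> {
        private final String[] toks;
        private final Actions<T> actions;
        private int pos = 0;
        Parser(String input, Actions<T> actions) {
            this.toks = input.trim().split("\\s+");
            this.actions = actions;
        }
        T exp() { // term ('+' term)*
            T left = term();
            while (pos < toks.length && toks[pos].equals("+")) {
                pos++;
                left = actions.plus(left, term());
            }
            return left;
        }
        private T term() { // NUM ('*' NUM)*
            T left = number();
            while (pos < toks.length && toks[pos].equals("*")) {
                pos++;
                left = actions.times(left, number());
            }
            return left;
        }
        private T number() { return actions.num(Integer.parseInt(toks[pos++])); }
    }

    static int evaluate(String src) {
        return new Parser<>(src, new Interpreter()).exp();
    }

    static List<String> compile(String src) {
        CodeGen gen = new CodeGen();
        new Parser<>(src, gen).exp();
        return gen.out;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("42 + 7 * 6")); // 84
        compile("42 + 7 * 6").forEach(System.out::println);
    }
}
```

The parser is identical in both runs. Only the actions differ: the interpreter passes values up the parse, while the code generator appends instructions in the order the actions fire.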
Parse Tree for An Empty MeggyJava Program