
Plan for Lexical Analysis with JLex and One Pass Code Gen ...cs453/yr2013/Slides/...

Apr 21, 2020


dariahiddleston

CS453 Lecture Lexical Analysis with JLex 1

Plan for Lexical Analysis with JLex and One Pass Code Gen

• Overview of the MeggyJava Assignments
• PA2: Lexer/scanner in MJPA2.jar
• Expressing tokens with regular expressions
  – regular expression syntax for JLex
  – using JLex with JavaCup
• How do lexer generators work?
  – Convert regular expressions to NFA
  – Converting an NFA to DFA
  – Implementing the DFA
• PA2: Syntax-directed code generation (MJ.jar)

Structure of the MeggyJava Compiler

Analysis:
  character stream
  -> lexical analysis -> tokens ("words")
  -> syntactic analysis -> AST ("sentences")
  -> semantic analysis -> AST and symbol table

Synthesis:
  -> code gen -> Atmel assembly code

PA1: Write test cases in MeggyJava, and AVR warmup
PA2: MeggyJava scanner and setPixel
PA3: add exps and control flow (AST)
PA4: add methods (symbol table)
PA5: add variables and objects
PA6: add arrays and register allocation

PA2 Scanner/Lexer

• Look at the assignment writeup and point out the tarball.
• Look at the input files.
• Look at the output files.
• Look at MJPA2Driver.java.
• Look at mj.lex.
• Look at the Makefile.

Specifying Tokens with JLex

JLex example input file:

package mjparser;
import java_cup.runtime.Symbol;
%%
%line
%char
%cup
%public
%eofval{
  return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
%eofval}
LETTER=[A-Za-z]
DIGIT=[0-9]
UNDERSCORE="_"
LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
ID={LETTER}({LETT_DIG_UND})*
...
%%
"&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
"+" {return new Symbol(sym.PLUS, ...); }
"if" {return new Symbol(sym.IF,...); }
{ID} {return new Symbol(sym.ID, new ...
{EOL} { /* reset yychar */ … }
{WS} { /* ignore */ }

Nondeterministic Finite Acceptor (NFA)

Alphabet = {a}

[Figure: NFA with states q0, q1, q2, q3 and three transitions, each labeled a]

Two choices

[Figure: the same NFA with the two choices marked]

Input: a a

First Choice

[Figure sequence: the NFA reads a a along the first choice and ends in "accept"]

All input is consumed

Second Choice

[Figure sequence: the NFA reads the first a along the second choice, then stops]

No transition: the automaton hangs

Input cannot be consumed

Should we reject aa?

When To Accept a String

An NFA accepts a string when there is a computation of the NFA that accepts the string: all the input is consumed AND the automaton is in a final state.

Example

aa is accepted by the NFA because this computation accepts aa:

[Figure: the computation on aa that ends in "accept"]

[Figure: the other computation on aa, marked "reject??"]

But this only tells us that this choice didn't work...
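The acceptance rule can be sketched by tracking every state the NFA could be in after each input character. The transition relation below is an assumption reconstructed from the slides (q0 goes to q1 or q3 on a, q1 goes to q2 on a, and q2 is the only final state); the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical simulator for the example NFA over the alphabet {a}.
final class NfaSim {
    // Assumed transition relation: the set of successors of a state on c.
    static Set<Integer> step(int state, char c) {
        Set<Integer> next = new HashSet<>();
        if (c != 'a') {
            return next;              // only 'a' is in the alphabet
        }
        if (state == 0) {             // two choices from q0
            next.add(1);
            next.add(3);
        } else if (state == 1) {
            next.add(2);
        }
        return next;                  // q2 and q3 have no outgoing transitions
    }

    // Accept if SOME computation consumes all input and ends in the final state q2.
    static boolean accepts(String input) {
        Set<Integer> current = new HashSet<>();
        current.add(0);
        for (char c : input.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (int s : current) {
                next.addAll(step(s, c));
            }
            current = next;           // an empty set means every computation hung
        }
        return current.contains(2);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "a", "aa", "aaa"}) {
            System.out.println("\"" + s + "\" accepted: " + accepts(s));
        }
    }
}
```

A computation that hangs simply drops out of the set, so one failed choice never causes rejection on its own.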

Rejection example

Input: a

First Choice

[Figure sequence: the NFA reads a along the first choice and ends in "reject??"]

Second Choice

[Figure sequence: the NFA reads a along the second choice and ends in "reject??"]

An NFA rejects a string when there is NO computation of the NFA that accepts the string. For every computation:

• All the input is consumed and the automaton is in a non final state
OR
• The input cannot be consumed

Example

a is rejected by the NFA:

[Figures: both computations on input a, each marked "reject??"]

All possible computations lead to rejection
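To make "all possible computations" concrete, the sketch below enumerates each computation separately and records how it ends. It assumes the same transitions as the slides appear to show (q0 to q1 or q3 on a, q1 to q2 on a, q2 final); the class NfaComputations and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical enumerator: lists every computation of the example NFA.
final class NfaComputations {
    static final int FINAL_STATE = 2;   // assumed: q2 is the only final state

    // Assumed successors on input 'a'.
    static int[] successors(int state) {
        switch (state) {
            case 0: return new int[] {1, 3};
            case 1: return new int[] {2};
            default: return new int[0];
        }
    }

    // One entry per computation, e.g. "q0 q1 q2: accept".
    static List<String> outcomes(String input) {
        List<String> result = new ArrayList<>();
        explore(0, input, 0, "q0", result);
        return result;
    }

    private static void explore(int state, String input, int pos,
                                String path, List<String> out) {
        if (pos == input.length()) {
            out.add(path + (state == FINAL_STATE ? ": accept" : ": non-final state"));
            return;
        }
        int[] next = input.charAt(pos) == 'a' ? successors(state) : new int[0];
        if (next.length == 0) {
            out.add(path + ": input cannot be consumed");
            return;
        }
        for (int s : next) {
            explore(s, input, pos + 1, path + " q" + s, out);
        }
    }

    // Reject only when no computation accepts.
    static boolean rejects(String input) {
        for (String outcome : outcomes(input)) {
            if (outcome.endsWith(": accept")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(outcomes("a"));
        System.out.println(outcomes("aa"));
    }
}
```

For input a both computations end in a non-final state, so the string is rejected; for aa one computation hangs but the other accepts, so the string is accepted.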

[Figure: the NFA with states q0, q1, q2, q3 and three transitions, each labeled a]

Language accepted: L = {aa}

Specifying Tokens with JLex

JLex example input file:

package mjparser;
import java_cup.runtime.Symbol;
%%
%line
%char
%cup
%public
%eofval{
  return new Symbol(sym.EOF, new TokenValue("EOF", yyline, yychar));
%eofval}
LETTER=[A-Za-z]
DIGIT=[0-9]
UNDERSCORE="_"
EOL=(\n|\r|\r\n)
LETT_DIG_UND={LETTER}|{DIGIT}|{UNDERSCORE}
ID={LETTER}({LETT_DIG_UND})*
%%
"&&" {return new Symbol(sym.AND, new TokenValue(yytext(), yyline, yychar)); }
"+" {return new Symbol(sym.PLUS, ...); }
"if" {return new Symbol(sym.IF,...); }
{ID} {return new Symbol(sym.ID, new ...
{EOL} { /* reset yychar */ … }
{WS} { /* ignore */ }
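The ID macro can be tried out without running JLex. The sketch below rebuilds LETTER, DIGIT, UNDERSCORE, LETT_DIG_UND and ID as a java.util.regex pattern; the class IdPattern and the method isIdentifier are hypothetical, and java.util.regex syntax is not the same as JLex syntax, so this only illustrates which strings ID describes.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of MJPA2.jar: mirrors the JLex macros.
final class IdPattern {
    private static final String LETTER = "[A-Za-z]";
    private static final String DIGIT = "[0-9]";
    private static final String UNDERSCORE = "_";
    private static final String LETT_DIG_UND =
            "(?:" + LETTER + "|" + DIGIT + "|" + UNDERSCORE + ")";

    // ID={LETTER}({LETT_DIG_UND})*
    private static final Pattern ID = Pattern.compile(LETTER + LETT_DIG_UND + "*");

    static boolean isIdentifier(String s) {
        // matches() requires the whole string to match the pattern.
        return ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"x", "set_Pixel2", "2fast", "_tmp"}) {
            System.out.println(s + " -> " + isIdentifier(s));
        }
    }
}
```

Because ID must start with a LETTER, strings such as 2fast and _tmp are not identifiers under this definition.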

Example NFA for Multiple Tokens
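When several token rules are combined into one automaton, a lexer in the lex family picks the longest match and, on a tie, the rule listed first. The sketch below imitates that behavior with java.util.regex instead of an automaton. The class MiniLexer is hypothetical, the token names follow the JLex example, and the WS pattern is an assumption because the slides do not show its definition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based stand-in for a generated scanner.
final class MiniLexer {
    // Rule order matters: on equal-length matches the earlier rule wins.
    private static final String[][] RULES = {
        {"AND", "&&"},
        {"PLUS", "\\+"},
        {"IF", "if"},
        {"ID", "[A-Za-z][A-Za-z0-9_]*"},
        {"EOL", "\\r\\n|\\n|\\r"},
        {"WS", "[ \\t]+"},           // assumed definition of WS
    };

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (String[] rule : RULES) {
                Matcher m = Pattern.compile(rule[1]).matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces an earlier rule.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule[0];
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at offset " + pos);
            }
            if (!bestName.equals("WS") && !bestName.equals("EOL")) {
                tokens.add(bestName + "(" + input.substring(pos, bestEnd) + ")");
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("if iffy && x+y\n"));
    }
}
```

Here if becomes an IF token because IF is listed before ID and both match two characters, while iffy becomes an ID token because the ID match is longer.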


DFA from IF and ID NFAs (Do in class)
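One way to work through the conversion is the subset construction, where each DFA state is a set of NFA states. The sketch below is not the in-class answer. It assumes an epsilon-free NFA of my own for IF and ID (state 0 is the start, 0 goes to 1 on i, 1 goes to 2 on f, 0 goes to 3 on any letter, 3 loops on any identifier character) and a simplified alphabet of four input classes; all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical subset construction over an assumed NFA for IF and ID.
final class SubsetConstruction {
    // Input classes (assumption): 'i', 'f', 'L' = any other letter,
    // 'D' = digit or underscore.
    static final char[] ALPHABET = {'i', 'f', 'L', 'D'};

    // Assumed NFA: NFA state 2 recognizes IF, NFA state 3 recognizes ID.
    static Set<Integer> move(int state, char c) {
        Set<Integer> out = new TreeSet<>();
        boolean letter = c == 'i' || c == 'f' || c == 'L';
        if (state == 0 && c == 'i') out.add(1);
        if (state == 0 && letter) out.add(3);
        if (state == 1 && c == 'f') out.add(2);
        if (state == 3) out.add(3);   // every input class is an identifier character
        return out;
    }

    // Returns the DFA as: set of NFA states -> (input class -> set of NFA states).
    static Map<Set<Integer>, Map<Character, Set<Integer>>> build() {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new LinkedHashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        Set<Integer> start = new TreeSet<>();
        start.add(0);
        dfa.put(start, new LinkedHashMap<>());
        work.add(start);
        while (!work.isEmpty()) {
            Set<Integer> subset = work.remove();
            for (char c : ALPHABET) {
                Set<Integer> target = new TreeSet<>();
                for (int s : subset) {
                    target.addAll(move(s, c));
                }
                if (target.isEmpty()) {
                    continue;         // no transition: the dead state is left implicit
                }
                dfa.get(subset).put(c, target);
                if (!dfa.containsKey(target)) {
                    dfa.put(target, new LinkedHashMap<>());
                    work.add(target);
                }
            }
        }
        return dfa;
    }

    public static void main(String[] args) {
        build().forEach((from, edges) -> System.out.println(from + " " + edges));
    }
}
```

Under these assumptions the construction yields four DFA states. The state containing both NFA states 2 and 3 accepts as IF rather than ID, because the "if" rule is listed first.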

DFA from IF and ID NFAs (Answer)

Implementing DFAs?
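One possible implementation is a transition function plus a loop that remembers the last accepting state it passed through. The DFA below is a sketch under my own assumptions, not the answer from the slide: state 0 is the start, state 1 means "i" was read, state 2 means "if" was read, and state 3 is any other identifier. The class IfIdDfa and its methods are hypothetical. A generated scanner would more commonly store the transition function in arrays than in a switch.

```java
// Hypothetical hand-coded DFA that distinguishes IF from ID.
final class IfIdDfa {
    static final int DEAD = -1;

    // Transition function of the assumed DFA.
    static int next(int state, char c) {
        boolean letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        boolean idChar = letter || (c >= '0' && c <= '9') || c == '_';
        switch (state) {
            case 0: return c == 'i' ? 1 : (letter ? 3 : DEAD);
            case 1: return c == 'f' ? 2 : (idChar ? 3 : DEAD);
            case 2:
            case 3: return idChar ? 3 : DEAD;
            default: return DEAD;
        }
    }

    // Token kind for an accepting state, or null if the state is not accepting.
    static String kind(int state) {
        switch (state) {
            case 1:
            case 3: return "ID";
            case 2: return "IF";
            default: return null;
        }
    }

    // Longest match: run until the DFA dies, keeping the last accepting position.
    static String scan(String input, int start) {
        int state = 0;
        int lastEnd = -1;
        String lastKind = null;
        for (int i = start; i < input.length() && state != DEAD; i++) {
            state = next(state, input.charAt(i));
            if (kind(state) != null) {
                lastKind = kind(state);
                lastEnd = i + 1;
            }
        }
        return lastKind == null ? null : lastKind + ":" + input.substring(start, lastEnd);
    }

    public static void main(String[] args) {
        System.out.println(scan("if", 0));
        System.out.println(scan("iffy", 0));
        System.out.println(scan("i+1", 0));
    }
}
```

Remembering the last accepting position is what lets the scanner return i for the input i+1 even though the DFA only dies after reading the + character.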

PA2 Syntax Directed Code Generation

• Look at the assignment writeup and point out usage of MJ.jar.
• Input files are MeggyJava files that fit the PA2 grammar.
• Look at the current output file. It will be a .s file that can go through the simulator.
• Look at MJDriver.java.
• Look at mj.cup.
• Look at the Makefile.

Recall Doing Syntax-Directed Interpretation

String

    42 + 7 * 6

Grammar

    (1) exp --> exp * exp
    (2) exp --> exp + exp
    (3) exp --> NUM

Semantic Rules for Expression Example

Code Generation versus Interpretation

• When interpreting . . .
  – Each action in the .cup file associates a value with the non terminal on the left hand side of its production.
  – Each non terminal on the right hand side has a value associated with it.
  – This approach will also be useful when we are building the Abstract Syntax Tree (AST) in PA3.
• When doing one pass compilation . . .
  – Actions output the target code (in this case AVR assembly).
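The difference between the two styles can be sketched with one parser and two sets of actions. Everything below is an illustration under stated assumptions: a small hand-written recursive-descent parser stands in for the CUP-generated parser, * is assumed to bind tighter than +, and PUSH, ADD and MUL are a made-up stack-machine notation, not AVR assembly. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: the same parse drives an interpreter or a code emitter.
final class OnePassDemo {
    // One semantic action per grammar rule.
    interface Actions<T> {
        T num(int value);        // exp --> NUM
        T add(T left, T right);  // exp --> exp + exp
        T mul(T left, T right);  // exp --> exp * exp
    }

    // Interpretation: each action computes the value of the left hand side.
    static final class Interpreter implements Actions<Integer> {
        public Integer num(int value) { return value; }
        public Integer add(Integer l, Integer r) { return l + r; }
        public Integer mul(Integer l, Integer r) { return l * r; }
    }

    // One pass compilation: each action emits code as soon as its rule is recognized.
    static final class Emitter implements Actions<Void> {
        final List<String> code = new ArrayList<>();
        public Void num(int value) { code.add("PUSH " + value); return null; }
        public Void add(Void l, Void r) { code.add("ADD"); return null; }
        public Void mul(Void l, Void r) { code.add("MUL"); return null; }
    }

    // Minimal parser for whitespace-separated NUM, + and * tokens.
    static <T> T parse(String input, Actions<T> actions) {
        String[] tokens = input.trim().split("\\s+");
        int[] pos = {0};
        return sum(tokens, pos, actions);
    }

    private static <T> T sum(String[] t, int[] pos, Actions<T> a) {
        T left = product(t, pos, a);
        while (pos[0] < t.length && t[pos[0]].equals("+")) {
            pos[0]++;
            left = a.add(left, product(t, pos, a));
        }
        return left;
    }

    private static <T> T product(String[] t, int[] pos, Actions<T> a) {
        T left = a.num(Integer.parseInt(t[pos[0]++]));
        while (pos[0] < t.length && t[pos[0]].equals("*")) {
            pos[0]++;
            left = a.mul(left, a.num(Integer.parseInt(t[pos[0]++])));
        }
        return left;
    }

    public static void main(String[] args) {
        System.out.println(parse("42 + 7 * 6", new Interpreter()));
        Emitter emitter = new Emitter();
        parse("42 + 7 * 6", emitter);
        System.out.println(emitter.code);
    }
}
```

The interpreter returns 84 for 42 + 7 * 6, while the emitter returns nothing useful and instead leaves behind the instruction list PUSH 42, PUSH 7, PUSH 6, MUL, ADD.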

Parse Tree for An Empty MeggyJava Program
