CS 153: Concepts of Compiler Design Sept. 15 Class Meeting Department of Computer Science San Jose State University Fall 2009 Instructor: Ron Mak mak.

CS 153: Concepts of Compiler Design
Sept. 15 Class Meeting

Department of Computer Science
San Jose State University

Fall 2009
Instructor: Ron Mak

www.cs.sjsu.edu/~mak

SJSU Dept. of Computer Science, Fall 2009: September 15

CS 153: Concepts of Compiler Design © R. Mak

Unofficial Field Trip [Tentative]

Computer History Museum in Mt. View. http://www.computerhistory.org/

Saturday, October 24 at 2:00 pm (~2 hours).

See (and hear) a fully restored IBM 1401 mainframe computer from the early 1960s in operation.
  My summer seminar: http://www.cs.sjsu.edu/~mak/1401/
  http://en.wikipedia.org/wiki/IBM_1401
  http://ed-thelen.org/1401Project/1401RestorationPage.html

See a life-size working model of Charles Babbage’s Difference Engine in operation, a hand-cranked mechanical computer designed in the early 1800s.
  http://en.wikipedia.org/wiki/Difference_engine

Unofficial Field Trip (cont’d)

IBM 1401 computer, restored and operational
  A small transistor-based mainframe computer.
  Extremely popular with small businesses from the late 1950s through the mid 1960s.
  Maximum of 16K bytes of memory.
  800 card/minute card reader (wire brushes).
  600 line/minute line printer (impact).
  6 magnetic tape drives, no disk.

Babbage Difference Engine, fully operational
  Hand-cranked mechanical computer for computing polynomial functions.
  Designed by Charles Babbage in the early to mid 1800s.
    Arguably the world’s first computer scientist, lived 1791-1871.
  He wasn’t able to build it because he lost his funding.
  His plans survived and this working model was built.
    Includes a working printer!

Subversion checkout Command

To create and load your initial local workspace directory CS153/teamname from the repository:

    mkdir CS153
    cd CS153
    svn checkout --username nnnnn \
        https://sjsu-cs.svn.cvsdude.com/teamname

Where: nnnnn is your user name (e.g., tjones) and teamname is your team name (e.g., code_monkey).

You will be prompted for your password (e.g., jones85).

You should only need to do a checkout once to initially populate your workspace.

How to Get Automatic Emails

CVSDude demo:

1. Log in at https://cvsdude.com/ajax#login

2. Click the My Settings tab (upper right corner).

3. Click the Email Notification Preferences EDIT button.

4. Check all three Basic Options.

5. Click the SPECIFY PATHS button.

6. Check your project.

Now you should get an automatic email every time a team member commits a change to the repository.

Pascal Control Statements

Looping statements
  REPEAT UNTIL
  WHILE DO
  FOR TO
  FOR DOWNTO

Conditional statements
  IF THEN
  IF THEN ELSE
  CASE

Statement Syntax Diagram

Pascal Statement Parsers

New statement parser subclasses.
  RepeatStatementParser
  WhileStatementParser
  ForStatementParser
  IfStatementParser
  CaseStatementParser

Each parse() method builds a parse subtree and returns the root node.

REPEAT Statement

Example:

    REPEAT
        j := i;
        k := i
    UNTIL i <= j

Keep looping until the boolean expression becomes true.
  Execute the loop at least once.

Use LOOP and TEST nodes for source language independence.

Exit the loop when the test expression evaluates to true.

Pascal Syntax Checker II: REPEAT

Demo.

    java -classpath classes Pascal compile -i repeat.txt

Syntax Error Handling

Recall that syntax error handling in the front end is a three-step process.

1. Detect the error.

2. Flag the error.

3. Recover from the error.

Good syntax error handling is important!

Options for Error Recovery

Stop after the first error.
  No error recovery at all.
  Easiest for the compiler writer, annoying for the programmer.
  Worst case: The compiler crashes or hangs.

Become hopelessly lost.
  Attempt to continue parsing the rest of the source program.
  Spew out lots of irrelevant and meaningless error messages.
  No error recovery here, either …
    … but the compiler writer doesn’t admit it!

Skip tokens after the erroneous token until …
  The parser finds a token it recognizes, and
  It can safely resume syntax checking the rest of the source program.

Parser Synchronization

Skipping tokens to reach a safe, recognizable place to resume parsing is known as synchronizing.
  “Resynchronize the parser” after an error.

Good error recovery with top-down parsers is more art than science.
  How many tokens should the parser skip?
    Skipping too many (the rest of the program?) can be considered “panic mode” recovery.

For this class, we’ll take a rather simplistic approach to synchronization.

The synchronize() Method

The synchronize() method of class PascalParserTD.
  Pass it an enumeration set of “good” token types.
  The method skips tokens until it finds one that is in the set.

    public Token synchronize(EnumSet syncSet)
        throws Exception
    {
        Token token = currentToken();

        if (!syncSet.contains(token.getType())) {
            // Flag the first bad token.
            errorHandler.flag(token, UNEXPECTED_TOKEN, this);

            // Recover by skipping tokens not in the synchronization set.
            do {
                token = nextToken();
            } while (!(token instanceof EofToken) &&
                     !syncSet.contains(token.getType()));
        }

        // Resume parsing at this token! (It is the first token after
        // the error that is in the synchronization set.)
        return token;
    }
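
The skip-until-synchronized logic can be tried out without the course framework. The sketch below is a minimal stand-in under stated assumptions: TokType and SyncDemo are hypothetical names (not the course's Token or PascalParserTD classes), tokens are reduced to bare type values, the token list is assumed to end with EOF, and flagging an error is reduced to incrementing a counter.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;

// Hypothetical, simplified token types for illustration only.
enum TokType { WHILE, DO, IDENTIFIER, INTEGER, SEMICOLON, ERROR, EOF }

class SyncDemo {
    private final List<TokType> tokens;  // assumed to end with EOF
    private int pos = 0;
    int errorCount = 0;                  // stands in for errorHandler.flag()

    SyncDemo(List<TokType> tokens) { this.tokens = tokens; }

    TokType currentToken() { return tokens.get(pos); }

    TokType nextToken() {
        if (pos < tokens.size() - 1) pos++;  // never advance past EOF
        return tokens.get(pos);
    }

    // Same shape as the slide's method: flag once, then skip.
    TokType synchronize(EnumSet<TokType> syncSet) {
        TokType token = currentToken();
        if (!syncSet.contains(token)) {
            errorCount++;                    // flag the first bad token
            do {
                token = nextToken();         // skip tokens not in the set
            } while (token != TokType.EOF && !syncSet.contains(token));
        }
        return token;
    }

    public static void main(String[] args) {
        SyncDemo parser = new SyncDemo(Arrays.asList(
            TokType.ERROR, TokType.INTEGER, TokType.DO,
            TokType.IDENTIFIER, TokType.EOF));
        // ERROR is flagged, INTEGER is skipped, parsing resumes at DO.
        System.out.println(
            parser.synchronize(EnumSet.of(TokType.DO, TokType.SEMICOLON)));  // prints "DO"
    }
}
```

Only the first bad token is flagged; the skipped tokens after it produce no further messages, which is what keeps one mistake from turning into a cascade of errors.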

WHILE Statement

Example:

    WHILE i > j DO k := i

Exit the loop when the test expression evaluates to false.

Class WhileStatementParser

From parent class StatementParser:

    // Synchronization set for starting a statement.
    protected static final EnumSet<PascalTokenType> STMT_START_SET =
        EnumSet.of(BEGIN, CASE, FOR, PascalTokenType.IF, REPEAT, WHILE,
                   IDENTIFIER, SEMICOLON);

    // Synchronization set for following a statement.
    protected static final EnumSet<PascalTokenType> STMT_FOLLOW_SET =
        EnumSet.of(SEMICOLON, END, ELSE, UNTIL, DOT);

In class WhileStatementParser:

    // Synchronization set for DO.
    private static final EnumSet<PascalTokenType> DO_SET =
        StatementParser.STMT_START_SET.clone();
    static {
        DO_SET.add(DO);
        DO_SET.addAll(StatementParser.STMT_FOLLOW_SET);
    }

DO_SET contains all the tokens that can start a statement or follow a statement, plus the DO token.
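
The clone-then-extend pattern used for DO_SET can be checked in isolation. This is a sketch, assuming a cut-down hypothetical enum PTok in place of PascalTokenType and a hypothetical holder class SyncSets; the set contents mirror the slide.

```java
import java.util.EnumSet;

// Hypothetical cut-down token type enum; THEN and OF are included only
// to show tokens that do not end up in DO_SET.
enum PTok {
    BEGIN, CASE, FOR, IF, REPEAT, WHILE, IDENTIFIER, SEMICOLON,
    END, ELSE, UNTIL, DOT, DO, THEN, OF
}

class SyncSets {
    // Tokens that can start a statement.
    static final EnumSet<PTok> STMT_START_SET =
        EnumSet.of(PTok.BEGIN, PTok.CASE, PTok.FOR, PTok.IF, PTok.REPEAT,
                   PTok.WHILE, PTok.IDENTIFIER, PTok.SEMICOLON);

    // Tokens that can follow a statement.
    static final EnumSet<PTok> STMT_FOLLOW_SET =
        EnumSet.of(PTok.SEMICOLON, PTok.END, PTok.ELSE, PTok.UNTIL, PTok.DOT);

    // clone() makes a copy, so adding to DO_SET leaves STMT_START_SET intact.
    static final EnumSet<PTok> DO_SET = STMT_START_SET.clone();
    static {
        DO_SET.add(PTok.DO);
        DO_SET.addAll(STMT_FOLLOW_SET);
    }

    public static void main(String[] args) {
        // 8 start tokens + 4 new follow tokens (SEMICOLON is shared) + DO = 13
        System.out.println(STMT_START_SET.size() + " " + DO_SET.size());  // prints "8 13"
    }
}
```

Without the clone(), DO_SET and STMT_START_SET would be the same object, and every statement parser that synchronizes on STMT_START_SET would silently start accepting DO, END, ELSE, UNTIL and DOT as well.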

Class WhileStatementParser (cont’d)

    public ICodeNode parse(Token token)
        throws Exception
    {
        // We are in this method because the parser has already seen WHILE.
        token = nextToken();  // consume the WHILE

        ICodeNode loopNode = ICodeFactory.createICodeNode(LOOP);
        ICodeNode testNode = ICodeFactory.createICodeNode(TEST);
        ICodeNode notNode = ICodeFactory.createICodeNode(ICodeNodeTypeImpl.NOT);

        loopNode.addChild(testNode);
        testNode.addChild(notNode);

        ExpressionParser expressionParser = new ExpressionParser(this);
        notNode.addChild(expressionParser.parse(token));

        // Synchronize the parser here! If the current token is not in
        // DO_SET, flag it and skip tokens until we find one that is.
        token = synchronize(DO_SET);
        if (token.getType() == DO) {
            token = nextToken();  // consume the DO
        }
        else {
            errorHandler.flag(token, MISSING_DO, this);
        }

        StatementParser statementParser = new StatementParser(this);
        loopNode.addChild(statementParser.parse(token));

        return loopNode;
    }
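
One way to see why the WHILE tree needs a NOT node: a LOOP exits when its TEST is true, but a WHILE loop exits when its condition is false. The sketch below is a hypothetical miniature executor, not the course's ICodeNode API or back end; LoopSketch, Test and Stmt are made-up names, and the loop bodies are hard-coded lambdas. The same runLoop() handles both statement kinds, which differ only in where the TEST child sits.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical miniature of LOOP and TEST nodes, for illustration only.
class LoopSketch {
    interface Node {}

    // TEST node: exit the enclosing loop when the condition is true.
    static class Test implements Node {
        final BooleanSupplier condition;
        Test(BooleanSupplier condition) { this.condition = condition; }
    }

    // Any other child: a statement to execute.
    static class Stmt implements Node {
        final Runnable action;
        Stmt(Runnable action) { this.action = action; }
    }

    // Execute the LOOP children in order, repeatedly,
    // until a TEST child evaluates to true.
    static void runLoop(List<Node> children) {
        while (true) {
            for (Node child : children) {
                if (child instanceof Test) {
                    if (((Test) child).condition.getAsBoolean()) return;
                } else {
                    ((Stmt) child).action.run();
                }
            }
        }
    }

    // REPEAT i := i + 1 UNTIL i >= 3   (TEST is the last child)
    static int repeatFrom(int start) {
        int[] i = {start};
        runLoop(Arrays.asList(new Stmt(() -> i[0]++),
                              new Test(() -> i[0] >= 3)));
        return i[0];
    }

    // WHILE i < 3 DO i := i + 1   (TEST is the first child and wraps NOT)
    static int whileFrom(int start) {
        int[] i = {start};
        runLoop(Arrays.asList(new Test(() -> !(i[0] < 3)),
                              new Stmt(() -> i[0]++)));
        return i[0];
    }

    public static void main(String[] args) {
        // REPEAT runs its body once even when the exit condition already holds.
        System.out.println(repeatFrom(5) + " " + whileFrom(5));  // prints "6 5"
    }
}
```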

Pascal Syntax Checker II: WHILE

We can recover (better) from syntax errors.

Demo.

    java -classpath classes Pascal compile -i while.txt
    java -classpath classes Pascal compile -i whileerrors.txt

FOR Statement

Example:

    FOR k := j TO 5 DO n := k

Initial assignment.

Node type GT for TO and LT for DOWNTO.

DO statement.

Increment/decrement: Node type ADD for TO and SUBTRACT for DOWNTO.
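
Read top to bottom, those four parts amount to a hand-expanded loop. The sketch below writes that expansion out in plain Java, with comments naming the tree part each line stands for. ForSketch and its methods are hypothetical names, and placing the test before the body is an assumption inferred from the GT and LT node types.

```java
import java.util.ArrayList;
import java.util.List;

// A hand-expanded FOR loop; comments name the corresponding tree parts.
class ForSketch {
    // FOR k := from TO to DO (record k)
    static List<Integer> forTo(int from, int to) {
        List<Integer> seen = new ArrayList<>();
        int k = from;               // initial assignment
        while (true) {              // LOOP
            if (k > to) break;      // TEST with GT (for TO)
            seen.add(k);            // DO statement
            k = k + 1;              // ADD (for TO)
        }
        return seen;
    }

    // FOR k := from DOWNTO to DO (record k)
    static List<Integer> forDownTo(int from, int to) {
        List<Integer> seen = new ArrayList<>();
        int k = from;               // initial assignment
        while (true) {              // LOOP
            if (k < to) break;      // TEST with LT (for DOWNTO)
            seen.add(k);            // DO statement
            k = k - 1;              // SUBTRACT (for DOWNTO)
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(forTo(3, 5) + " " + forDownTo(5, 3));  // prints "[3, 4, 5] [5, 4, 3]"
    }
}
```

Because the test comes first, a range such as 6 TO 5 executes the body zero times.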

Pascal Syntax Checker II: FOR

Demo.

    java -classpath classes Pascal compile -i for.txt
    java -classpath classes Pascal compile -i forerrors.txt

IF Statement

Example:

    IF (i = j) THEN
        t := 200
    ELSE
        f := -200;

Third child only if there is an ELSE.

The “Dangling” ELSE

Consider:

    IF i = 3 THEN IF j = 2 THEN t := 500 ELSE f := -500

Which THEN does the ELSE pair with? Is it

    IF i = 3 THEN
        IF j = 2 THEN t := 500
    ELSE f := -500

or

    IF i = 3 THEN
        IF j = 2 THEN t := 500
        ELSE f := -500

According to Pascal syntax, the second IF is the THEN statement of the first IF:

    IF i = 3 THEN
        IF j = 2 THEN t := 500 ELSE f := -500

Therefore, the ELSE pairs with the closest (i.e., the second) THEN.
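
A recursive descent parser gets this pairing without any special rule: the innermost active IF call is the first one to look for an ELSE. The sketch below is a toy under stated assumptions, not the course's IfStatementParser: DanglingElse is a hypothetical class, conditions and simple statements are single whitespace-separated tokens, there is no error handling, and the tree is shown as a parenthesized string.

```java
import java.util.Arrays;
import java.util.List;

// Toy parser for: statement ::= IF cond THEN statement [ELSE statement] | name
class DanglingElse {
    private final List<String> tokens;
    private int pos = 0;

    DanglingElse(String source) {
        tokens = Arrays.asList(source.trim().split("\\s+"));
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : ""; }
    private String next() { return tokens.get(pos++); }

    String parseStatement() {
        if (!peek().equals("IF")) {
            return next();                     // a one-token simple statement
        }
        next();                                // consume IF
        String cond = next();                  // one-token condition
        next();                                // consume THEN (unchecked)
        String thenPart = parseStatement();    // recursion: may be another IF
        StringBuilder tree =
            new StringBuilder("(IF " + cond + " " + thenPart);
        if (peek().equals("ELSE")) {           // the innermost call sees ELSE first
            next();                            // consume ELSE
            tree.append(" ").append(parseStatement());
        }
        return tree.append(")").toString();
    }

    static String parse(String source) {
        return new DanglingElse(source).parseStatement();
    }

    public static void main(String[] args) {
        // The ELSE branch s2 lands inside the inner IF node.
        System.out.println(parse("IF a THEN IF b THEN s1 ELSE s2"));
        // prints "(IF a (IF b s1 s2))"
    }
}
```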

Pascal Syntax Checker II: IF

Demo.

    java -classpath classes Pascal compile -i if.txt
    java -classpath classes Pascal compile -i iftest.txt

CASE Statement

Example:

    CASE i+1 OF
        1:       j := i;
        4:       j := 4*i;
        5, 2, 3: j := 523*i;
    END

CASE Statement (cont’d)

Example:

    CASE i+1 OF
        1:       j := i;
        4:       j := 4*i;
        5, 2, 3: j := 523*i;
    END

Pascal Syntax Checker II

Demo.

    java -classpath classes Pascal compile -i case.txt
    java -classpath classes Pascal compile -i caseerrors.txt

Top Down Recursive Descent Parsing

The term is very descriptive of how the parser works.

Start by parsing the topmost source language construct.
  For now it’s a statement.
  Later, it will be the program.

“Drill down” (descend) by parsing the sub-constructs.
  statement → assignment statement → expression → variable → etc.

Use recursion on the way down.
  statement → WHILE statement → statement → etc.

Top Down Recursive Descent Parsing (cont’d)

This is the technique for hand-coded parsers.
  Very easy to understand and write.
  The source language grammar is encoded in the structure of the parser code.
  Close correspondence between the parser code and the syntax diagrams.

Disadvantages
  Can be tedious coding.
  Ad hoc error handling.
  Big and slow!

Bottom-up parsers can be smaller and faster.
  Error handling can still be tricky.
  To be covered later in this course.

Syntax and Semantics

Syntax refers to the “grammar rules” of a source language.
  The rules prescribe the “proper form” of its programs.
  Rules can be described by syntax diagrams.
  Syntax checking: Does this sequence of tokens follow the syntax rules?

Semantics refers to the meaning of the token sequences according to the source language.
  Example: Certain sequences of tokens constitute IF statements according to the syntax rules.
  The semantics of a statement determine how the statement will be executed by the interpreter, or what code will be generated for it by the compiler.

Syntax and Semantics (cont’d)

Semantic actions by the front end parser.
  Building symbol tables.
  Type checking (which we’ll do later).
  Building proper parse trees.
    The parse trees encode type checking and operator precedence in their structures.

Semantic actions by the back end.
  Interpreter: The executor runs the program.
  Compiler: The code generator emits object code.
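
To illustrate how a parse tree can encode operator precedence in its structure, here is a small recursive descent sketch. It is not the course's ExpressionParser: PrecedenceSketch is a hypothetical class, operands are assumed to be single digits, only + and * are supported, there is no error handling, and a prefix-notation string stands in for the tree.

```java
// Grammar assumed for this sketch:
//   expression ::= term { '+' term }
//   term       ::= factor { '*' factor }
//   factor     ::= digit | '(' expression ')'
class PrecedenceSketch {
    private final String source;
    private int pos = 0;

    PrecedenceSketch(String source) {
        this.source = source.replace(" ", "");
    }

    // Lowest precedence: '+' nodes end up nearest the root.
    String expression() {
        String left = term();
        while (pos < source.length() && source.charAt(pos) == '+') {
            pos++;                                   // consume '+'
            left = "(+ " + left + " " + term() + ")";
        }
        return left;
    }

    // Higher precedence: '*' nodes sit deeper in the tree.
    String term() {
        String left = factor();
        while (pos < source.length() && source.charAt(pos) == '*') {
            pos++;                                   // consume '*'
            left = "(* " + left + " " + factor() + ")";
        }
        return left;
    }

    String factor() {
        char c = source.charAt(pos++);
        if (c == '(') {
            String inner = expression();             // recursion on the way down
            pos++;                                   // consume ')' (unchecked)
            return inner;
        }
        return String.valueOf(c);                    // a single-digit operand
    }

    static String parse(String source) {
        return new PrecedenceSketch(source).expression();
    }

    public static void main(String[] args) {
        System.out.println(parse("1 + 2 * 3"));      // prints "(+ 1 (* 2 3))"
    }
}
```

The multiplication ends up as a subtree of the addition, so an executor that evaluates children before their parent multiplies first without knowing anything about precedence.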
