Introduction to Compilers Professor Yihjia Tsai 2006 Spring Tamkang University

Dec 19, 2015


Introduction to Compilers

Professor Yihjia Tsai

2006 Spring

Tamkang University

What is a compiler?

• Translates source code to target code
– Source code is typically a high-level programming language (Java, C++, etc.) but does not have to be
– Target code is often a low-level language like assembly or machine code but does not have to be

• Can you think of other compilers that you have used – according to this definition?

Before we begin

• A-Z, a-z, 0-9
• " double quote
• # hash
• $ dollar sign
• % percent
• & ampersand
• ' single quote
• ( left parenthesis
• ) right parenthesis
• * star
• + plus
• , comma
• - hyphen, minus
• / slash
• : colon
• ; semicolon
• < less than
• = equal

Symbols

• > greater than
• ? question mark
• @ at sign
• [ left (open) square bracket
• \ back slash
• ] right (close) square bracket
• ^ caret, power
• _ underscore
• ` back quote
• { open brace
• | or
• } close brace
• ~ tilde
• . period, dot
• • bullet

Greek symbols

α alpha, β beta, γ gamma, δ delta, ε epsilon, φ phi, ζ zeta, θ theta, ι iota, κ kappa, λ lambda

μ mu, ν nu, ξ xi, π pi, ρ rho, σ sigma, τ tau, χ chi, ψ psi, η eta, ω omega

Other Compilers

• Javadoc -> HTML
• XML -> HTML
• SQL Query output -> Table
• PostScript -> PDF
• High level description of a circuit -> machine instructions to fabricate circuit

The Compilation Process

The Analysis Stage

• Broken up into four phases
– Lexical Analysis (also called scanning or tokenization)
– Parsing
– Semantic Analysis
– Intermediate Code Generation

Lexing Example

double d1;
double d2;
d2 = d1 * 2.0;

Lexeme   Token             Description
double   TOK_DOUBLE        reserved word
d1       TOK_ID            variable name
;        TOK_PUNCT         has value of ";"
double   TOK_DOUBLE        reserved word
d2       TOK_ID            variable name
;        TOK_PUNCT         has value of ";"
d2       TOK_ID            variable name
=        TOK_OPER          has value of "="
d1       TOK_ID            variable name
*        TOK_OPER          has value of "*"
2.0      TOK_FLOAT_CONST   has value of 2.0
;        TOK_PUNCT         has value of ";"
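The lexing step above could be sketched with a small regular-expression scanner. This is only an illustrative sketch: the token names come from the slide, but the regular expressions and the `tokenize` helper are assumptions, not the course's own lexer.

```python
import re

# Token names follow the slide; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("TOK_FLOAT_CONST", r"\d+\.\d+"),
    ("TOK_DOUBLE", r"\bdouble\b"),
    ("TOK_ID", r"[A-Za-z_]\w*"),
    ("TOK_OPER", r"[=*+\-/]"),
    ("TOK_PUNCT", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token, lexeme) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for token, lexeme in tokenize("double d1;double d2;d2 = d1 * 2.0;"):
    print(lexeme, token)
```

The reserved word is listed before the identifier pattern so that `double` is not classified as a variable name.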

Syntax and Semantics

• Syntax - the form or structure of the expressions – whether an expression is well formed

• Semantics – the meaning of an expression

Syntactic Structure

• Syntax is almost always expressed using some variant of a notation called a context-free grammar (CFG), or simply a grammar
– BNF
– EBNF

A CFG has 4 parts

• A set of tokens (lexemes), known as terminal symbols

• A set of non-terminals

• A set of rules (productions), where each production consists of a left-hand side (LHS) and a right-hand side (RHS). The LHS is a non-terminal and the RHS is a sequence of terminals and/or non-terminal symbols.

• A special non-terminal symbol designated as the start symbol

An example of BNF syntax for real numbers

<r> ::= <ds> . <ds>
<ds> ::= <d> | <d> <ds>
<d> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

< > encloses non-terminal symbols
::= 'is' or 'is made up of' or 'derives' (sometimes denoted with an arrow ->)
| or
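One way to make the four parts of a CFG concrete is to write the real-number grammar as plain data and test strings against it. The sketch below assumes a naive backtracking matcher; the names `TERMINALS`, `NON_TERMINALS`, `PRODUCTIONS`, `START`, `match` and `is_real` are hypothetical and chosen only for illustration.

```python
# The four parts of the real-number grammar, written as plain Python data.
TERMINALS = set("0123456789.")
NON_TERMINALS = {"r", "ds", "d"}
PRODUCTIONS = {
    "r": [["ds", ".", "ds"]],
    "ds": [["d"], ["d", "ds"]],
    "d": [[digit] for digit in "0123456789"],
}
START = "r"

def match(symbols, text):
    """Return True if the sequence of grammar symbols derives exactly text."""
    if not symbols:
        return text == ""
    head, rest = symbols[0], symbols[1:]
    if head in TERMINALS:
        return text.startswith(head) and match(rest, text[1:])
    # Non-terminal: try each of its productions in turn (simple backtracking).
    return any(match(rhs + rest, text) for rhs in PRODUCTIONS[head])

def is_real(text):
    return match([START], text)

print(is_real("3.14"), is_real("3."), is_real("12"))
```

Backtracking like this is fine for a grammar this small, but it is not how production parsers are usually built.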

Example

• On the example from the previous slide:
– What are the tokens?
– What are the lexemes?
– What are the non-terminals?
– What are the productions?

Token vs. lexeme

• to·ken One that represents a group, as an employee whose presence is used to deflect from the employer criticism or accusations of discrimination.

• to·ken A basic, grammatically indivisible unit of a language such as a keyword, operator or identifier.

• lexeme A minimal unit (as a word or stem) in the lexicon of a language; `go' and `went' and `gone' and `going' are all members of the English lexeme `go'

• lexeme A minimal lexical unit of a language. Lexical analysis converts strings in a language into a list of lexemes. For a programming language these word-like pieces would include keywords, identifiers, literals and punctuations. The lexemes are then passed to the parser for syntactic analysis.

BNF Points

• A non-terminal can have more than one RHS, or an OR (|) can be used

• Lists or sequences are expressed via recursion

• A derivation is just a repeated set of production (rule) applications

• Examples

Example Grammar

<program> -> <stmts>
<stmts> -> <stmt> | <stmt> ; <stmts>
<stmt> -> <var> = <expr>
<var> -> a | b | c | d
<expr> -> <term> + <term> | <term> - <term>
<term> -> <var> | const

Example Derivation

<program> => <stmts>
          => <stmt>
          => <var> = <expr>
          => a = <expr>
          => a = <term> + <term>
          => a = <var> + <term>
          => a = b + <term>
          => a = b + const
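The derivation above always rewrites the leftmost non-terminal, so it can be replayed mechanically. The following sketch assumes the example grammar is stored as a dictionary; `GRAMMAR`, `leftmost_derivation` and the list of alternative indices are hypothetical names used only to illustrate the idea.

```python
# The example grammar: each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}

def leftmost_derivation(start, choices):
    """Expand the leftmost non-terminal once per entry in choices.

    Each entry is the index of the alternative to apply to that non-terminal.
    Returns every sentential form, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps

# Alternative indices that reproduce the derivation of "a = b + const".
for step in leftmost_derivation("<program>", [0, 0, 0, 0, 0, 0, 1, 1]):
    print("=>", step)
```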

Parse Trees

• Alternative representation for a derivation
• Example parse tree for the previous example

stmts
  stmt
    var
      a
    =
    expr
      term
        var
          b
      +
      term
        const

Another Example

Expression -> Expression + Expression
            | Expression - Expression
            | ...
            | Variable
            | Constant
            | ...
Variable -> T_IDENTIFIER
Constant -> T_INTCONSTANT | T_DOUBLECONSTANT

The Parse

Expression -> Expression + Expression
           -> Variable + Expression
           -> T_IDENTIFIER + Expression
           -> T_IDENTIFIER + Constant
           -> T_IDENTIFIER + T_INTCONSTANT

a + 2

Parse Trees

PS -> P | P PS

P -> | '(' PS ')' | '<' PS '>' | '[' PS ']'

What’s the parse tree for this statement? < [ ] [ < > ] >

EBNF - Extended BNF

• Like BNF except that
– Non-terminals start with uppercase
– Parens are used for grouping terminals
– Braces {} represent zero or more occurrences (iteration)
– Brackets [] represent an optional construct, that is, a construct that appears either once or not at all

EBNF example

Exp -> Term { ('+' | '-') Term }
Term -> Factor { ('*' | '/') Factor }
Factor -> '(' Exp ')' | variable | constant
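Each EBNF rule above maps naturally onto one function of a recursive-descent parser, with the braces becoming a `while` loop. The sketch below is one possible rendering under stated assumptions: the tuple-based tree shape, the tokenizing regular expression, and the names `parse`, `parse_exp`, `parse_term` and `parse_factor` are all hypothetical.

```python
import re

def parse_exp(tokens, pos=0):
    """Exp -> Term { ('+' | '-') Term }"""
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        node = (op, node, right)  # builds a left-associative tree
    return node, pos

def parse_term(tokens, pos):
    """Term -> Factor { ('*' | '/') Factor }"""
    node, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_factor(tokens, pos + 1)
        node = (op, node, right)
    return node, pos

def parse_factor(tokens, pos):
    """Factor -> '(' Exp ')' | variable | constant"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        node, pos = parse_exp(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return node, pos + 1
    if re.fullmatch(r"[A-Za-z_]\w*|\d+(\.\d+)?", token):
        return token, pos + 1
    raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    tokens = re.findall(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S", text)
    tree, pos = parse_exp(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse("a + 2 * (b - c)"))
```

Because Term sits below Exp in the grammar, multiplication and division bind tighter than addition and subtraction without any extra precedence table.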

EBNF/BNF

• EBNF and BNF are equivalent
• How can {} be expressed in BNF?
• How can ( ) be expressed?
• How can [ ] be expressed?

Semantic Analysis

• The syntactically correct parse tree (or derivation) is checked for semantic errors

• Check for constructs that, while syntactically valid, do not obey the semantic rules of the source language.

• Examples:
– Use of an undeclared/un-initialized variable
– Function called with improper arguments
– Incompatible operands and type mismatches

Examples

int i;
int j;
i = i + 2;

int arr[2], c;
c = arr * 10;

Most semantic analysis pertains to the checking of types.

void fun1(int i);
double d;
d = fun1(2.1);
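A check for the first kind of error (use of an undeclared or uninitialized variable, as in `i = i + 2;`) might look like the sketch below. It assumes the parser has already reduced the program to simple tuples; that statement shape and the `check` function are hypothetical, and type checking for the other two examples is not covered.

```python
def check(statements):
    """Return a list of semantic error messages for a list of statements.

    Statement shapes (hypothetical, for illustration only):
      ("decl", type_name, var_name)
      ("assign", target_name, [names read on the right-hand side])
    """
    declared = {}        # variable name -> declared type
    initialized = set()  # variables that have been assigned a value
    errors = []
    for statement in statements:
        if statement[0] == "decl":
            _, type_name, name = statement
            declared[name] = type_name
        elif statement[0] == "assign":
            _, target, reads = statement
            for name in reads:
                if name not in declared:
                    errors.append(f"use of undeclared variable {name}")
                elif name not in initialized:
                    errors.append(f"use of uninitialized variable {name}")
            if target not in declared:
                errors.append(f"assignment to undeclared variable {target}")
            else:
                initialized.add(target)
    return errors

# int i; int j; i = i + 2;
program = [("decl", "int", "i"), ("decl", "int", "j"), ("assign", "i", ["i"])]
print(check(program))
```

The `declared` dictionary plays the role of a very small symbol table; whether such a finding is reported as an error or only a warning depends on the source language.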

Intermediate Code Generation

• Where the intermediate representation of the source program is created.

• The representation can have a variety of forms, but a common one is called three-address code (TAC)

• Like assembly – the TAC is a sequence of simple instructions, each of which can have at most three operands.

Example

_t1 = b * c
_t2 = b * d
_t3 = _t1 + _t2
a = _t3

a = b * c + b * d

Note: temps
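The temporaries in this example can be produced by a post-order walk of the expression tree, creating one temporary per operator. The sketch below assumes the expression is given as nested tuples; `gen_tac` and that tree shape are hypothetical, not the course's actual code generator.

```python
import itertools

def gen_tac(target, expr):
    """Translate `target = expr` into a list of three-address instructions.

    expr is either a variable name (str) or a tuple (op, left, right).
    """
    code = []
    counter = itertools.count(1)

    def emit(node):
        if isinstance(node, str):
            return node  # a plain variable needs no temporary
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"_t{next(counter)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(expr)
    code.append(f"{target} = {result}")
    return code

# a = b * c + b * d
for line in gen_tac("a", ("+", ("*", "b", "c"), ("*", "b", "d"))):
    print(line)
```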

Another Example

_t1 = a > b
if _t1 goto L0
_t2 = a - c
a = _t2
L0: _t3 = b * c
c = _t3

if (a <= b)
    a = a - c;
c = b * c;

Note:
• Temps
• Symbolic addresses
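To see how the symbolic address L0 and the conditional jump behave, the TAC can be run by a tiny interpreter. This is a sketch only: `run_tac` and `OPS` are hypothetical names, every operand must be a variable already present in the environment, and the last temporary is written as `_t3` to match the other temporaries.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       ">": operator.gt, "<=": operator.le}

def run_tac(lines, env):
    """Execute a list of TAC lines, updating and returning the env dict."""
    # First pass: record each label's instruction index and strip the label.
    labels, program = {}, []
    for line in lines:
        if ":" in line:
            label, line = line.split(":", 1)
            labels[label.strip()] = len(program)
        program.append(line.split())
    pc = 0
    while pc < len(program):
        parts = program[pc]
        pc += 1
        if parts[0] == "if":      # if <var> goto <label>
            if env[parts[1]]:
                pc = labels[parts[3]]
        elif len(parts) == 3:     # <dst> = <src>
            env[parts[0]] = env[parts[2]]
        else:                     # <dst> = <left> <op> <right>
            env[parts[0]] = OPS[parts[3]](env[parts[2]], env[parts[4]])
    return env

TAC = [
    "_t1 = a > b",
    "if _t1 goto L0",
    "_t2 = a - c",
    "a = _t2",
    "L0: _t3 = b * c",
    "c = _t3",
]
result = run_tac(TAC, {"a": 1, "b": 2, "c": 3})
print(result["a"], result["c"])
```

The test `a > b` is the negation of the source condition `a <= b`, so the jump skips the assignment exactly when the `if` body should not run.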

Next Time

• Finish introduction to compilation stages

• Read Appel Chapters 1 and 2, if you have not already done so.

• What is a splay tree?

Selected References

• Appel, A., Modern Compiler Implementation In Java (2nd Ed), Cambridge University Press, 2002. ISBN 052182060X.

• Aho, A.V., R. Sethi, and J.D. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1988. ISBN 0-201-10088-6.

• Muchnick, S., Advanced Compiler Design and Implementation, Morgan Kaufmann, 1998. ISBN 1-55860-320-4.