40-414: Compiler Design (sharif.edu/~sani/courses/compiler/lecture01_introduction.pdf)

Dec 05, 2020


Compilers

• 40-414: Compiler Design

http://sharif.edu/~sani/courses/compiler/

• Computer Engineering Dept., Sharif University

• Instructor: GholamReza GHASSEM SANI


Compilers

• Lectures:

– Time: Sundays and Tuesdays, 16:30-18:00

– Location: https://vc.sharif.edu/ch/sani, or https://vclass.ecourse.sharif.edu/ch/sani

• Evaluation:

– 4 Written Assignments: 20%

– 4 Programming Assignments: 40%

– 2 Exams: 40%


Acknowledgement

• Most lecture notes are from a similar course (CS-143) taught by Professor Alex Aiken at Stanford University


Prof. Aiken 4

Text

• The Purple Dragon Book

• Aho, Lam, Sethi & Ullman

• Not required

– But a useful reference


The Course Project

• A big project

• … in 4 rather easy parts

• Start early!


Academic Honesty

• Don’t use work from uncited sources

– Including old code

• We use plagiarism detection software

– Many cases in past offerings


How are Languages Implemented?

• Two major strategies:

– Interpreters (older)

– Compilers (newer)

• Interpreters run programs “as is”

– Little or no preprocessing

• Compilers do extensive preprocessing


Compilers

Source Program → Compiler → Target Program (the compiler also reports Errors)

Input → Target Program → Output


Interpreters

Source Program + Input → Interpreter → Output

• Translates line by line

• Executes each translated line immediately

• Execution is slower because translation is repeated


A Hybrid Compiler

Source Program → Translator → Intermediate Program (the translator also reports Errors)

Intermediate Program + Input → Virtual Machine → Output
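
Python itself follows this hybrid model, so it can serve as a small illustration. In the sketch below, the built-in `compile` plays the translator and returns a code object (the intermediate program), and `eval` hands that object to the Python virtual machine together with the input. The expression, the variable name `x`, and the helper `run` are made up for this example.

```python
# Translator: source program -> intermediate program (a code object holding bytecode).
source_program = "x * 2 + 1"
intermediate = compile(source_program, "<demo>", "eval")

# Virtual machine: runs the same intermediate program on different inputs
# without translating the source again.
def run(intermediate_program, x):
    return eval(intermediate_program, {"x": x})

outputs = [run(intermediate, x) for x in (1, 20)]
print(outputs)  # [3, 41]

# Errors are reported by the translator, before anything is executed.
try:
    compile("x * )", "<demo>", "eval")
    translation_error = None
except SyntaxError as exc:
    translation_error = exc
```

The point of the split is that translation happens once, while the virtual machine can execute the intermediate program as many times as needed.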


Different Types of Compilers

Single Pass

Multiple Pass

Construction


History of Compilers

• 1954: IBM develops the 704

– Successor to the 701

• Problem

– Software costs exceeded hardware costs!

• All programming done in assembly


The Solution

• “Speedcoding”

• An early example of an interpreter

• Developed in 1953 by John Backus

• Much faster way of developing programs

• Programs were 10-20 times slower than hand-written assembly

• Needed 300 bytes = 30% of machine memory


FORTRAN I

• FORmula TRANslation project (John Backus)

• The FORTRAN I project ran from 1954 to 1957

• By 1958, over 50 percent of all programs were in FORTRAN


FORTRAN I

• The first compiler

– Huge impact on computer science

• Led to an enormous body of theoretical work

• Modern compilers preserve the outlines of FORTRAN I


The Structure of the FORTRAN I Compiler

1. Lexical Analysis

2. Parsing

3. Semantic Analysis

4. Optimization

5. Code Generation

The first 3, at least, can be understood by analogy to how humans comprehend English.


Lexical Analysis

• First step: recognize words.

– Smallest unit above letters

This is a sentence.


More Lexical Analysis

• Lexical analysis is not trivial. Consider:

ist his ase nte nce


And More Lexical Analysis

• Lexical analyzer divides program text into “words” or “tokens”

if x == y then z = 1; else z = 2;

• Units:

– Keywords { if, then, else }

– Identifiers { x, y, z }

– Numbers { 1, 2 }

– Operators { ==, = }

– Separators { blanks, ; }
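
The token classes listed above can be sketched with a small regular-expression scanner. This is only an illustration under assumptions: the class names, the `tokenize` helper, and the choice to drop blanks rather than emit them are not taken from the lecture.

```python
import re

KEYWORDS = {"if", "then", "else"}

# One named group per token class; "==" is listed before "=" so it wins.
TOKEN_RE = re.compile(r"""
    (?P<number>\d+)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<operator>==|=)
  | (?P<separator>;)
  | (?P<blank>\s+)
""", re.VERBOSE)

def tokenize(text):
    """Split text into (kind, lexeme) pairs, skipping blanks."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "blank":
            continue  # blanks separate tokens but are not passed on
        if kind == "word":
            kind = "keyword" if lexeme in KEYWORDS else "identifier"
        tokens.append((kind, lexeme))
    return tokens

tokens = tokenize("if x == y then z = 1; else z = 2;")
for kind, lexeme in tokens:
    print(kind, lexeme)
```

Keywords and identifiers share one pattern here and are told apart by a table lookup, which is a common simplification in hand-written scanners.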


Parsing

• Once words are understood, the next step is to understand sentence structure

• Parsing = Diagramming Sentences

– The diagram is a tree


Diagramming a Sentence

This line is a longer sentence

sentence
  subject: This (article), line (noun)
  verb: is
  object: a (article), longer (adjective), sentence (noun)


Parsing Programs

• Parsing program expressions is the same

• Consider:

if x == y then z = 1; else z = 2;

• Diagrammed:

if-then-else
  predicate: relation (x == y)
  then-stmt: assign (z = 1)
  else-stmt: assign (z = 2)
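
One way to build such a tree is a recursive-descent parser with one function per construct. The sketch below assumes a tiny grammar invented for this example (one relation, one assignment per branch) and takes the lexemes already separated by blanks; `parse_if` and the tuple shapes are illustrative names, not part of the lecture.

```python
def parse_if(lexemes):
    """Parse: if <operand> == <operand> then <assign> else <assign>."""
    pos = 0

    def expect(expected=None):
        # Consume the next lexeme, optionally checking what it is.
        nonlocal pos
        if pos >= len(lexemes):
            raise SyntaxError("unexpected end of input")
        lexeme = lexemes[pos]
        if expected is not None and lexeme != expected:
            raise SyntaxError(f"expected {expected!r}, found {lexeme!r}")
        pos += 1
        return lexeme

    def relation():
        left = expect()
        expect("==")
        right = expect()
        return ("relation", "==", left, right)

    def assign():
        target = expect()
        expect("=")
        value = expect()
        expect(";")
        return ("assign", target, value)

    expect("if")
    predicate = relation()
    expect("then")
    then_stmt = assign()
    expect("else")
    else_stmt = assign()
    if pos != len(lexemes):
        raise SyntaxError(f"unexpected {lexemes[pos]!r} after statement")
    return ("if-then-else", predicate, then_stmt, else_stmt)

tree = parse_if("if x == y then z = 1 ; else z = 2 ;".split())
print(tree)
```

The returned tuple mirrors the diagram: the root is the if-then-else node and its three children are the predicate, the then-statement, and the else-statement.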


Semantic Analysis

• Once sentence structure is understood, we can try to understand “meaning”

– But meaning is too hard for compilers

• Compilers perform limited semantic analysis to catch inconsistencies


Semantic Analysis in English

• Example:

Jack said Jerry left his assignment at home.

What does “his” refer to? Jack or Jerry?

• Even worse:

Jack said Jack left his assignment at home?

How many Jacks are there? (1, 2, or 3)

Which one left the assignment?


Semantic Analysis in Programming

• Programming languages define strict rules to avoid such ambiguities

• This C++ code prints “4”; the inner definition is used

{
    int Jack = 3;
    {
        int Jack = 4;
        cout << Jack;
    }
}
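
A compiler can resolve such bindings with a stack of scopes that is searched from the innermost scope outward. The class below is a minimal sketch of that idea; `ScopedSymbols` and its method names are hypothetical, and a real symbol table would store types and addresses rather than values.

```python
class ScopedSymbols:
    """A stack of scopes; lookups search from the innermost scope outward."""

    def __init__(self):
        self.scopes = [{}]

    def enter_scope(self):       # corresponds to "{"
        self.scopes.append({})

    def exit_scope(self):        # corresponds to "}"
        self.scopes.pop()

    def declare(self, name, value):
        self.scopes[-1][name] = value

    def lookup(self, name):
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name} is not declared")

symbols = ScopedSymbols()
symbols.enter_scope()
symbols.declare("Jack", 3)
symbols.enter_scope()
symbols.declare("Jack", 4)
inner = symbols.lookup("Jack")   # the inner definition hides the outer one
symbols.exit_scope()
outer = symbols.lookup("Jack")   # the outer definition is visible again
symbols.exit_scope()
print(inner, outer)  # 4 3
```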


More Semantic Analysis

• Compilers perform many semantic checks besides variable bindings

• Example:

Jack left her homework at home.

• A “type mismatch” between her and Jack; we know they are different people

– Presumably Jack is male

Optimization

• No strong counterpart in English

– But a little bit like editing

• Automatically modify programs so that they

– Run faster

– Use less memory

• Your project has no optimization component :D


Optimization Example

X = Y * 0 is the same as X = 0

NOT ALWAYS CORRECT: if Y is NaN, then NaN * 0 = NaN, not 0
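
The NaN case is easy to check with floating-point arithmetic. In this small sketch, `naive_fold` is a hypothetical stand-in for the rewritten code that returns 0 without looking at its operand.

```python
import math

def naive_fold(y):
    """The 'optimized' form: replaces y * 0 with 0 without looking at y."""
    return 0.0

# For ordinary values the rewrite is harmless (note that -0.0 == 0.0).
for y in (5.0, -2.5):
    assert y * 0 == naive_fold(y)

nan = float("nan")
product = nan * 0                       # NaN propagates through multiplication
print(math.isnan(product))              # True
print(math.isnan(naive_fold(nan)))      # False: the rewrite changed the result
```

So an optimizer may apply this rewrite only when it knows the operand cannot be NaN, for example when Y is an integer.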


Code Generation

• Produces assembly code (usually)

• A translation into another language

– Analogous to human translation


Compilers Today

• The overall structure of almost every compiler adheres to our outline

• The proportions have changed since FORTRAN

– Early: lexing and parsing were the most complex and expensive phases

– Today: optimization dominates all other phases; lexing and parsing are cheap

[Figure: two bars comparing the relative size of the phases L, P, S, O, C, early vs. today]


Compiler Front-end and Back-end

Source Program

1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
7. Peephole Optimization

Target Program

Supporting components: Symbol-table Manager, Error Handler

Phase groupings shown in the figure: Analyses, Syntheses

1, 2, 3, 4, 5 : Front-End

6, 7 : Back-End


Front-End

• Front end maps source code into an intermediate representation (IR)

• Back end maps IR onto machine code

• Simplifies retargeting

Source code → Front end → IR → Back end → Machine code, with errors reported separately


Front-End (Cont.)

Scanner:

• Maps characters into tokens – the basic unit of syntax

o x = x + y becomes <id, x> <=, > <id, x> <+, > <id, y>

• Eliminates white space (tabs, blanks) and comments

Source code → Scanner → tokens → Parser → Parse Tree, with errors reported separately


Front-End (Cont.)

Parser:

• Recognize context-free syntax

• Guide context-sensitive analysis

• Produce meaningful error messages

Source code → Scanner → tokens → Parser → Parse Tree, with errors reported separately


Back-End

Back-End:

• Translate IR into machine code

• Choose instructions for each IR operation

• Decide what to keep in registers at each point

IR → Instruction selection → Register Allocation → Machine code, with errors reported separately


Two Main Components of Back-End

Code Generator:

• Produce compact, fast code

• Use available addressing modes

IR → Code Generation → Peephole Optimization → Machine code, with errors reported separately


Back-End (Cont.)

Peephole Optimization:

• Limited resources

• Optimal allocation is difficult

IR → Code Generation → Peephole Optimization → Machine code, with errors reported separately


Phase 1. Lexical Analysis

Easiest analysis: identify tokens, which are the basic building blocks

For example:

position := initial + rate * 60 ;

All are tokens: position, :=, initial, +, rate, *, 60, ;

Blanks, line breaks, etc. are scanned out


Phase 2. Syntax Analysis or Parsing

Parse Tree:

assignment statement
  identifier: position
  :=
  expression
    expression
      identifier: initial
    +
    expression
      expression
        identifier: rate
      *
      expression
        number: 60

Nodes of the tree are constructed using a grammar for the source language

Phase 3. Semantic Analysis

• Finds semantic errors

• One of the most important activities in this phase:

• Type checking: legality of operands

Syntax Tree:

:=
  position
  +
    initial
    *
      rate
      60

Conversion Action:

:=
  position
  +
    initial
    *
      rate
      inttoreal
        60
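
The conversion action can be sketched as a bottom-up walk that computes the type of each node and wraps an integer operand in `inttoreal` when the other side is real. Everything below is an assumption made for illustration: the tuple encoding of the tree, the `TYPES` table, and the rule that only int-to-real conversions are inserted automatically.

```python
# Declared types of the identifiers (hypothetical symbol-table contents).
TYPES = {"position": "real", "initial": "real", "rate": "real"}

def check(node):
    """Return (typed_node, type), inserting inttoreal where an int meets a real."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "real"
    if isinstance(node, str):
        if node not in TYPES:
            raise TypeError(f"undeclared identifier {node!r}")
        return node, TYPES[node]
    op, left, right = node
    left, left_type = check(left)
    right, right_type = check(right)
    if op == ":=":
        if left_type != right_type:
            # Assumed policy: only widening the right-hand side is allowed.
            if left_type == "real":
                right = ("inttoreal", right)
            else:
                raise TypeError("cannot assign a real value to an int target")
        return (op, left, right), left_type
    # Arithmetic: if the operand types differ, convert the int side to real.
    if left_type != right_type:
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    return (op, left, right), left_type

syntax_tree = (":=", "position", ("+", "initial", ("*", "rate", 60)))
typed_tree, _ = check(syntax_tree)
print(typed_tree)
```

For the running example, the only change is that the leaf 60 becomes `("inttoreal", 60)`, matching the second tree on the slide.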


Supporting Phases

• Symbol table creation / maintenance

– Contains info (address, type, scope, args) on certain Tokens, typically identifiers

– Data structure created/initialized during lexical analysis; and updated during later analysis & synthesis

• Error handling

– Detection of different errors which correspond to all phases; and deciding what happens when an error is found


An example of the Entire Process

position = initial + rate * 60

↓ lexical analyzer (Scanner)

<id, 1> < = > <id, 2> < + > <id, 3> < * > <num, 60>

↓ syntax analyzer (Parser)

:=
  <id, 1>
  +
    <id, 2>
    *
      <id, 3>
      <num, 60>

↓ semantic analyzer

:=
  <id, 1>
  +
    <id, 2>
    *
      <id, 3>
      inttoreal
        <num, 60>

↓ intermediate code generator

Symbol Table

1 position real …
2 initial real …
3 rate real …

Error Handler
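
The scanner step of this example, where identifiers are replaced by symbol-table indices, might look roughly like the sketch below. The `scan` helper and the tuple encoding are assumptions; the scanner here records only names, since the types shown in the symbol table would be filled in from declarations that this example does not show.

```python
import re

def scan(text):
    """Return (tokens, symbol_table); identifiers become ("id", index) pairs."""
    symbol_table = []          # the name with index i is stored at position i - 1
    tokens = []
    for lexeme in re.findall(r"[A-Za-z_]\w*|\d+|\S", text):
        if lexeme[0].isalpha() or lexeme[0] == "_":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif lexeme.isdigit():
            tokens.append(("num", int(lexeme)))
        else:
            tokens.append((lexeme,))   # operators carry no attribute
    return tokens, symbol_table

tokens, symbol_table = scan("position = initial + rate * 60")
print(tokens)
print(symbol_table)
```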


An example of the Entire Process

position = initial + rate * 60

↓ intermediate code generator (3 address codes)

t1 := inttoreal(60)
t2 := id3 * t1
t3 := id2 + t2
id1 := t3

↓ code optimizer

t1 := id3 * 60.0
id1 := id2 + t1

↓ final code generator

LD R1, id3
MUL R1, R1, #60.0
LD R2, id2
ADD R1, R1, R2
ST id1, R1

Symbol Table

1 position real …
2 initial real …
3 rate real …

Error Handler
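
The intermediate code generator step can be sketched as a post-order walk over the typed tree that introduces a fresh temporary for every operation. The tuple encoding of the tree and the `gen_tac` helper are assumptions made for this example, and the optimizer and final code generator are not covered here.

```python
import itertools

def gen_tac(tree):
    """Emit three-address code for a tree of nested tuples; return the lines."""
    code = []
    temps = (f"t{n}" for n in itertools.count(1))

    def emit(node):
        # Leaves (identifiers, constants) need no code; return their name.
        if not isinstance(node, tuple):
            return str(node)
        if node[0] == "inttoreal":
            operand = emit(node[1])
            temp = next(temps)
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        if op == ":=":
            code.append(f"{left_name} := {right_name}")
            return left_name
        temp = next(temps)
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    emit(tree)
    return code

tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", 60))))
tac = gen_tac(tree)
print("\n".join(tac))
```

For this tree the sketch produces the same four instructions as the unoptimized three-address code in the example.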