Compiler Construction
Introduction and overview
Görel Hedin
Revised 2013-01-22
2013
Compiler Construction 2013 F01-1
Agenda
Course registration, structure, etc.
Course overview
Course registration
You need to
Confirm registration by signing the Registration Form
To unregister
Email me
Prerequisites
Object-oriented programming and Java
Algorithms and data structures (recursion, trees, lists, hash tables, ...)
Course information
Web page: http://cs.lth.se/eda180
will be updated during the course
Literature
Course material will be made available on the web site.
Lectures, seminars, labs, project, articles
Not handed out; print it yourself.
Textbook
A.W. Appel: Modern Compiler Implementation in Java, 2nd Edition, Cambridge University Press, 2002, ISBN: 0-521-82060-X.
Available as an e-book via http://www.lub.lu.se/en/search/lubsearch.html
Course structure
14 lectures
Tuesday 15-17, MA:020
Wednesday 10-12, E:A
5 seminars (give extra points on exam)
Thursday 10-12, MA:026
Start this week
6 computer assignments / lab sessions (mandatory)
Thursday 13-15 or Friday 10-12 (E:Hacke)
Sign up by Thursday Jan 24 (see course web)
Start next week
Must be completed before exam and course project
Written exam, March 13
Course project, VT2
People helping with the course
Lectures:
Görel Hedin
Emma Söderberg
Guest lectures:
Roger Henriksson (automatic memory management)
Jonas Skeppstedt (code optimization)
Seminars:
Emma Söderberg
Programming assignments and lab sessions:
Niklas Fors
Jesper Öqvist
Seminars
Active participation gives extra points at the exam
Before the seminar
Try to solve the problems at home.
Write down your solutions, bring them with you, and be prepared to present them at the seminar.
At the seminar
Mark the problems you are willing to present solutions for.
The seminar leader selects some students to present their solutions.
Discussion of the solutions.
At the written exam (Note! Only for exams this year)
Your markings will give you a maximum of 10% extra points at the written exam.
Programming assignments / Lab sessions
Work in pairs
Use the lecture break to form pairs!
Make the preparations before each lab session.
If you complete the assignment in advance, you must still attend the lab session to get it approved.
Examination
Exams take place
Wednesday, March 13, 8-13, Sparta:D
Friday, August 30, 8-13, Victoriastadion 1A (registration required 1 week in advance)
Prerequisites
Completed programming assignments
Project
Standard project
Design of a small procedural language
Implementation of a compiler from source text to Intel assembly code
Work in pairs
Deadlines
Intermediate deadlines given later
Final deadline for completed and approved project: May 3rd
Project outcome
source code -> compiler -> assembly code
Source code:
csum = a + b + 1;
Assembly code:
movl a, %eax
addl b, %eax
addl $1, %eax
movl %eax, csum
What happens after compilation?
source code -> compiler -> assembly code -> assembler -> object code
object code + library object code -> linker -> machine code -> loader -> machine memory
A closer look at the compiler
source code -> lexical analysis -> syntactic analysis -> semantic analysis -> intermediate code generation -> optimization -> machine code generation -> machine code
Intermediate representations
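source code -> lexical analysis -> tokens -> syntactic analysis -> AST -> semantic analysis -> attributed AST -> intermediate code generation -> intermediate code -> optimization -> intermediate code -> machine code generation -> machine code
Analysis: lexical analysis, syntactic analysis, semantic analysis
Synthesis: intermediate code generation, optimization, machine code generation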
Front and back end
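source code -> lexical analysis -> syntactic analysis -> semantic analysis -> intermediate code generation -> optimization -> machine code generation -> machine code
Front end: from source code to intermediate code
Back end: from intermediate code to machine code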
Intermediate code
FrontEndL -> intermediate code -> BackEndIntel
Several front and back ends
Front ends: FrontEndL, FrontEndC, FrontEndPL0
All front ends produce the same intermediate code
Back ends: BackEndIntel, BackEndMIPS, Interpreter
Why?
Implementing m front ends + n back ends takes less work than implementing m ∗ n compilers.
Many optimizations are best performed on intermediate code.
It may be easier to debug the front end using an interpreter than a target machine.
Compilation and Interpretation
A compiler translates a high-level program to low-level/machine code.
An interpreter executes a high- or low-level program by calling one procedure for each program construct.
An interpreter may use a JIT (“just in time”) compiler to compile all or parts of the program into machine code during execution.
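The difference can be sketched in a few lines of Java, assuming a hypothetical mini-language with only integer constants and addition. The class names (Expression, Constant, Sum) and the PUSH/ADD stack-machine instructions are invented for illustration and are not part of the course material. Each construct has one method that executes it directly (interpretation) and one that emits code for it (compilation).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-language: integer constants and addition only.
interface Expression {
    int interpret();                 // interpreter: execute this construct directly
    void compile(List<String> out);  // compiler: emit code for an assumed stack machine
}

class Constant implements Expression {
    final int value;
    Constant(int value) { this.value = value; }
    public int interpret() { return value; }
    public void compile(List<String> out) { out.add("PUSH " + value); }
}

class Sum implements Expression {
    final Expression left, right;
    Sum(Expression left, Expression right) { this.left = left; this.right = right; }
    public int interpret() { return left.interpret() + right.interpret(); }
    public void compile(List<String> out) {
        left.compile(out);   // code that leaves the left value on the stack
        right.compile(out);  // code that leaves the right value on the stack
        out.add("ADD");      // pops both values and pushes their sum
    }
}

class InterpretVsCompileDemo {
    static List<String> compile(Expression e) {
        List<String> code = new ArrayList<>();
        e.compile(code);
        return code;
    }

    public static void main(String[] args) {
        // (2 + 3) + 1
        Expression e = new Sum(new Sum(new Constant(2), new Constant(3)), new Constant(1));
        System.out.println(e.interpret());  // 6
        System.out.println(compile(e));     // [PUSH 2, PUSH 3, ADD, PUSH 1, ADD]
    }
}
```

The interpreter produces the result immediately, while the compiler produces a program that still has to be run by some machine.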
Program representations
source code -> lexical analysis -> tokens -> syntactic analysis -> AST -> semantic analysis -> attributed AST -> intermediate code generation -> intermediate code -> optimization -> intermediate code -> machine code generation -> machine code
Lexical analysis (scanning)
Source text        Tokens
while (k<=n) {     WHILE LPAR ID(k) LEQ ID(n) RPAR LBRA
sum=sum+k;         ID(sum) EQ ID(sum) PLUS ID(k) SEMI
k=k+1;             ID(k) EQ ID(k) PLUS INT(1) SEMI
}                  RBRA
A token is a symbolic name, sometimes with an attribute.
A lexeme is a string corresponding to a token.
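The token sequence above can be reproduced with a small hand-written scanner. This is only a sketch under stated assumptions: the class name TinyScanner is hypothetical, tokens are represented as plain strings in the notation used on the slide, and only the lexemes occurring in the example are recognized. In the course, scanners are generated from regular expressions rather than written by hand.

```java
import java.util.ArrayList;
import java.util.List;

class TinyScanner {
    // Returns token names in the slide's notation, e.g. "ID(k)" or "LEQ".
    static List<String> scan(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;  // whitespace separates lexemes but yields no token
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                String lexeme = src.substring(start, i);
                // Keywords look like identifiers; only "while" is needed here.
                tokens.add(lexeme.equals("while") ? "WHILE" : "ID(" + lexeme + ")");
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                tokens.add("INT(" + src.substring(start, i) + ")");
            } else if (src.startsWith("<=", i)) {
                tokens.add("LEQ");  // must be tried before the single '='
                i += 2;
            } else {
                switch (c) {
                    case '(': tokens.add("LPAR"); break;
                    case ')': tokens.add("RPAR"); break;
                    case '{': tokens.add("LBRA"); break;
                    case '}': tokens.add("RBRA"); break;
                    case '=': tokens.add("EQ"); break;
                    case '+': tokens.add("PLUS"); break;
                    case ';': tokens.add("SEMI"); break;
                    default:
                        throw new IllegalArgumentException(
                            "Unexpected character '" + c + "' at offset " + i);
                }
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String src = "while (k<=n) {\n  sum=sum+k;\n  k=k+1;\n}";
        System.out.println(String.join(" ", scan(src)));
    }
}
```

Here "sum" is a lexeme and ID(sum) is the corresponding token, with the lexeme kept as its attribute.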
Syntactic analysis (parsing)
Abstract Syntax Tree (AST)
used for program representation inside tools
very similar to the parse tree, but
contains only essential tokens
has a simpler, more natural structure
often represented by a typed object-oriented model
abstract classes (statements, expressions, ...)
concrete classes (while, if, add, subtract, ...)
Parse tree – spans all tokens
Abstract syntax tree – only essential structure and tokens
AST class hierarchy
Create class hierarchies for statements and expressions!
Invent names for suitable abstract classes!
Which methods are required to traverse the AST?
Draw the class hierarchy
Stmt (abstract), with subclasses:
WhileStmt: getExpr(), getStmt()
Assignment: getId(), getExpr()
CompoundStmt: getNrOfStmts(), getStmt(int)
Expr (abstract), with subclasses:
Add: getExpr1(), getExpr2()
LessEqual: getExpr1(), getExpr2()
Int: getINT()
Id: getID()
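One possible Java rendering of this hierarchy is sketched below. The class and method names come from the slide; the constructors, field names, and return types (for example, that getINT() and getID() return the lexeme as a String, and that getId() returns an Id node) are assumptions made for illustration. The hypothetical AstDemo class builds the AST for the while loop from the scanning example.

```java
import java.util.Arrays;
import java.util.List;

abstract class Stmt {}
abstract class Expr {}

class WhileStmt extends Stmt {
    private final Expr expr;
    private final Stmt stmt;
    WhileStmt(Expr expr, Stmt stmt) { this.expr = expr; this.stmt = stmt; }
    Expr getExpr() { return expr; }  // loop condition
    Stmt getStmt() { return stmt; }  // loop body
}

class Assignment extends Stmt {
    private final Id id;
    private final Expr expr;
    Assignment(Id id, Expr expr) { this.id = id; this.expr = expr; }
    Id getId() { return id; }        // assigned variable
    Expr getExpr() { return expr; }  // right-hand side
}

class CompoundStmt extends Stmt {
    private final List<Stmt> stmts;
    CompoundStmt(List<Stmt> stmts) { this.stmts = stmts; }
    int getNrOfStmts() { return stmts.size(); }
    Stmt getStmt(int i) { return stmts.get(i); }
}

class Add extends Expr {
    private final Expr expr1, expr2;
    Add(Expr expr1, Expr expr2) { this.expr1 = expr1; this.expr2 = expr2; }
    Expr getExpr1() { return expr1; }
    Expr getExpr2() { return expr2; }
}

class LessEqual extends Expr {
    private final Expr expr1, expr2;
    LessEqual(Expr expr1, Expr expr2) { this.expr1 = expr1; this.expr2 = expr2; }
    Expr getExpr1() { return expr1; }
    Expr getExpr2() { return expr2; }
}

class Int extends Expr {
    private final String lexeme;
    Int(String lexeme) { this.lexeme = lexeme; }
    String getINT() { return lexeme; }
}

class Id extends Expr {
    private final String lexeme;
    Id(String lexeme) { this.lexeme = lexeme; }
    String getID() { return lexeme; }
}

class AstDemo {
    // AST for: while (k<=n) { sum=sum+k; k=k+1; }
    static Stmt example() {
        return new WhileStmt(
            new LessEqual(new Id("k"), new Id("n")),
            new CompoundStmt(Arrays.<Stmt>asList(
                new Assignment(new Id("sum"), new Add(new Id("sum"), new Id("k"))),
                new Assignment(new Id("k"), new Add(new Id("k"), new Int("1"))))));
    }

    public static void main(String[] args) {
        WhileStmt loop = (WhileStmt) example();
        CompoundStmt body = (CompoundStmt) loop.getStmt();
        System.out.println(body.getNrOfStmts());  // 2
    }
}
```

Note that parentheses, braces, and semicolons do not appear in the tree; only the essential structure is kept.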
Semantic analysis
Analyze the AST, e.g.
Which variable corresponds to which declaration?
What is the type of an expression?
Are there compile time errors in the program?
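As a rough illustration of these questions, the sketch below computes the type of an expression and reports uses of undeclared variables. Everything in it is an assumption made for the example: the node classes (Node, Var, IntLit, Plus, Leq) are hypothetical, types are plain strings, and declarations are given as a map from variable name to type. The course itself uses attribute grammars for this kind of analysis.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

abstract class Node {
    // Returns "int", "boolean", or "unknown"; appends any error messages to errors.
    abstract String type(Map<String, String> decls, List<String> errors);
}

class Var extends Node {
    final String name;
    Var(String name) { this.name = name; }
    String type(Map<String, String> decls, List<String> errors) {
        String declared = decls.get(name);  // name analysis: look up the declaration
        if (declared == null) {
            errors.add("undeclared variable: " + name);
            return "unknown";
        }
        return declared;
    }
}

class IntLit extends Node {
    String type(Map<String, String> decls, List<String> errors) { return "int"; }
}

// Shared checking for binary operators that take two int operands.
abstract class IntBinary extends Node {
    final Node left, right;
    final String op, resultType;
    IntBinary(Node left, Node right, String op, String resultType) {
        this.left = left; this.right = right; this.op = op; this.resultType = resultType;
    }
    String type(Map<String, String> decls, List<String> errors) {
        String l = left.type(decls, errors);
        String r = right.type(decls, errors);
        if (l.equals("unknown") || r.equals("unknown")) return "unknown";  // error already reported
        if (!l.equals("int") || !r.equals("int")) {
            errors.add(op + " requires int operands");
            return "unknown";
        }
        return resultType;
    }
}

class Plus extends IntBinary {
    Plus(Node left, Node right) { super(left, right, "+", "int"); }
}

class Leq extends IntBinary {
    Leq(Node left, Node right) { super(left, right, "<=", "boolean"); }
}

class SemanticDemo {
    public static void main(String[] args) {
        Map<String, String> decls = new HashMap<>();
        decls.put("k", "int");
        decls.put("n", "int");
        List<String> errors = new ArrayList<>();
        System.out.println(new Leq(new Var("k"), new Var("n")).type(decls, errors));    // boolean
        System.out.println(new Plus(new Var("sum"), new Var("k")).type(decls, errors)); // unknown
        System.out.println(errors);  // [undeclared variable: sum]
    }
}
```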
Formalisms we will cover
Regular expressions for
defining tokens
automatic generation of scanners
Context-free grammars for
defining concrete syntax trees
automatic generation of parsers
Abstract grammars for
defining abstract syntax trees
automatic generation of Java classes
Attribute grammars for
defining properties of AST nodes
automatic evaluation of the attributes
Aspect modules for
defining fields, methods, and attributes in separate modules
automatic weaving into Java classes
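To give a feel for the first formalism, the sketch below defines a few tokens with regular expressions using the standard java.util.regex package. This is not how a scanner generator works internally; it only illustrates that each token kind can be specified declaratively by a pattern. The class name RegexTokens and the chosen token set are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTokens {
    // One named group per token kind. Order matters: WHILE before ID, LEQ before EQ.
    static final Pattern TOKEN = Pattern.compile(
        "(?<WHILE>while\\b)"
        + "|(?<ID>[A-Za-z][A-Za-z0-9]*)"
        + "|(?<INT>[0-9]+)"
        + "|(?<LEQ><=)"
        + "|(?<EQ>=)"
        + "|(?<PLUS>\\+)"
        + "|(?<WS>\\s+)");

    // Token kinds that are reported; whitespace (WS) is matched but skipped.
    static final String[] KINDS = {"WHILE", "ID", "INT", "LEQ", "EQ", "PLUS"};

    static List<String> scan(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            m.region(pos, src.length());
            if (!m.lookingAt()) {  // no token pattern matches at this position
                throw new IllegalArgumentException("No token at offset " + pos);
            }
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    boolean hasAttribute = kind.equals("ID") || kind.equals("INT");
                    out.add(hasAttribute ? kind + "(" + m.group(kind) + ")" : kind);
                }
            }
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scan("while k<=n"));  // [WHILE, ID(k), LEQ, ID(n)]
    }
}
```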
Compiler tools we will use
JavaCC (Sun/Open source)
Scanner and parser generator
JJTree (Sun/Open source)
adds AST building to JavaCC
implements the Visitor pattern for ASTs
JastAdd (LTH/Open source)
generates Java classes
supports static aspect-oriented programming
supports attribute grammars
as (GNU/Open source)
translates assembly code to machine code
Other tools we will use
Ant (Apache/Open source)
Software system builder
JUnit (Object Mentor/Open source)
testing framework
Gdb (GNU/Open source)
debugger
Synthesis
Runtime systems
How are variables accessed and procedures called?
How are objects and classes represented?
How is memory reused?
Intermediate code generation
Straightforward mapping from AST
Use unlimited number of registers (temporaries)
Optimization
Only brief overview (see EDA230 for detailed treatment)
Machine code generation
Instruction selection
Register allocation
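The idea of a straightforward mapping from the AST with an unlimited supply of temporaries can be sketched as follows. The three-address notation (t0 = a + b) and all class names (IrExpr, IrLeaf, IrAdd, IrGen) are assumptions chosen for this example, not the intermediate code used in the course. Each addition simply gets a fresh temporary; deciding which temporaries end up in real registers is left to register allocation.

```java
import java.util.ArrayList;
import java.util.List;

abstract class IrExpr {
    // Emits code for this expression and returns the name that holds its value.
    abstract String gen(IrGen g);
}

class IrLeaf extends IrExpr {
    final String text;  // a variable name or an integer constant
    IrLeaf(String text) { this.text = text; }
    String gen(IrGen g) { return text; }  // no code needed for a leaf
}

class IrAdd extends IrExpr {
    final IrExpr left, right;
    IrAdd(IrExpr left, IrExpr right) { this.left = left; this.right = right; }
    String gen(IrGen g) {
        String l = left.gen(g);
        String r = right.gen(g);
        String t = g.newTemp();  // temporaries are never reused
        g.emit(t + " = " + l + " + " + r);
        return t;
    }
}

class IrGen {
    private final List<String> code = new ArrayList<>();
    private int nextTemp = 0;

    String newTemp() { return "t" + (nextTemp++); }
    void emit(String instruction) { code.add(instruction); }

    // Generates intermediate code for the assignment "target = e".
    static List<String> assign(String target, IrExpr e) {
        IrGen g = new IrGen();
        String result = e.gen(g);
        g.emit(target + " = " + result);
        return g.code;
    }

    public static void main(String[] args) {
        // csum = a + b + 1, parsed as (a + b) + 1
        IrExpr e = new IrAdd(new IrAdd(new IrLeaf("a"), new IrLeaf("b")), new IrLeaf("1"));
        for (String line : assign("csum", e)) {
            System.out.println(line);
        }
        // t0 = a + b
        // t1 = t0 + 1
        // csum = t1
    }
}
```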
Paradigms
Imperative programming
Procedural
Object-oriented
Declarative programming
Functional
Logical
Constraint
Regular expressions
Context-free grammars
Attribute grammars
Hybrid languages
JastAdd
Scala
...
Applications of compiler construction
Traditional compilers from source to assembly
Source-to-source translators, preprocessors
Interpreters and virtual machines
Integrated programming environments
Analysis tools
Refactoring tools
Domain-specific languages
Related research at LTH
Extensible compiler tools (Görel Hedin)
Real-time garbage collection (Roger Henriksson)
Code optimization for multiprocessors (Jonas Skeppstedt)
Natural language processing (Pierre Nugues)
Constraint solvers (Krzysztof Kuchcinski)
Data-flow languages (Jörn Janneck)
Languages for pervasive systems (Boris Magnusson)
Languages for physical modeling (Johan Åkesson)
Course goals
After this course...
You should be able to use
regular expressions
context-free grammars
abstract grammars
You should be able to describe
attribute grammars
runtime systems and garbage collection
some code optimizations
You should be able to build a compiler, where you
use a parser generator
perform semantic analysis
do code generation
You should be able to program, using
static aspect-oriented programming
the visitor pattern
Readings
F1: Introduction
Appel, chapter 1-1.2
F2: Regular expressions
Appel, chapter 2
Appel, recommended exercises: 2.1 – 2.8
Try to solve the problems in Seminar 1
Review questions
Which are the major compilation phases?
What is the difference between the analysis and synthesis phases?
Why do we use intermediate code?
What is the advantage of separating the front and back ends?
What is
a lexeme,
a token,
a parse tree,
an abstract syntax tree,
intermediate code,
assembly code?