CST320 - Lec 1
1
Why study compilers?
Ties lots of things you know together:
– Theory (finite automata, grammars)
– Data structures
– Modularization
– Utilization of software tools
You might build a parser.
The theory of computation/formal language still applies today.
– As long as we still program with 1-D text.
Helps you to be a better programmer.
2
One-dimensional Text
int x;
cin >> x;
if(x>5)
    cout << "Hello";
else
    cout << "BOO";
int x;cin >> x;if(x>5) cout << "Hello"; else …
The formatting has no impact on the meaning of the program.
3
What is a translator?
Takes input (SOURCE) and produces output (TARGET)
SOURCE TARGET
ERROR
4
Types of Target Code:
“Pure” machine code
» No operating system required.
» No library routines.
» Good for developing software for new hardware.
“Augmented” code
» More common
» Executable code relies on o/s provided support and library routines loaded as program is prepared to execute.
5
Conventional Translator
skeletal source program
→ preprocessor →
source program
→ compiler →
target assembly program
→ assembler →
relocatable machine code
→ loader / linker (with library, relocatable object files) →
absolute machine code
6
Types of Target Code (cont.)
Virtual code
» Code consists entirely of “virtual” instructions.
» Used by “Re-Targetable” compilers
» Transporting to a new platform only requires implementing a virtual machine on the new hardware.
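To make the virtual-code idea concrete, here is a minimal sketch of a virtual machine. The instruction set (`Push`, `Add`, `Mul`) and the names `Instr` and `run` are hypothetical, invented for this illustration; they are not from the lecture or any real VM.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical "virtual" instruction set for a tiny stack machine.
enum class Op { Push, Add, Mul };

struct Instr {
    Op op;
    int arg;   // only used by Push
};

// Executes virtual code and returns the value left on top of the stack.
int run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    for (const Instr& in : code) {
        if (in.op == Op::Push) {
            stack.push_back(in.arg);
            continue;
        }
        // Add and Mul both pop two operands and push one result.
        if (stack.size() < 2) throw std::runtime_error("stack underflow");
        int b = stack.back(); stack.pop_back();
        int a = stack.back(); stack.pop_back();
        stack.push_back(in.op == Op::Add ? a + b : a * b);
    }
    if (stack.empty()) throw std::runtime_error("empty stack");
    return stack.back();
}
```

Under these assumptions, a compiler that emits `Instr` sequences never changes when the hardware changes; only `run` has to be rebuilt for the new platform.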
Types of Translators
» Imperative, ALGOL-like languages
» Other paradigms
Interpreters
Macro processors
Text formatters
Silicon compilers
9
Types of Translators (cont.)
Visual programming language
Interface
– Database
– User interface
– Operating System
11
Structure of Compilers
Source Program
→ Lexical Analyzer (scanner) → Tokens
→ Syntax Analysis (Parser) → Syntactic Structure
→ Semantic Analysis → Intermediate Representation
→ Optimizer
→ Code Generator → Target machine code
Symbol Table
12
Structure of Compilers
Source Program → Lexical Analyzer (scanner) → Tokens
int x;
cin >> x;
if(x>5)
    cout << "Hello";
else
    cout << "BOO";
int x ;
cin >> x ;
if ( x > 5 )
cout << "Hello" ;
else
cout << "BOO" ;
What about white spaces? Do they matter?
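One way to explore the whitespace question is a small scanner sketch. The function `tokenize` below is a hypothetical helper written for this note, not the course's scanner, and it handles only the characters that appear in the sample program.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: split source text into token strings.
// Whitespace only separates tokens; it never becomes a token itself.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {                        // skip blanks, tabs, newlines
            ++i;
        } else if (std::isalnum(c) || c == '_') {     // identifier, keyword, or number
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            tokens.push_back(src.substr(start, i - start));
        } else if (c == '"') {                        // string literal kept as one token
            std::size_t start = i++;
            while (i < src.size() && src[i] != '"') ++i;
            if (i < src.size()) ++i;                  // consume the closing quote
            tokens.push_back(src.substr(start, i - start));
        } else if (src.compare(i, 2, ">>") == 0 || src.compare(i, 2, "<<") == 0) {
            tokens.push_back(src.substr(i, 2));       // two-character operators
            i += 2;
        } else {                                      // any other single symbol
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}
```

With this sketch, the multi-line layout and a one-line layout of the sample program give the same token list. Whitespace still matters as a separator in a few places: `else cout` yields two tokens, while `elsecout` would scan as a single identifier.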
13
Tokenize First or as needed?
int x;
cin >> x;
if(x>5)
    cout << "Hello";
else
    cout << "BOO";
Value   Type
int     datatype
x       ID
;       symbol
cin
>>
Tokens = Meaningful units in a program
Value/Type pairs
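A value/type pair can be sketched as a small struct. The categories `Datatype`, `ID`, and `Symbol` follow the slide; `Keyword` and `Number`, the keyword lists, and the name `classify` are assumptions added to make the example complete.

```cpp
#include <cctype>
#include <string>

// Token categories: Datatype, ID, Symbol come from the slide;
// Keyword and Number are assumptions for this sketch.
enum class TokenType { Datatype, Keyword, ID, Number, Symbol };

struct Token {
    TokenType type;      // what kind of unit this is
    std::string value;   // the characters that were matched
};

// Hypothetical classifier: decides the type of an already-isolated lexeme.
Token classify(const std::string& lexeme) {
    if (lexeme == "int" || lexeme == "char" || lexeme == "double")
        return {TokenType::Datatype, lexeme};
    if (lexeme == "if" || lexeme == "else")
        return {TokenType::Keyword, lexeme};
    unsigned char first =
        lexeme.empty() ? '\0' : static_cast<unsigned char>(lexeme[0]);
    if (std::isdigit(first)) return {TokenType::Number, lexeme};
    if (std::isalpha(first) || first == '_') return {TokenType::ID, lexeme};
    return {TokenType::Symbol, lexeme};
}
```

Under these assumptions `cin` is classified as an ID and `>>` as a symbol; a real scanner may well choose finer categories.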
14
Tokenize First or as needed?
Array<Array<int>> someArray;
Array < int
>
Array<Array<int> > someArray;
Array < int >
>>
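The `>>` problem can be reproduced with a greedy ("longest match") scanner. The sketch below is a hypothetical `scan` function covering only names, `<`, `>`, `>>`, and `;`, which is enough for the two declarations above.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Greedy scanner sketch: always takes the longest token it can,
// so two adjacent '>' characters become one ">>" token.
std::vector<std::string> scan(const std::string& src) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c)) {                 // a name such as Array or int
            std::size_t start = i;
            while (i < src.size() &&
                   std::isalnum(static_cast<unsigned char>(src[i])))
                ++i;
            out.push_back(src.substr(start, i - start));
        } else if (src.compare(i, 2, ">>") == 0) {    // longest match wins
            out.push_back(">>");
            i += 2;
        } else {                                      // '<', '>', ';', ...
            out.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return out;
}
```

If all tokens are produced up front this way, the parser sees a shift operator where it expects two closing angle brackets. Tokenizing as needed lets the parser tell the scanner which reading it wants. C++ itself required the `> >` spelling for nested templates until C++11 changed the rule.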
15
Structure of Compilers
Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure (Parse Tree)
16
Parse Tree (Parser)
int x ; cin >>
datatype ID
DataDeclaration
Program
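The tree on this slide can be built by a very small parser. This is a sketch under an assumed grammar (`Program` is a list of `DataDeclaration`, and `DataDeclaration` is `int ID ;`); the `Node` type and function names are hypothetical.

```cpp
#include <cctype>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string label;                            // "Program", "datatype", "ID", ...
    std::string text;                             // lexeme for leaves, empty otherwise
    std::vector<std::unique_ptr<Node>> children;
    Node(std::string l, std::string t) : label(std::move(l)), text(std::move(t)) {}
};

std::unique_ptr<Node> makeNode(const std::string& label, const std::string& text = "") {
    return std::unique_ptr<Node>(new Node(label, text));
}

// Assumed rule: DataDeclaration -> "int" ID ";"
std::unique_ptr<Node> parseDataDeclaration(const std::vector<std::string>& toks,
                                           std::size_t& pos) {
    bool ok = pos + 3 <= toks.size()
           && toks[pos] == "int"
           && !toks[pos + 1].empty()
           && std::isalpha(static_cast<unsigned char>(toks[pos + 1][0]))
           && toks[pos + 2] == ";";
    if (!ok) throw std::runtime_error("syntax error in data declaration");
    std::unique_ptr<Node> decl = makeNode("DataDeclaration");
    decl->children.push_back(makeNode("datatype", toks[pos]));
    decl->children.push_back(makeNode("ID", toks[pos + 1]));
    decl->children.push_back(makeNode("symbol", toks[pos + 2]));
    pos += 3;
    return decl;
}

// Assumed rule: Program -> DataDeclaration*
std::unique_ptr<Node> parseProgram(const std::vector<std::string>& toks) {
    std::unique_ptr<Node> program = makeNode("Program");
    std::size_t pos = 0;
    while (pos < toks.size())
        program->children.push_back(parseDataDeclaration(toks, pos));
    return program;
}
```

Calling `parseProgram` on the tokens `int`, `x`, `;` gives a `Program` node with one `DataDeclaration` child whose leaves are the datatype, the ID, and the semicolon. Statements such as `cin >> x;` are outside this sketch's grammar and are rejected.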
17
Who is responsible for errors?
int x$y;
int 32xy;
45b
45ab
x = x @ y;
Lexical Errors / Token Errors?
18
Who is responsible for errors?
X = ;
Y = x +;
Z = [;
Syntax errors
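In these three statements every token is legal on its own; only the arrangement is wrong, so the parser is the phase that notices. A minimal sketch, assuming the grammar `assignment -> ID = operand (+ operand)* ;` (the grammar and the name `isValidAssignment` are invented for this example):

```cpp
#include <cctype>
#include <string>
#include <vector>

static bool isID(const std::string& t) {
    return !t.empty() && std::isalpha(static_cast<unsigned char>(t[0]));
}

static bool isOperand(const std::string& t) {   // an ID or a number
    return !t.empty() && std::isalnum(static_cast<unsigned char>(t[0]));
}

// Assumed grammar: assignment -> ID "=" operand ("+" operand)* ";"
bool isValidAssignment(const std::vector<std::string>& t) {
    std::size_t i = 0;
    // Returns the token at position k, or "" when past the end.
    auto at = [&](std::size_t k) { return k < t.size() ? t[k] : std::string(); };
    if (!isID(at(i++))) return false;           // left-hand side
    if (at(i++) != "=") return false;
    if (!isOperand(at(i++))) return false;      // first operand
    while (at(i) == "+") {                      // zero or more "+ operand"
        ++i;
        if (!isOperand(at(i++))) return false;
    }
    return at(i) == ";" && i + 1 == t.size();   // must end exactly at ";"
}
```

All three statements from the slide are rejected by this check, while something like `Y = x + 5 ;` is accepted.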
19
Who is responsible for errors?
45ab
– One wrong token?
– Two tokens (45 & ab)? Are whitespaces needed?
Either way is okay.
– Lexical analyzer can catch the illegal token (45ab)
– Parser can catch the syntax error. Most likely 45 followed by ab will not be syntactically correct.
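Both choices for `45ab` can be sketched in one function. `lexWord` and its `strict` flag are hypothetical names; the point is only that the decision lives in the scanner's design.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// strict == true : digits followed directly by other characters are one illegal token.
// strict == false: the number ends at the first non-digit, giving two tokens,
//                  and any complaint is left to the parser.
std::vector<std::string> lexWord(const std::string& word, bool strict) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < word.size() && std::isdigit(static_cast<unsigned char>(word[i]))) ++i;
    if (i == 0 || i == word.size()) {   // no leading digits, or digits only
        out.push_back(word);
        return out;
    }
    if (strict) throw std::runtime_error("illegal token: " + word);
    out.push_back(word.substr(0, i));   // the number part, e.g. "45"
    out.push_back(word.substr(i));      // the rest, e.g. "ab"
    return out;
}
```

In strict mode the error is reported by the lexical analyzer; in the other mode the scanner hands `45` and `ab` to the parser, which will most likely reject the sequence.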
20
Structure of Compilers
Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis
Symbol Table
int x;
cin >> x;
if(x>5)
    x = "SHERRY";
else
    cout << "BOO";
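The statement `x = "SHERRY";` is fine lexically and syntactically; it is the symbol table that reveals the problem, because `x` was declared as `int`. A minimal sketch of that check, with a hypothetical `SymbolTable` alias and `checkAssignment` function:

```cpp
#include <cctype>
#include <map>
#include <string>

enum class Type { Int, String, Unknown };

// Hypothetical symbol table: variable name -> declared type.
using SymbolTable = std::map<std::string, Type>;

// Type of a literal as it appears in the source text.
Type literalType(const std::string& lit) {
    if (lit.empty()) return Type::Unknown;
    if (lit.front() == '"') return Type::String;
    if (std::isdigit(static_cast<unsigned char>(lit.front()))) return Type::Int;
    return Type::Unknown;
}

// Returns an empty string when "name = literal;" type-checks,
// otherwise a short error message.
std::string checkAssignment(const SymbolTable& table,
                            const std::string& name,
                            const std::string& literal) {
    auto it = table.find(name);
    if (it == table.end()) return "undeclared variable: " + name;
    if (it->second != literalType(literal))
        return "type mismatch in assignment to " + name;
    return "";
}
```

With `x` recorded as `Type::Int`, assigning `5` passes and assigning `"SHERRY"` reports a type mismatch.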
23
Translation Steps:
– Recognize when input is available.
– Break input into individual components. (lexical analysis)
– Merge individual pieces into meaningful structures. (parsing)
– Process structures. (semantic analysis)
– Produce output. (code generation)
25
Compilers
Two major tasks:
– Analysis of source
– Synthesis of target
Syntax-directed translation
– Compilation process driven by syntactic structure of the source being translated
26
Interpreters
Executes source program without explicitly translating to target code.
Control and memory management reside in interpreter, not user program.
Allow:
– Modification of program as it executes.
– Dynamic typing of variables
– Portability
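A tiny interpreter sketch shows two of these points: statements are executed directly from source text, and a variable's type is whatever it last received. The `Interpreter` class, the `Value` struct, and the one-statement language (`name = 123` or `name = "text"`) are all assumptions made for this illustration.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// A dynamically typed value: either a number or a piece of text.
struct Value {
    bool isNumber;
    int number;
    std::string text;
};

class Interpreter {
public:
    // Executes one statement of the form:  name = 123   or   name = "text"
    void execute(const std::string& line) {
        std::istringstream in(line);
        std::string name, eq, rhs;
        in >> name >> eq;
        std::getline(in >> std::ws, rhs);          // the rest of the line is the literal
        if (eq != "=" || rhs.empty())
            throw std::runtime_error("cannot execute: " + line);
        if (rhs.size() >= 2 && rhs.front() == '"' && rhs.back() == '"')
            vars[name] = Value{false, 0, rhs.substr(1, rhs.size() - 2)};
        else
            vars[name] = Value{true, std::stoi(rhs), ""};   // throws if not a number
    }

    const Value& get(const std::string& name) const { return vars.at(name); }

private:
    std::map<std::string, Value> vars;   // variable storage lives in the interpreter
};
```

Here `x = 5` followed by `x = "SHERRY"` is perfectly legal: no target code is produced, and the interpreter simply rebinds `x` to a value of a different type.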
History of Modern Compilers
Front and Back ends
One pass vs. Multiple passes
Compiler Construction Tools