Compilers and Language Processing Tools
Summer Term 2011

Prof. Dr. Arnd Poetzsch-Heffter
Software Technology Group
TU Kaiserslautern

Content of Lecture

1. Introduction
2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-Free Syntax Analysis
   2.3 Context-Dependent Analysis
3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
   3.2 Translation of Object-Oriented Language Constructs
4. Selected Aspects of Compilers
   4.1 Intermediate Languages
   4.2 Optimization
   4.3 Data Flow Analysis
   4.4 Register Allocation
   4.5 Code Generation
5. Garbage Collection
6. XML Processing (DOM, SAX, XSLT)

3. Translation to Target Language

Chapter Outline

3.1 Translation of Imperative Language Constructs
    3.1.1 Language Constructs of Procedural Language
    3.1.2 Assembly and Machine Languages
    3.1.3 Translation of Variables and Data Types
    3.1.4 Translation of Expressions
    3.1.5 Translation of Statements
    3.1.6 Translation of Procedures and Local Objects
3.2 Translation of Object-Oriented Language Constructs
    3.2.1 Concepts of Object-Oriented Programming Languages
    3.2.2 Translation with Procedural Languages
    3.2.3 Translation of Classes
    3.2.4 Problems of Multiple Inheritance
    3.2.5 Further Aspects of Object-Oriented Languages
    3.2.6 Summary - A Simple Compiler

Focus:
• Differences between source languages and target languages/target machines
• Most important translation techniques for different programming paradigms (procedural/object-oriented)

Learning Objectives:
• Overview of imperative and procedural language constructs
• Typical language constructs of assembler languages
• Translation techniques for procedural language constructs
• Translation of object-oriented language constructs

3.1 Translation of Imperative Language Constructs
Translation of Imperative Language Constructs Assembly and Machine Languages
The MIPS Assembler
MIPS - Microprocessor without interlocked pipeline stages
• RISC architecture, originally 32 bit (64 bit since 1991)
• developed by John Hennessy (Stanford) starting in 1981
• MARS simulator: http://courses.missouristate.edu/KenVollmar/MARS/
Data Types and Literals in MIPS Assembly Language

Data Types
• Instructions are all 32 bits
• byte (8 bits), halfword (2 bytes), word (4 bytes)
• integer (1 word of storage)
• single precision floats (1 word of storage)
• double precision floats (2 words of storage)

Literals
• Integers (e.g. 4, 2, -236, 0x44)
• Floats (e.g. 3.41, -0.323e5)
• Characters in single quotes, e.g. 'b'
• Strings in double quotes, e.g. "Hello World"
MIPS Registers

No     Name         P*   Description
0      $zero        -    the constant 0
1      $at          -    assembler temporary (reserved by the assembler)
2-3    $v0, $v1     no   values for function results and expression evaluation
4-7    $a0 - $a3    no   arguments for subroutine calls
8-15   $t0 - $t7    no   temporaries
16-23  $s0 - $s7    yes  saved temporaries
24-25  $t8 - $t9    no   additional temporaries
26-27  $k0, $k1     no   reserved for OS kernel
28     $gp          yes  global pointer
29     $sp          yes  stack pointer
30     $fp          yes  frame pointer
31     $ra          yes  return address

P*: preserved across procedure calls
List of System Services

Service                         Code in $v0   Arguments
print integer                   1             $a0 = integer to print
print string                    4             $a0 = address of null-terminated string to print
exit (terminate execution)      10
print character                 11            $a0 = character to print
exit2 (terminate with value)    17            $a0 = termination result
MIPS Program

# sp + 0 : i
# sp + 4 : res
# sp + 5 : base address of a[3]
# sp + 8 : base address of b[3]
main:
    addi $sp, $sp, -12    # make space for the variables
    li   $t1, 2
    sw   $t1, 0($sp)      # i = 2
    li   $t1, 1
    sb   $t1, 4($sp)      # set res at sp + 4
Translation to MIPS

Remarks:
The example illustrates typical translation tasks:
• Translation of data types, memory management, addressing
• Translation of expressions, management of intermediate results, mapping of operations of the source language to operations of the target language
• Translation of statements by implementation with jumps
• A simple systematic approach yields code of poor quality
Translation of Imperative Language Constructs Translation of Variables and Data Types
Translation of variables and data types (2)
The translation of variables and data types comprises:
• handling of primitive data types
• conversion of data types (e.g. int → float)
• memory organisation
• translation of arrays
• translation of records and classes
• implementation of dynamic objects
Primitive data types

Usually, the primitive data types of source languages are supported by the target machine:
• int, long → 4 byte word with integer arithmetic
• float, double → accordingly

Some data types may have to be encoded:
• boolean → 1 byte or 4 byte word

Problems arise if the target machine does not meet the requirements of the source language, e.g.
• floating point arithmetic is not handled according to the IEEE standard
• overflows are not dealt with correctly (cf. Java FP-strict expressions)
• operations for conversion are missing on the target machine
Code Generation for Array Access (3)

Operations for attribution:
• lkupRA: Ident × SymTab → Address
• lkupSZL: Ident × SymTab → IntList
• + : list concatenation; for an element e, [e] is the list containing only e.

In the following, the SymTab attribute is only explicitly given where it is required.

At run time, again only the first summand has to be computed, so code has to be generated for it. During the stepwise computation, a bounds check for the array can also be performed.

Remarks:
• The computation of array indices often offers a large potential for optimizations.
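The split of an array address into a run-time summand and a compile-time summand can be sketched in Python. This is a minimal sketch under assumptions that are not stated on the slides: a row-major layout, one lower bound per dimension, and hypothetical function names (`strides`, `constant_part`, `runtime_part`, `element_address`).

```python
def strides(sizes, elem_size):
    """Byte stride of each dimension for a row-major array (assumed layout)."""
    result = []
    step = elem_size
    for size in reversed(sizes):
        result.append(step)
        step *= size
    return list(reversed(result))


def constant_part(base, lower_bounds, sizes, elem_size):
    """Summand known at compile time: base address minus the offset of the lower bounds."""
    s = strides(sizes, elem_size)
    return base - sum(lb * st for lb, st in zip(lower_bounds, s))


def runtime_part(indices, lower_bounds, sizes, elem_size):
    """Summand computed at run time, index by index, with a bounds check per step."""
    s = strides(sizes, elem_size)
    offset = 0
    for i, lb, size, st in zip(indices, lower_bounds, sizes, s):
        if not lb <= i < lb + size:
            raise IndexError(f"index {i} outside [{lb}, {lb + size - 1}]")
        offset += i * st
    return offset


def element_address(base, indices, lower_bounds, sizes, elem_size):
    """Full address: run-time summand plus compile-time summand."""
    return (runtime_part(indices, lower_bounds, sizes, elem_size)
            + constant_part(base, lower_bounds, sizes, elem_size))
```

For a hypothetical array `a[1..3][0..4]` of 4-byte integers at base address 1000, the compile-time summand is 980, so only `2*20 + 3*4` has to be computed at run time for `a[2][3]`.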
Translation of Records

Translation of records is similar to translation of arrays:
• Determine size and memory layout
• Compute addresses for selection of record components and pointer dereferencing
• Translation of record operations, e.g. assignments to records
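Determining the size and layout of a record can be sketched as follows. The sketch assumes that every component has a size and an alignment requirement in bytes and that components are laid out in declaration order; `align_up` and `layout_record` are hypothetical helper names.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout_record(fields):
    """fields: list of (name, size, alignment) in declaration order.

    Returns ({name: offset}, total_size). The total size is padded so that
    consecutive records in an array stay aligned.
    """
    offsets = {}
    offset = 0
    max_align = 1
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)
        offsets[name] = offset
        offset += size
        max_align = max(max_align, alignment)
    return offsets, align_up(offset, max_align)
```

The address of a component is then the base address of the record plus the component's offset, which is a constant known at compile time.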
Implementation of Dynamic Objects (2)

"Dynamic objects" is used here as a collective term for dynamically allocated variables and objects in the sense of OO programming.

Dynamic objects are managed on the heap:
• Their number is in general not known at compile time. Therefore, they are created only at run time.
• They have a lifetime that a stack-like ...
Dynamic Memory Management

Dynamic memory management
• is handled by the runtime environment
• can be supported by the compiler
• can partially be handled by the user program

The runtime environment provides operations for dynamic memory management:
• for the programmer, e.g. in C: malloc, calloc, realloc, free
• for the compiler, as in Pascal, Java, Ada
• no memory deallocation by the programmer possible, but garbage collection
Dynamic Memory Management (4)

  - Return a pointer to the memory cell after the header (the size information has to be kept).
  - If no memory area of the required size is found, new memory has to be requested from the OS.
• Free memory
  - Find the header of the memory area to be freed via the pointer to this area
  - If the previous or next memory areas are free, join the areas
  - Add the resulting memory area to the list
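The allocation and deallocation steps can be sketched with a simulated heap. This is only an illustration under assumptions: a first-fit linear search, a 4-byte header that stores the size of the area, and a hypothetical class `Heap`. Only the headers are modelled, not the payload.

```python
HEADER = 4  # bytes in front of every area that hold its size (assumed)


class Heap:
    def __init__(self, size):
        self.memory = {}                 # address -> header word
        self.free_list = [(0, size)]     # (start, length) of free areas, sorted by start

    def allocate(self, nbytes):
        need = nbytes + HEADER
        for k, (start, length) in enumerate(self.free_list):  # first fit, linear search
            if length >= need:
                rest = length - need
                if rest > 0:
                    self.free_list[k] = (start + need, rest)
                else:
                    del self.free_list[k]
                self.memory[start] = need    # the size information is kept in the header
                return start + HEADER        # pointer to the cell after the header
        raise MemoryError("a real runtime would request new memory from the OS here")

    def free(self, pointer):
        start = pointer - HEADER             # find the header via the pointer
        length = self.memory.pop(start)
        self.free_list.append((start, length))
        self.free_list.sort()
        merged = []
        for s, l in self.free_list:          # join neighbouring free areas
            if merged and merged[-1][0] + merged[-1][1] == s:
                merged[-1] = (merged[-1][0], merged[-1][1] + l)
            else:
                merged.append((s, l))
        self.free_list = merged
```

After allocating two areas and freeing both, the free list of the sketch collapses back into a single area that covers the whole heap.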
Dynamic Memory Management (5)

Remarks:
• If a program writes beyond its assigned memory area, references or size information can be destroyed, with bad consequences.
• If memory cannot be allocated at byte granularity, alignment restrictions have to be obeyed.
• For practical use, the above principle can be improved by
  - non-linear search
  - search for memory areas of exactly matching size, avoiding fragmentation
  - support for joining memory areas after deallocation
Translation of Imperative Language Constructs Translation of Expressions
Improvements
• Improvement of the generated code by
  - storage of intermediate results in registers
  - context-dependent, optimizing instruction selection
  - avoiding redundant computations by evaluating common subexpressions only once
• Improvement of the translation technique by using an intermediate language
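Two of these improvements, keeping intermediate results in registers and evaluating common subexpressions only once, can be illustrated by a small Python sketch. It assumes an unlimited supply of virtual registers and expressions without side effects; the tuple representation and the function `translate` are hypothetical.

```python
def translate(expr, code, cache):
    """Return the virtual register that holds expr.

    expr is a variable name or a tuple (op, left, right). Code is emitted
    only the first time a subexpression is seen; later uses reuse its register.
    """
    if expr in cache:
        return cache[expr]
    if isinstance(expr, str):
        reg = f"r{len(cache)}"
        code.append(f"{reg} = load {expr}")
    else:
        op, left, right = expr
        l = translate(left, code, cache)
        r = translate(right, code, cache)
        reg = f"r{len(cache)}"
        code.append(f"{reg} = {l} {op} {r}")
    cache[expr] = reg
    return reg
```

For `a*b + a*b`, the sketch loads `a` and `b` once and emits a single multiplication whose result register is used twice.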
Translation of Imperative Language Constructs Translation of Statements
Translation of Statements
Most statements can be translated by translation schemes with jumps:
While:
[ Label( "BEGWHILE_" + M ) ] +
CE +
[ Cmp( W Imm(0) Postinc(SP) ) ] +
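A complete scheme of this shape can be sketched in Python as a function that concatenates code lists. The sketch does not reproduce the instruction set used on the slide: it assumes MIPS-like instructions, a condition value delivered in `$t0`, and a label name `ENDWHILE_` that is chosen here for illustration.

```python
def translate_while(cond_code, body_code, m):
    """Compose the code list for `while (c) b`; m makes the labels unique."""
    begin = f"BEGWHILE_{m}"
    end = f"ENDWHILE_{m}"          # hypothetical label name
    return (
        [f"{begin}:"]
        + cond_code                # assumed to leave the condition value in $t0
        + [f"beqz $t0, {end}"]     # leave the loop if the condition is 0
        + body_code
        + [f"j {begin}", f"{end}:"]
    )
```

The loop costs one conditional jump per test and one unconditional jump per iteration, which is typical for such simple schemes.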
More Complex Translation of Statements

A good translation of switch statements and an efficient handling of non-strict expressions are more complex.

We consider the translation of non-strict Boolean expressions as an example of an optimizing translation and of the use of context information.

Example (use of inherited information): Abstract Syntax

We consider the following language fragment:

Stat = While | IfThenElse | ...
BExp = And | Or | Not | StrictExp
While ( BExp c, Stat b )
IfThenElse ( BExp c, Stat then, Stat else )
And, Or ( BExp left, BExp right )
Not ( BExp e )
StrictExp ( Exp e )
More Complex Translation of Statements (2)

A program fragment:

if( (B1 || B2) && ! B3 ) {
    while( !(B4 || B5) ) A1
} else {
    A2
}

where A1 and A2 are statements and B1 to B5 are strict expressions.
More Complex Translation of Statements (3)

In C and Java, || and && are non-strict, i.e. if B1 and B2 evaluate to false, B3 need not and must not be evaluated.

Furthermore, jump cascades, i.e. jumps to other unconditional jumps, should be avoided.

Idea for attribution:
For each Boolean expression, compute
• the label for the true case
• the label for the false case
• a value of type bool that states in which case to jump
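The three pieces of context information can be passed down as parameters of a recursive code generator. The following Python sketch is one possible reading of the idea, not the attribution from the lecture: syntax trees are tuples, strict expressions are represented by the name of a register assumed to hold their value, and the class `CodeGen` and all label names except `BEGWHILE_` are hypothetical.

```python
class CodeGen:
    """Emits MIPS-like pseudo code as a list of strings."""

    def __init__(self):
        self.code = []
        self.counter = 0

    def fresh(self):
        self.counter += 1
        return self.counter

    def bexp(self, b, ltrue, lfalse, jump_on):
        """jump_on=True:  jump to ltrue if b is true, else fall through
                          (lfalse labels the fall-through position).
           jump_on=False: jump to lfalse if b is false, else fall through
                          (ltrue labels the fall-through position)."""
        kind = b[0]
        if kind == "strict":
            if jump_on:
                self.code.append(f"bnez {b[1]}, {ltrue}")
            else:
                self.code.append(f"beqz {b[1]}, {lfalse}")
        elif kind == "not":
            self.bexp(b[1], lfalse, ltrue, not jump_on)   # swap labels, no code
        elif kind == "and":
            mid = f"AND_{self.fresh()}"
            self.bexp(b[1], mid, lfalse, False)           # false: skip right operand
            self.code.append(f"{mid}:")
            self.bexp(b[2], ltrue, lfalse, jump_on)
        elif kind == "or":
            mid = f"OR_{self.fresh()}"
            self.bexp(b[1], ltrue, mid, True)             # true: skip right operand
            self.code.append(f"{mid}:")
            self.bexp(b[2], ltrue, lfalse, jump_on)
        else:
            raise ValueError(kind)

    def stat(self, s):
        kind = s[0]
        if kind == "while":
            n = self.fresh()
            self.code.append(f"BEGWHILE_{n}:")
            self.bexp(s[1], f"BODY_{n}", f"ENDWHILE_{n}", False)
            self.code.append(f"BODY_{n}:")
            self.stat(s[2])
            self.code += [f"j BEGWHILE_{n}", f"ENDWHILE_{n}:"]
        elif kind == "if":
            n = self.fresh()
            self.bexp(s[1], f"THEN_{n}", f"ELSE_{n}", False)
            self.code.append(f"THEN_{n}:")
            self.stat(s[2])
            self.code += [f"j ENDIF_{n}", f"ELSE_{n}:"]
            self.stat(s[3])
            self.code.append(f"ENDIF_{n}:")
        else:
            self.code.append(s[1])                        # opaque statement such as A1
```

Within a condition, every conditional jump of the sketch goes directly to its final target, and negation costs no instruction at all. Cascades across statement boundaries (for example a jump to the end of a loop that is followed by the jump over the else branch) are not removed by this sketch.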
Translation of Imperative Language Constructs Translation of Procedures and Local Objects
Translation of Procedures and Local Objects
Most procedural languages support recursion, procedure-local variables and nested procedures. In the following, we consider
• Translation of recursive procedures
• Translation of local variables
• Translation of nested procedures
We do not consider the translation of procedures as parameters.
Procedures

The declaration of a procedure consists of
• the name of the procedure
• the declaration of the formal parameters
• the declaration of local variables
• the body of the procedure

Each dynamic call of a procedure corresponds to a procedure incarnation.

Analogy:
• Procedure declaration → procedure incarnation
• Class declaration → object/class instance
Translation of Recursive Procedures

Main Tasks:
• Parameter passing on entry, return of result at exit of procedure
• Addressing of parameters
• Handling of recursion

Main Idea:
For each procedure incarnation, a stack frame is allocated. The stack frame contains:
• the actual parameters
• the return address
• the register contents of the caller
• further information
Code Generation for Procedures

Code has to be generated
• at the call site
  - to pass the actual parameters to the procedure incarnation
  - to jump to the code of the procedure body
  - to make the procedure's result available for further processing
• at the beginning of the procedure (prolog)
  - to save registers
  - to set the argument pointer
• at the end of the procedure (epilog)
  - to restore registers

Note: Many tasks can be moved from the call site to the prolog and vice versa. Because a procedure has only one prolog but potentially many call sites, it is more efficient in terms of code size to move the code to the prolog (and to the epilog).
Translation scheme for a procedure call, assuming that the code for the list of parameter expressions (ExpList) takes care of pushing the actual parameters onto the stack:

Call:
CPL +
[ Jump PLAB ] +
< remove the parameters from the stack >
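The division of work between call site, prolog and epilog can be sketched as three Python functions that emit MIPS instructions. The frame layout used here is an assumption made for the illustration (parameters pushed by the caller, then `$ra`, `$fp` and callee-saved registers pushed by the prolog, `$fp` used as argument pointer); the function names are hypothetical.

```python
def call_site(proc_label, arg_regs):
    """Push the actual parameters, jump, and remove the parameters afterwards."""
    code = []
    for reg in arg_regs:
        code += ["addi $sp, $sp, -4", f"sw {reg}, 0($sp)"]
    code.append(f"jal {proc_label}")          # return address is placed in $ra
    if arg_regs:
        code.append(f"addi $sp, $sp, {4 * len(arg_regs)}")
    return code


def prolog(saved_regs):
    """Save registers and set the argument pointer (here: $fp)."""
    regs = ["$ra", "$fp"] + saved_regs
    code = [f"addi $sp, $sp, -{4 * len(regs)}"]
    code += [f"sw {r}, {4 * k}($sp)" for k, r in enumerate(regs)]
    code.append(f"addi $fp, $sp, {4 * len(regs)}")   # points at the last pushed parameter
    return code


def epilog(saved_regs):
    """Restore registers and return; assumes $sp has its value from after the prolog."""
    regs = ["$ra", "$fp"] + saved_regs
    code = [f"lw {r}, {4 * k}($sp)" for k, r in enumerate(regs)]
    code += [f"addi $sp, $sp, {4 * len(regs)}", "jr $ra"]
    return code
```

Saving and restoring registers appears once per procedure in this sketch, whereas the parameter handling is repeated at every call site.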
Translation of Procedure-Local Variables

Like parameters, procedure-local variables have to be stored in the stack frame, because there is one instance of the local variables for each procedure incarnation.
Dynamic and Static Local Variables

Local variables are static if their size is known at compile time; otherwise they are dynamic.

Example (static/dynamic variables):

In the following C fragment, i, j and k are static local variables; f and g are dynamic variables (arrays), because their size depends on the parameter hsize.

void foo( int hsize ) {
    int i, j;
    char f[ 2*hsize ];
    int g[ hsize ];
    int k;
    ...
}

Memory is allocated in the prolog; for dynamic variables, the allocation depends on the actual parameters. The compiler generates code for this.

Addressing:
Procedure-local variables are addressed relative to a reference point in the stack frame, e.g. relative to the argument pointer. Addressing dynamic variables in general requires an additional indirection step.
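One way to realise the additional indirection step is to give every dynamic variable a pointer slot at a fixed offset and to place its storage behind the static part of the frame. The Python sketch below follows that assumption; offsets are counted upwards from the reference point, and the names `layout_locals` and `allocate_dynamic` are hypothetical.

```python
WORD = 4


def layout_locals(decls):
    """decls: list of (name, size); size is None if it is only known at run time.

    Returns ({name: (offset, is_dynamic)}, static_size). A dynamic variable
    only gets a pointer slot in the static part of the frame.
    """
    table = {}
    offset = 0
    for name, size in decls:
        if size is None:
            table[name] = (offset, True)
            offset += WORD
        else:
            table[name] = (offset, False)
            offset += size
    return table, offset


def allocate_dynamic(frame, table, top, runtime_sizes):
    """Prolog work: reserve storage starting at `top` and fill the pointer slots."""
    for name, size in runtime_sizes.items():
        offset, is_dynamic = table[name]
        if not is_dynamic:
            raise ValueError(f"{name} is a static variable")
        frame[offset] = top
        top += (size + WORD - 1) // WORD * WORD   # keep word alignment
    return top
```

For `foo` with `hsize = 3`, the static part of the frame takes 20 bytes in this sketch; `f` and `g` are reached through the pointers stored at offsets 8 and 12, while `k` keeps the fixed offset 16.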
Relevant aspects for code generation

1. Addressing with static predecessor reference chain:

Let V be a variable with PND(V) = n, i.e. V is declared as a local variable of a procedure P with PND(P) = n. Let RA(V) be the address of V relative to the argument pointer.

Let VA be an application position of V in a procedure Q (≠ P) with PND(Q) = m and m > n.

The address of VA is obtained by dereferencing the static predecessor references m − n times, which yields the argument pointer of the relevant incarnation of P, and then adding RA(V).
Relevant aspects for code generation (2)

Remark:
• The difference m − n is known at compile time for each application position of a variable.
• The address of VA can in general not be handled directly by the addressing techniques of the target machine. Instead, separate commands have to be used that are executed each time the variable is accessed.
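The access path along the static predecessor references can be simulated in Python with a dictionary as memory. The sketch assumes that the static predecessor reference of a frame is stored at the address the argument pointer refers to; this placement and the function name are assumptions.

```python
def address_via_static_chain(M, ap, m, n, ra):
    """Address of V (declared at depth n) used in a procedure at depth m >= n.

    M maps addresses to stored words; M[frame] is assumed to hold the
    static predecessor reference of that frame.
    """
    frame = ap
    for _ in range(m - n):
        frame = M[frame]      # one dereferencing step per nesting level
    return frame + ra
```

The loop bound `m - n` is a compile-time constant, so a compiler would emit exactly that many load instructions instead of a loop.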
Local display

1. Addressing with local display:

Let V, n, RA(V), VA and m be defined as above, with m > n. The address of VA is obtained by:

M[AP − 4 ∗ (m − n)] + RA(V)

2. Management of the local display:

Let ΔPND =def PND(caller) − PND(callee). The display of the callee is built from the display of the caller. We distinguish two cases:
1. ΔPND = −1: display of caller + AP of caller
2. ΔPND > −1: display of caller − ΔPND entries

Remarks:
• Addressing of local variables is more efficient with a local display.
• In general, more memory space on the stack is required.
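Both parts, addressing and management, can be sketched in Python. The sketch assumes that a display is a list of argument pointers ordered from the outermost to the innermost enclosing procedure and that it is stored in the frame directly below AP, innermost entry first, which matches the address formula above; the function names are hypothetical.

```python
WORD = 4


def callee_display(caller_display, caller_ap, delta_pnd):
    """Display for the callee, with delta_pnd = PND(caller) - PND(callee)."""
    if delta_pnd == -1:
        return caller_display + [caller_ap]
    if 0 <= delta_pnd <= len(caller_display):
        return caller_display[:len(caller_display) - delta_pnd]
    raise ValueError("invalid nesting depth difference")


def store_display(M, ap, display):
    """Store the display below AP; the innermost entry is placed at AP - 4."""
    for k, entry in enumerate(reversed(display), start=1):
        M[ap - WORD * k] = entry


def address_of(M, ap, m, n, ra):
    """M[AP - 4*(m - n)] + RA(V) for m > n: a single memory access."""
    return M[ap - WORD * (m - n)] + ra
```

Compared with following a chain of references, every non-local access needs one load here, at the price of copying the display on each call.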
Global display

1. Addressing with global display:

Addressing with a global display is like addressing with a local display, but instead of AP the address of the global display is used.

2. Management of the global display:

Problem: The global display is changed on a procedure call if procedures with lower PND are executed that are later called by procedures with higher PND.

Observation: Each procedure incarnation changes at most one component of the global display, i.e. if PND(caller) − PND(callee) = −1.

Solution: It suffices to save the changed component and to restore it in the epilog of the procedure. For saving the component, a memory word in the stack frame has to be reserved.
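Saving and restoring a single component can be sketched as follows. The sketch uses a common variant in which every incarnation overwrites the component that belongs to its own nesting depth; this choice, the class `GlobalDisplay` and its method names are assumptions and may differ in detail from the scheme of the lecture.

```python
class GlobalDisplay:
    def __init__(self, max_depth):
        self.entries = [None] * (max_depth + 1)   # one component per nesting depth

    def enter(self, pnd, ap):
        """Prolog: install the new AP and return the overwritten component."""
        saved = self.entries[pnd]
        self.entries[pnd] = ap
        return saved              # would be kept in a reserved word of the stack frame

    def leave(self, pnd, saved):
        """Epilog: restore the component changed by this incarnation."""
        self.entries[pnd] = saved

    def address(self, n, ra):
        """Address of a variable declared at depth n with relative address ra."""
        return self.entries[n] + ra
```

A recursive call at the same depth temporarily redirects the component to the new frame; after the epilog, accesses reach the frame of the outer incarnation again.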
Global display (2)

Remarks:
• If there are enough registers, the global display (or parts of it) should be stored in registers.
• For languages that use procedures as parameters, the display technique has to be adapted.
• The different variants for handling nested procedures show typical variation points in compiler design.

The introduced memory management can be seen as a schema that can be adapted for given source and target languages (considering properties of the target machines, e.g. caches).