COMPILER Chapter# 5 Intermediate code generation
Chapter# 5 Intermediate code generation. In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate.

Jan 01, 2016

Reynard Lucas
Transcript
Page 1: Chapter# 5 Intermediate code generation.  In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate.

COMPILER Chapter# 5

Intermediate code generation

INTERMEDIATE CODE GENERATION

In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back-end generates target code.

Intermediate codes are machine independent codes, but they are close to machine instructions.

The given program in a source language is converted to an equivalent program in an intermediate language by the intermediate code generator.

The intermediate language can be any of many different languages; the designer of the compiler decides which one to use.

INTERMEDIATE CODE GENERATION (CONTINUED)

o Syntax trees can be used as an intermediate language.
o Postfix notation can be used as an intermediate language.
o Three-address code (quadruples) can be used as an intermediate language.

We will use quadruples to discuss intermediate code generation. Quadruples are close to machine instructions, but they are not actual machine instructions.

Some programming languages have well-defined intermediate languages, for example Java, with the Java virtual machine (JVM).

In fact, there are byte-code interpreters to execute instructions in these intermediate languages.
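As a small illustration of the postfix option listed above, the following Python sketch flattens an expression tree into postfix notation. The tuple encoding (op, left, right) and the function name to_postfix are assumptions made for this example; they do not come from the slides.

```python
def to_postfix(expr):
    """Return the postfix form of expr as a list of tokens.

    expr is either a string (a name or constant) or a tuple
    (op, left, right); this encoding is an assumption of the sketch.
    """
    if isinstance(expr, str):
        return [expr]
    op, left, right = expr
    # Operands first, operator last.
    return to_postfix(left) + to_postfix(right) + [op]


tokens = to_postfix(("+", "x", ("*", "y", "z")))
print(" ".join(tokens))   # x y z * +
```

The operator always follows its operands, so no parentheses are needed to recover the evaluation order.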

VARIANTS OF SYNTAX TREES:

Nodes in a syntax tree represent constructs in the source program.

The children of a node represent the meaningful components of a construct.

A directed acyclic graph (DAG) for an expression identifies the common sub-expressions (sub-expressions that occur more than once) of the expression.

DAGs can be constructed by using the same techniques that construct syntax trees.

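Directed Acyclic Graphs for Expressions

a + a * ( b – c ) + ( b – c ) * d

[Figure: DAG for the expression above. The root + has the children + and *; the leaf a and the node - for b – c each have two parents.]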

DIRECTED ACYCLIC GRAPHS FOR EXPRESSIONS:

Like the syntax tree for an expression, a DAG has leaves corresponding to atomic operands and interior nodes corresponding to operators.

The difference is that a node N in a DAG has more than one parent if N represents a common sub-expression.

In a syntax tree, the tree for the common sub-expression would be replicated as many times as the sub-expression appears in the original expression.

Thus, a DAG not only represents expressions more succinctly, it also gives the compiler important clues regarding the generation of efficient code to evaluate the expressions.

DIRECTED ACYCLIC GRAPHS FOR EXPRESSIONS: (CONTINUED)

The leaf for a has two parents, because a appears twice in the expression.

The two occurrences of the common sub-expression b-c are represented by one node, the node labeled -.

That node has two parents, representing its two uses in the sub-expressions a*(b-c) and (b-c)*d.
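One way to obtain this sharing is to look up each (operator, left, right) combination before creating a node and to reuse the existing node when there is one. The Python sketch below shows the idea for the expression a + a * (b - c) + (b - c) * d. The class name Dag and its methods are hypothetical names chosen for this example.

```python
class Dag:
    """Minimal DAG builder: identical nodes are created only once."""

    def __init__(self):
        self.nodes = []   # node id -> (label, left id, right id)
        self.index = {}   # (label, left id, right id) -> node id

    def node(self, label, left=None, right=None):
        key = (label, left, right)
        if key not in self.index:          # create the node only once
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def leaf(self, name):
        return self.node(name)


dag = Dag()
a, b, c, d = (dag.leaf(x) for x in "abcd")

# a + a * (b - c) + (b - c) * d
first_diff = dag.node("-", b, c)
second_diff = dag.node("-", b, c)          # the existing node is returned
left = dag.node("+", a, dag.node("*", a, first_diff))
root = dag.node("+", left, dag.node("*", second_diff, d))

print(first_diff == second_diff)   # True: b - c is shared
print(len(dag.nodes))              # 9 nodes; the syntax tree needs 13
```

The same construction steps that would build a syntax tree are used here; only the lookup before node creation differs.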

THREE-ADDRESS CODE

In three-address code, there is at most one operator on the right side of an instruction. Thus a source language expression like x+y*z might be translated into the sequence of three-address instructions

t1 = y * z

t2 = x + t1

where t1 and t2 are compiler-generated temporary names.
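The translation of x+y*z can be sketched as a recursive walk over the expression tree that creates a new temporary for every operator. This is a minimal Python sketch; the tuple encoding of expressions and the function name gen_tac are assumptions of the example.

```python
import itertools


def gen_tac(expr, code, counter):
    """Append three-address instructions for expr to code.

    Returns the address (name or temporary) that holds the value of expr.
    """
    if isinstance(expr, str):      # a name or constant is already an address
        return expr
    op, left, right = expr
    left_addr = gen_tac(left, code, counter)
    right_addr = gen_tac(right, code, counter)
    temp = f"t{next(counter)}"     # compiler-generated temporary
    code.append(f"{temp} = {left_addr} {op} {right_addr}")
    return temp


code = []
gen_tac(("+", "x", ("*", "y", "z")), code, itertools.count(1))
print("\n".join(code))
# t1 = y * z
# t2 = x + t1
```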

ADDRESSES AND INSTRUCTIONS:

Three-address code is built from two concepts: addresses and instructions. Three-address code can be implemented using records with fields for the addresses; such records are called quadruples and triples.

An address can be one of the following:

o A name: For convenience, we allow source-program names to appear as addresses in three-address code.
o A constant: In practice, a compiler must deal with many different types of constants.
o A compiler-generated temporary: It is useful, especially in optimizing compilers, to create a distinct name each time a temporary is needed.

ADDRESSES AND INSTRUCTIONS: (CONTINUED)

We now consider the common three-address instructions.

o Assignment instructions of the form x = y op z, where op is a binary arithmetic or logical operation, and x, y, and z are addresses.
o Assignments of the form x = op y, where op is a unary operation.
o Copy instructions of the form x = y, where x is assigned the value of y.

QUADRUPLES:

The description of three-address instructions specifies the components of each type of instruction. In a compiler, these instructions can be implemented as objects or as records with fields for the operator and the operands.

Three such representations are called “quadruples”, “triples” and “indirect triples.”

A quadruple has four fields, which we call op, arg1, arg2, and result.

The op field contains an internal code for the operator. For instance, the three-address instruction x=y+z is represented by placing + in op, y in arg1, z in arg2, and x in result.

QUADRUPLES: (CONTINUED)

The following are some exceptions to this rule:

1. Instructions with a single operand, like the unary x = minus y or the copy z = y, do not use arg2.

2. Conditional and unconditional jumps put the target label in result.
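The four fields and the two exceptions can be sketched as a small record type. This is only an illustration in Python: the class name Quad, the use of None for an unused field, the operator code "goto", and the label L1 are assumptions of the example, not part of the slides.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One quadruple: an operator code, up to two arguments, and a result."""
    op: str
    arg1: Optional[str]
    arg2: Optional[str]
    result: str


quads = [
    Quad("+", "y", "z", "x"),          # x = y + z    : all four fields used
    Quad("minus", "y", None, "x"),     # x = minus y  : arg2 is not used
    Quad("=", "y", None, "z"),         # z = y        : arg2 is not used
    Quad("goto", None, None, "L1"),    # jump         : target label in result
]

for q in quads:
    print(q)
```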

Three-address code for the assignment a = b * -c + b * -c

The special operator minus is used to distinguish the unary operator, as in –c, from the binary minus operator, as in b-c.

Note that the unary-minus “three-address” statement has only two addresses, as does the copy statement a = t5.

The quadruples in figure (b) implement the three-address code in (a) as follows:

(a) Three-address code

t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

(b) Quadruples

     op      arg1   arg2   result
0    minus   c             t1
1    *       b      t1     t2
2    minus   c             t3
3    *       b      t3     t4
4    +       t2     t4     t5
5    =       t5            a

Figure: Three-address code for the assignment a = b * -c + b * -c

TRIPLES:

A triple has only three fields, which we call op, arg1, and arg2. Note that the result field in the previous figure is used primarily for temporary names. Using triples, we refer to the result of an operation x op y by its position, rather than by an explicit temporary name. Thus, instead of the temporary t1 in figure (b) on the previous slide, a triple representation would refer to position (0). Parenthesized numbers represent pointers into the triple structure itself. Positions or pointers to positions are called value numbers.

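Representations of a = b * -c + b * -c

(a) Syntax tree

[Figure: syntax tree with root =, whose children are a and +. Each child of + is a * node with the children b and minus, and each minus node has the single child c.]

(b) Triples

     op      arg1   arg2
0    minus   c
1    *       b      (0)
2    minus   c
3    *       b      (2)
4    +       (1)    (3)
5    =       a      (4)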
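The step from quadruples to triples can be sketched as replacing every temporary by the position of the instruction that computes it. The Python sketch below starts from the quadruples of the earlier figure. The function name to_triples is hypothetical, and the sketch assumes that every result other than the target of the final copy is a temporary.

```python
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def to_triples(quads):
    position = {}   # temporary name -> position of the triple computing it
    triples = []

    def ref(arg):
        # Replace a temporary by a parenthesized position; keep other names.
        return f"({position[arg]})" if arg in position else arg

    for i, (op, arg1, arg2, result) in enumerate(quads):
        if op == "=":
            # Copy: the target goes into arg1, the copied value into arg2.
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = i    # assumption: result is a temporary
    return triples


triples = to_triples(quads)
for i, (op, arg1, arg2) in enumerate(triples):
    print(i, op, arg1, arg2 if arg2 is not None else "")
```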

TYPES AND DECLARATIONS:

TYPES:

The applications of types can be grouped under checking and translation:

Type checking: Type checking uses logical rules to reason about the behavior of a program at run time. Specifically, it ensures that the types of the operands match the type expected by an operator. For example, the && operator in Java expects its two operands to be boolean; the result is also of type boolean.

Translation: From the type of a name, a compiler can determine the storage that will be needed for that name at run time. Type information is also needed to calculate the address denoted by an array reference.

TRANSLATION OF EXPRESSIONS:

We begin in this section with the translation of expressions into three-address code. An expression with more than one operator, like a+b*c, will translate into instructions with at most one operator per instruction.

An array reference A[i][j] will expand into a sequence of three-address instructions that calculate an address for the reference.

ADDRESSING ARRAY ELEMENTS:

Array elements can be accessed quickly if they are stored in a block of consecutive locations. In C and Java, array elements are numbered 0, 1, 2, …, n-1 for an array with n elements. If the width of each array element is w, then the ith element of array A begins in location

base + i × w

where base is the relative address of A[0].
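The formula can be checked with a few lines of Python. The base address 100 and the element width 4 are made-up values for this example, and element_address is a hypothetical helper name.

```python
def element_address(base, i, w):
    """Relative address of A[i] for 0-based indexing and element width w."""
    return base + i * w


# Assumed layout: A starts at relative address 100, each element is 4 bytes.
for i in range(4):
    print(i, element_address(100, i, 4))
# 0 100
# 1 104
# 2 108
# 3 112
```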

TYPE CHECKING:

To do type checking, a compiler needs to assign a type expression to each component of the source program.

The compiler must then determine that these type expressions conform to a collection of logical rules that is called the type system for the source language.

Type checking has the potential for catching errors in programs.
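A very small checker makes the idea concrete. The Python sketch below assigns a type to each sub-expression and rejects operands that do not match what the operator expects. The two rules shown (for && and for + and *) and the names check and env are assumptions chosen for this example; a real type system has many more rules.

```python
def check(expr, env):
    """Return the type of expr, or raise TypeError if a rule is violated.

    expr is a variable name or a tuple (op, left, right);
    env maps variable names to their declared types.
    """
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    left_type, right_type = check(left, env), check(right, env)
    if op == "&&":
        # Both operands must be boolean; the result is boolean.
        if left_type == right_type == "boolean":
            return "boolean"
    elif op in ("+", "*"):
        # Simplified rule: both operands have the same numeric type.
        if left_type == right_type and left_type in ("int", "float"):
            return left_type
    raise TypeError(f"operator {op} cannot be applied to {left_type}, {right_type}")


env = {"p": "boolean", "q": "boolean", "n": "int"}
print(check(("&&", "p", "q"), env))     # boolean
try:
    check(("&&", "p", "n"), env)
except TypeError as err:
    print("type error:", err)
```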

TYPE CONVERSIONS:

Consider expressions like x + i, where x is of type float and i is of type integer. Since the representation of integers and floating-point numbers is different within a computer, and different machine instructions are used for operations on integers and floats, the compiler may need to convert one of the operands of + to ensure that both operands are of the same type when the addition occurs.

Suppose that integers are converted to floats when necessary, using a unary operator (float).

For example, the integer 2 is converted to a float in the code for the expression 2 * 3.14:

t1 = (float) 2

t2 = t1 * 3.14
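The two instructions above can be produced by a generator that compares the operand types and inserts the (float) conversion where they differ. This is a minimal Python sketch that handles only int and float constants; the function names gen and widen are hypothetical.

```python
import itertools


def gen(expr, code, temps):
    """Return (address, type) for expr, inserting (float) where needed."""
    if isinstance(expr, int):
        return str(expr), "int"
    if isinstance(expr, float):
        return str(expr), "float"
    op, left, right = expr
    left_addr, left_type = gen(left, code, temps)
    right_addr, right_type = gen(right, code, temps)
    # The result is float as soon as one operand is float.
    result_type = "float" if "float" in (left_type, right_type) else "int"

    def widen(addr, typ):
        if typ == result_type:
            return addr
        temp = f"t{next(temps)}"
        code.append(f"{temp} = (float) {addr}")
        return temp

    left_addr = widen(left_addr, left_type)
    right_addr = widen(right_addr, right_type)
    temp = f"t{next(temps)}"
    code.append(f"{temp} = {left_addr} {op} {right_addr}")
    return temp, result_type


code = []
gen(("*", 2, 3.14), code, itertools.count(1))
print("\n".join(code))
# t1 = (float) 2
# t2 = t1 * 3.14
```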

CONTROL FLOW:

The translation of statements such as if-else-statements and while-statements is tied to the translation of Boolean expressions.

In programming languages, Boolean expressions are often used to

1. Alter the flow of control: Boolean expressions are used as conditional expressions in statements that alter the flow of control.

2. Compute logical values: A Boolean expression can represent true or false as values.

BOOLEAN EXPRESSIONS:

Boolean expressions are composed of the Boolean operators (which we denote &&, ||, and !, using the C convention for the operators AND, OR, and NOT, respectively) applied to elements that are Boolean variables or relational expressions.

Relational expressions are of the form E1 rel E2, where E1 and E2 are arithmetic expressions.

B → B || B | B && B | !B | (B) | E rel E | true | false

We use the attribute rel.op to indicate which of the six comparison operators <, <=, ==, !=, >, >= is represented by rel.

BOOLEAN EXPRESSIONS: (CONTINUED)

Given the expression B1 || B2, if we determine that B1 is true, then we can conclude that the entire expression is true without having to evaluate B2.

Similarly, given B1 && B2, if B1 is false, then the entire expression is false.
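This short-circuit behavior can be turned into jumps: the code for B1 jumps straight to the true (or false) exit, so the code for B2 is skipped whenever B1 decides the result. The Python sketch below generates such jumps. The instruction forms "if ... goto" and "goto", the label names, and the function name jump_code are assumptions of the example, and the output is deliberately unoptimized.

```python
import itertools

labels = (f"L{n}" for n in itertools.count(1))   # fresh labels L1, L2, ...


def jump_code(b, true_label, false_label, code):
    """Emit jumps so that control reaches true_label exactly when b is true."""
    op = b[0]
    if op == "||":
        rest = next(labels)                 # reached only when B1 is false
        jump_code(b[1], true_label, rest, code)
        code.append(f"{rest}:")
        jump_code(b[2], true_label, false_label, code)
    elif op == "&&":
        rest = next(labels)                 # reached only when B1 is true
        jump_code(b[1], rest, false_label, code)
        code.append(f"{rest}:")
        jump_code(b[2], true_label, false_label, code)
    elif op == "!":
        jump_code(b[1], false_label, true_label, code)
    else:                                   # relational expression (rel, E1, E2)
        code.append(f"if {b[1]} {op} {b[2]} goto {true_label}")
        code.append(f"goto {false_label}")


# x < 100 || x > 200 && x != y
expr = ("||", ("<", "x", "100"), ("&&", (">", "x", "200"), ("!=", "x", "y")))
code = []
jump_code(expr, "Ltrue", "Lfalse", code)
print("\n".join(code))
```

When x < 100 holds, the first instruction jumps to Ltrue and neither of the other two comparisons is executed.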

SWITCH STATEMENTS:

The “switch” or “case” statement is available in a variety of languages. Our switch statement syntax is given below:

switch( E )
{
case V1 : S1
case V2 : S2
………..
case Vn-1 : Sn-1
default : Sn
}

There is a selector expression E, which is to be evaluated, followed by n constant values V1, V2, …, Vn that the expression might take, perhaps including a default “value”, which always matches the expression if no other value does.

TRANSLATION OF SWITCH-STATEMENT

The intended translation of a switch is code to:

1. Evaluate the expression E.
2. Find the value Vj in the list of cases that is the same as the value of the expression.
3. Execute the statement Sj associated with the value found.
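The three steps can be sketched as a generator that emits the code for the statements first and the sequence of tests last. This Python sketch makes several assumptions that are not in the slides: the label names L1, L2, ..., test, and next, the "if t == V goto L" instruction form, and a jump to next after each statement so that only the matching statement is executed.

```python
def translate_switch(selector, cases, default):
    """Return three-address-style code for a switch statement.

    cases is a list of (value, statement) pairs; default is the
    statement executed when no value matches.
    """
    code = [f"t = {selector}", "goto test"]       # step 1: evaluate E
    for j, (_, stmt) in enumerate(cases, start=1):
        code += [f"L{j}:", f"  {stmt}", "  goto next"]
    n = len(cases) + 1
    code += [f"L{n}:", f"  {default}", "  goto next"]
    code.append("test:")                          # step 2: find Vj
    for j, (value, _) in enumerate(cases, start=1):
        code.append(f"  if t == {value} goto L{j}")   # step 3: run Sj
    code.append(f"  goto L{n}")                   # default if nothing matched
    code.append("next:")
    return code


code = translate_switch("x + 1", [("1", "a = 10"), ("2", "a = 20")], "a = 0")
print("\n".join(code))
```

Placing the tests after the statements is one possible layout; a compiler could equally emit the tests first or use a jump table when the case values are dense.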

THE END