Intermediate Code Generation Mooly Sagiv msagiv@post.tau.ac.il Schrierber 317 03-640-7606 Wed 10:00-12:00 html://www.math.tau.ac.il/~msagiv/courses/wcc02.html Chapter 7 (Chapter 6 next week)

Dec 19, 2015

Intermediate Code Generation
Mooly Sagiv
msagiv@post.tau.ac.il
Schrierber 317
03-640-7606
Wed 10:00-12:00
html://www.math.tau.ac.il/~msagiv/courses/wcc02.html
Chapter 7 (Chapter 6 next week)

Basic Compiler Phases

Source program (string)
  -> lexical analysis -> Tokens
  -> syntax analysis -> Abstract syntax tree
  -> semantic analysis -> Translate -> Intermediate representation
  -> Instruction selection -> Assembly
  -> Register Allocation -> Final assembly

Why can’t we translate directly into machine language?

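Why use intermediate languages?

• Simplify the compilation phase
  – ultimately leads to more efficient code
• Portability of the compiler front-end
• Reusability of the compiler back-end

[Figure: without an IR, each source language (Java, C, Pascal, C++, ML) is translated separately to each target (Pentium, MIPS, Sparc); with an IR, each source language is translated to the IR, and the IR to each target.]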

IR Design Goals

• Convenient to generate IR from the source
• Convenient to generate machine code from IR
  – Mismatches between source and target
• Clear operational meaning

Textbook Solution

• Simple intermediate instructions
• Tree-like expressions

A Grammar for the Tree IR

T_stm ::= T_stm T_stm (T_SEQ)
T_stm ::= Temp_label (T_LABEL)
T_stm ::= T_exp Temp_labelList (T_JUMP)
T_stm ::= T_relOp T_exp T_exp Temp_label Temp_label (T_CJUMP)
T_stm ::= T_exp T_exp (T_MOVE)
T_stm ::= T_exp (T_EXP)

T_exp ::= T_binOp T_exp T_exp (T_BINOP)
T_exp ::= T_exp (T_MEM)
T_exp ::= Temp_temp (T_TEMP)
T_exp ::= T_stm T_exp (T_ESEQ)
T_exp ::= Temp_label (T_NAME)
T_exp ::= int (T_CONST)
T_exp ::= T_exp T_expList (T_CALL)

/* tree.h */
typedef struct T_stm_ *T_stm;
typedef struct T_exp_ *T_exp;

/* Operator enumerations are declared before the prototypes that use them. */
typedef enum {T_plus, T_minus, T_mul, T_div, T_and, T_or,
              T_lshift, T_rshift, T_arshift, T_xor} T_binOp;
typedef enum {T_eq, T_ne, T_lt, T_gt, T_le, T_ge,
              T_ult, T_ule, T_ugt, T_uge} T_relOp;

struct T_stm_ {
  enum {T_SEQ, T_LABEL, T_JUMP, …, T_EXP} kind;
  union {
    struct {T_stm left, right;} SEQ;
    …
  } u;
};

T_stm T_Seq(T_stm left, T_stm right);
T_stm T_Label(Temp_label);
T_stm T_Jump(T_exp exp, Temp_labelList labels);
T_stm T_Cjump(T_relOp op, T_exp left, T_exp right,
              Temp_label _true, Temp_label _false);
T_stm T_Move(T_exp, T_exp);
T_stm T_Exp(T_exp);

struct T_exp_ {
  enum {T_BINOP, T_MEM, T_TEMP, …, T_CALL} kind;
  union {
    struct {T_binOp op; T_exp left; T_exp right;} BINOP;
    …
  } u;
};
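A minimal, self-contained sketch of how tagged-union tree nodes of this shape can be built and walked. The names Exp, mk_const, mk_binop, and eval are hypothetical simplifications for illustration, not the textbook's tree.h; only constants and three binary operators are modeled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for T_exp: constants and binary operators only. */
typedef struct Exp_ *Exp;
typedef enum { OP_PLUS, OP_MINUS, OP_MUL } BinOp;

struct Exp_ {
    enum { E_CONST, E_BINOP } kind;
    union {
        int constant;
        struct { BinOp op; Exp left, right; } binop;
    } u;
};

/* Constructor for a constant leaf. */
Exp mk_const(int value) {
    Exp e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = E_CONST;
    e->u.constant = value;
    return e;
}

/* Constructor for an interior node with two subtrees. */
Exp mk_binop(BinOp op, Exp left, Exp right) {
    Exp e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = E_BINOP;
    e->u.binop.op = op;
    e->u.binop.left = left;
    e->u.binop.right = right;
    return e;
}

/* Walk the tree by switching on the node kind, as a translator would. */
int eval(Exp e) {
    if (e->kind == E_CONST) return e->u.constant;
    int l = eval(e->u.binop.left);
    int r = eval(e->u.binop.right);
    switch (e->u.binop.op) {
    case OP_PLUS:  return l + r;
    case OP_MINUS: return l - r;
    case OP_MUL:   return l * r;
    }
    assert(0);
    return 0;
}
```

For instance, eval(mk_binop(OP_MUL, mk_const(3), mk_binop(OP_MINUS, mk_const(5), mk_const(1)))) walks the tree for 3 * (5 - 1).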

Example factorial

let
  function nfactor (n: int): int =
    if n = 0 then 1
    else n * nfactor(n-1)
in
  nfactor(10)
end

Abstract Tiger Program

letExp(
  decList(
    functionDec(
      fundecList(
        fundec(nfactor,
          fieldList(field(n, int, fld-escaped=FALSE), fieldList()),
          int,
          ifExp(
            opExp(EQUAL, varExp(simpleVar(n)), intExp(0)),
            intExp(1),
            opExp(TIMES,
              varExp(simpleVar(n)),
              callExp(nfactor,
                expList(opExp(MINUS, varExp(simpleVar(n)), intExp(1)),
                        expList()))))),
        fundecList())),
    decList()),
  seqExp(expList(
    callExp(nfactor, expList(intExp(10), expList())),
    expList())))

IR for Main

/* prologue of main starts with l1 */
/* body of main */
MOV(TEMP(RV),
    CALL(NAME(l2),
         ExpList(CONST(10), null /* next argument */)))
/* epilogue of main */

IR for nfact

/* Prologue of nfunc starts with l2 */
/* body of nfunc */
MOV(TEMP(RV),
  ESEQ(
    SEQ(CJUMP(=, “n”, CONST(0), NAME(l3), NAME(l4)),
    SEQ(LABEL(l3) /* then-clause */,
    SEQ(MOV(TEMP(t1), CONST(1)),
    SEQ(JUMP(NAME(l5)),
    SEQ(LABEL(l4), /* else-clause */
    SEQ(MOV(TEMP(t1),
            BINOP(MUL, “n”,
              CALL(NAME(l2),
                ExpList(BINOP(MINUS, “n”, CONST(1)),
                        null /* next argument */)))),
        LABEL(l5)))…),
    TEMP(t1)))
/* epilogue of nfunc */
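Read as C with explicit jumps, the IR for the function body corresponds roughly to the sketch below. The label names follow the slide; the C rendering itself, and the use of a local variable for TEMP(t1), are illustrative assumptions.

```c
/* nfactor written with the same labels and jumps as the tree IR. */
int nfactor(int n) {
    int t1;                               /* TEMP(t1) */
    if (n == 0) goto l3; else goto l4;    /* CJUMP(=, n, CONST(0), l3, l4) */
l3:                                       /* then-clause */
    t1 = 1;                               /* MOV(TEMP(t1), CONST(1)) */
    goto l5;                              /* JUMP(NAME(l5)) */
l4:                                       /* else-clause */
    t1 = n * nfactor(n - 1);              /* MOV(TEMP(t1), BINOP(MUL, n, CALL(...))) */
l5:
    return t1;                            /* value of the ESEQ, moved into RV */
}
```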

Outline of the Translation (translate.c)

• Top-down traversal over the abstract syntax tree
• Generate code to allocate memory for declarations and initializations (next week)
• Generate code for function declarations:
  – Prologue
  – The body expression
  – Epilogue
• Generate code for expressions
  – Value expressions
    • x + y
  – Location expressions
    • x < y
• Statements
  – x := y
  – Control flow

The rest of this lecture

• L-values and R-values
• Arithmetic expressions
• Conditionals and Loops
• Conversions
• Complex data types
  – Arrays
  – Structures
• Memory Checks

L-values vs. R-values

• Assignment x := exp is compiled into:
  – Compute the address of x
  – Compute the value of exp
  – Store the value of exp into the address of x
• Generalization
  – R-value
      rval(x) = value of x
      rval(5) = 5
      rval(x + y) = rval(x) + rval(y)
  – L-value
      lval(x) = address of x
      lval(5) = undefined
      lval(x + y) = undefined
      lval(C-array a) = undefined
      lval(C-pointer a) = address of a
      lval(Pascal-array a) = base address of a
      lval(a[e]) = lval(a) + rval(e)
      lval(*e) = rval(e)
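The three assignment steps and two of the l-value rules can be sketched in C, where an l-value is simply a pointer. The helper names assign, lval_index, and lval_deref are hypothetical; C pointer arithmetic scales the index by the element size, which the rule lval(a[e]) = lval(a) + rval(e) leaves implicit.

```c
#include <stddef.h>

/* x := exp: the caller computes lval(x) and rval(exp); this performs the store. */
void assign(int *lval_x, int rval_exp) {
    *lval_x = rval_exp;
}

/* lval(a[e]) = lval(a) + rval(e); the addition is scaled by sizeof(int). */
int *lval_index(int *lval_a, ptrdiff_t rval_e) {
    return lval_a + rval_e;
}

/* lval(*e) = rval(e): the address of *e is just the value of e. */
int *lval_deref(int *rval_e) {
    return rval_e;
}
```

With these helpers, a[2] := a[0] + 5 becomes assign(lval_index(a, 2), *lval_index(a, 0) + 5).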

Translating Expressions

• Straightforward by induction on the abstract expression tree

/* translate.c */
Tr_exp Tr_opExp(A_oper oper, Tr_exp left, Tr_exp right)
{
  switch (oper) {
    case A_plusOp: return Tr_opArithExp(T_plus, left, right);
    case A_minusOp: return Tr_opArithExp(T_minus, left, right);
    case A_timesOp: …
    case A_eqOp: return Tr_opCondExp(T_eq, left, right);
    case A_neqOp: return Tr_opCondExp(T_ne, left, right);
    case A_ltOp: …
  }
  assert(0);
  return NULL;
}

Conditional Expressions

• Translating expressions in conditions may be tricky
• Two options
  – Value computation
    • Compute the value of the Boolean expression
  – Location computation
    • Compute a label in the code that will be reached if the expression holds
    • Allows shortcut computations

Example C code

if (a < 6 && b+1 > 7)
  a = b * c;

CJUMP(<, “a”, CONST(6), l1, l2)
LABEL(l1)
CJUMP(>, BINOP(+, “b”, CONST(1)), CONST(7), l3, l2)
LABEL(l3)
MOVE(“a”, BINOP(*, “b”, “c”))
LABEL(l2)
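The same control flow can be written in C with explicit gotos, which makes the shortcut visible: when a < 6 fails, control reaches l2 without evaluating b+1 > 7. The function name cond_assign and the choice to return the final value of a are assumptions made so the sketch can be tested.

```c
/* if (a < 6 && b + 1 > 7) a = b * c;  returns the final value of a */
int cond_assign(int a, int b, int c) {
    if (a < 6) goto l1; else goto l2;       /* CJUMP(<, a, CONST(6), l1, l2) */
l1:
    if (b + 1 > 7) goto l3; else goto l2;   /* CJUMP(>, b+1, CONST(7), l3, l2) */
l3:
    a = b * c;                              /* MOVE(a, BINOP(*, b, c)) */
l2:
    return a;
}
```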

Conditional Expressions in Tiger

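static Tr_exp Tr_opCondExp(T_relOp oper, Tr_exp left, Tr_exp right)
{
  struct Cx cx;
  /* Both target labels are left NULL; they are filled in later through the patch lists. */
  cx.stm = T_Cjump(oper, left, right, NULL, NULL);
  /* A patch list records the address of each label slot so that it can be patched. */
  cx.trues = PatchList(&cx.stm->u.CJUMP._true, NULL);
  cx.falses = PatchList(&cx.stm->u.CJUMP._false, NULL);
  return Tr_Cx(cx.trues, cx.falses, cx.stm);
}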

if a > b then x := 5

SEQ(CJUMP(GT, “a”, “b”, NAME t, NAME f),
    SEQ(LABEL t,
        SEQ(Code for x := 5,
            LABEL f)))
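The NULL targets of a freshly built CJUMP are filled in once the true and false destinations are known. A minimal sketch of that patching idea follows; the types Label, Cjump, and PatchList and the functions patch_list and do_patch are simplified, hypothetical stand-ins, not the course's translate.c.

```c
#include <stdlib.h>

typedef const char *Label;                 /* stand-in for Temp_label */

/* A conditional jump whose two targets may still be unknown (NULL). */
typedef struct { Label true_target, false_target; } Cjump;

/* A patch list is a linked list of pointers to label slots awaiting a value. */
typedef struct PatchList_ *PatchList;
struct PatchList_ { Label *head; PatchList tail; };

PatchList patch_list(Label *head, PatchList tail) {
    PatchList p = malloc(sizeof *p);
    if (p == NULL) abort();
    p->head = head;
    p->tail = tail;
    return p;
}

/* Write the now-known label into every slot recorded on the list. */
void do_patch(PatchList list, Label label) {
    for (; list != NULL; list = list->tail)
        *list->head = label;
}
```

Because the list stores addresses of the slots rather than the labels themselves, several jumps that share a destination can be patched in one pass.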

Loops

• Similar to if-then-else
• Need to handle break

while a > b do S
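One common label layout for a while loop, sketched in C with gotos: a test label, a body label, and a done label that also serves as the target of break. The function count_down, its loop body, and the stop parameter are hypothetical, chosen only to make the layout testable.

```c
/* while (a > b) { if (a == stop) break; a = a - 1; }  returns the final a */
int count_down(int a, int b, int stop) {
test:
    if (a > b) goto body; else goto done;   /* CJUMP(>, a, b, body, done) */
body:
    if (a == stop) goto done;               /* break: jump to the done label */
    a = a - 1;                              /* the loop body S */
    goto test;                              /* JUMP(NAME(test)) */
done:
    return a;
}
```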

Conversions

• Local translation may lead to converting representations
  – Value-computation <-> Location-computation
• Examples
    if (x+5) then 0 else 1
    (a > b) + b
    x := if (a>b) then a else b
    x := (a > b)
    (if a>b then a else b) + 1
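Two of these conversions, sketched in C with gotos. The scheme shown (preload 1, jump over a store of 0; compare a value against 0 to obtain a jump) is one common approach and is an assumption here, as are the function names.

```c
/* x := (a > b): a location computation converted into a 0/1 value. */
int cond_to_value(int a, int b) {
    int r = 1;                              /* MOVE(TEMP r, CONST 1) */
    if (a > b) goto t; else goto f;         /* CJUMP(>, a, b, t, f) */
f:
    r = 0;                                  /* MOVE(TEMP r, CONST 0) */
t:
    return r;                               /* the value is TEMP r */
}

/* if (x+5) then 0 else 1: a value converted into a jump on "value != 0". */
int value_to_cond(int x) {
    if (x + 5 != 0) goto t; else goto f;    /* CJUMP(!=, x+5, CONST 0, t, f) */
t:
    return 0;                               /* then-branch */
f:
    return 1;                               /* else-branch */
}
```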

Complex Data Types

• Data types like arrays, strings, and records may require special treatment
• Important questions
  – Duration
  – Static vs. dynamic size
  – Structured L-values

Complex Data Types in Tiger

• Arrays, strings, and record fields are long-lived
  – Usually allocated in the heap
  – No structured L-values
• Example: Tiger Record Allocation

type foo = {a : ty1, b : ty2}
... = foo {a = e1, b = e2}

ESEQ(
  SEQ(MOV(TEMP r, CALL(NAME MALLOC, CONST 2*W)),
  SEQ(MOV(MEM(+(0*W, TEMP r)), TransExp(e1)),
      MOV(MEM(+(1*W, TEMP r)), TransExp(e2)))),
  TEMP r)
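The record-allocation IR maps onto C almost line for line. In this sketch the type word (one machine word, so W = sizeof(word)) and the function alloc_foo are hypothetical names, and the field values are assumed to be already evaluated.

```c
#include <stdint.h>
#include <stdlib.h>

typedef intptr_t word;                      /* one machine word; W = sizeof(word) */

/* foo {a = e1, b = e2}: allocate 2*W bytes, store each field, yield the pointer. */
word *alloc_foo(word e1, word e2) {
    word *r = malloc(2 * sizeof(word));     /* MOV(TEMP r, CALL(NAME MALLOC, CONST 2*W)) */
    if (r == NULL) abort();
    r[0] = e1;                              /* MOV(MEM(+(0*W, TEMP r)), TransExp(e1)) */
    r[1] = e2;                              /* MOV(MEM(+(1*W, TEMP r)), TransExp(e2)) */
    return r;                               /* TEMP r is the value of the ESEQ */
}
```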

Example Tiger Arrays

let
  type intArray = array of int
  var a := intArray[12] of 0
  var b := intArray[13] of 7
in
  a := b
end

SEQ(
  SEQ(CONST 0,
  SEQ(MOVE(TEMP ta, CALL(NAME initArray, CONST 12, CONST 0)),
  SEQ(MOVE(TEMP tb, CALL(NAME initArray, CONST 13, CONST 7)),
      MOVE(TEMP ta, TEMP tb)))))

L-values of Arrays and Structures (Tiger)

• The l-value of a[i]
    MEM(+(“a”, *(CONST W, “i”)))
• For a structure s.f
    MEM(+(“s”, *(CONST W, CONST kf)))
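Both address computations are the same base-plus-scaled-offset pattern. The sketch below spells out the byte arithmetic explicitly; the type word and the helpers word_at, lval_subscript, and lval_field are hypothetical, and every element or field is assumed to occupy exactly one word.

```c
#include <stddef.h>
#include <stdint.h>

typedef intptr_t word;                      /* one machine word; W = sizeof(word) */

/* Byte address base + W * k, as in +(base, *(CONST W, k)). */
word *word_at(word *base, size_t k) {
    return (word *)((char *)base + sizeof(word) * k);
}

/* Address of a[i]. */
word *lval_subscript(word *a, size_t i) {
    return word_at(a, i);
}

/* Address of s.f, where f is field number kf of the record. */
word *lval_field(word *s, size_t kf) {
    return word_at(s, kf);
}
```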

Big L-values

• In some programming languages, more than one word needs to be copied or stored
• Examples:
  – C structures
  – Pascal arrays
• How can this be handled?
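One possible answer, sketched under the assumption that an l-value carries a size along with its address: an assignment then copies that many bytes instead of a single word. The struct point3 and the function assign_big are hypothetical.

```c
#include <string.h>

struct point3 { int x, y, z; };             /* hypothetical multi-word value */

/* dst := src for a value wider than one word: copy sizeof(*dst) bytes. */
void assign_big(struct point3 *dst, const struct point3 *src) {
    memcpy(dst, src, sizeof *dst);
}
```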

Memory checks

• Can the compiler guarantee that no invalid memory is referenced?
  – At compile-time
  – At runtime?
• Examples
  – Array references
    • Algol, Pascal, Java, PL/1: runtime checks
    • C: no checks
    • Ada, C#: user control
  – Field and pointer dereferences
• The best solutions combine runtime and compile-time checks
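A runtime check on an array reference amounts to a comparison inserted before the load. In this sketch the function checked_load and its convention of returning a status flag are assumptions; a real runtime would more likely trap or raise an exception.

```c
#include <stddef.h>

/* Stores a[i] in *out and returns 1 when 0 <= i < n; returns 0 otherwise. */
int checked_load(const int *a, size_t n, long i, int *out) {
    if (i < 0 || (size_t)i >= n)            /* the inserted runtime bounds check */
        return 0;
    *out = a[i];
    return 1;
}
```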

Summary

• Intermediate code simplifies the translation and increases re-use
• Tree-like intermediate code simplifies the translation of expressions
  – No temporaries
• Abstract syntax helps
• Memory management is interesting
  – Mostly next week