Theory of Compilation 236360, Erez Petrank
Lecture 11: Optimizations

Transcript
Page 1

Theory of Compilation 236360
Erez Petrank
Lecture 11: Optimizations

Page 2

Register Allocation by Graph Coloring

• Address register allocation by:
  – liveness analysis
  – reduction to graph coloring
  – optimizations by program transformation

• Main idea:
  – register allocation = coloring of an interference graph
  – every node is a variable
  – an edge connects variables that "interfere", i.e., are both live at the same time
  – number of colors = number of registers

Page 3

Example

[Figure: the live ranges of variables v1..v8 over time, and the corresponding interference graph on nodes V1..V8.]

Page 4

Example

Source program:

a = read(); b = read(); c = read();
a = a + b + c;
if (a < 10) {
  d = c + 8;
  print(c);
} else if (a < 20) {
  e = 10;
  d = e + a;
  print(e);
} else {
  f = 12;
  d = f + a;
  print(f);
}
print(d);

As basic blocks:

B1: a = read(); b = read(); c = read();
    a = a + b + c;
    if (a < 10) goto B2 else goto B3
B2: d = c + 8; print(c);
B3: if (a < 20) goto B4 else goto B5
B4: e = 10; d = e + a; print(e);
B5: f = 12; d = f + a; print(f);
B6: print(d);

[Figure: the control-flow graph of B1..B6 with the live variables annotated on its edges: a, b, c after B1; d, e, f inside the branches; d on the edges into B6.]

Page 5

Example: Interference Graph

[Figure: the interference graph with nodes f, a, b, d, e, c, built from the control-flow graph of the previous slide (which is repeated on this slide).]

Page 6

Register Allocation by Graph Coloring

• Variables that interfere with each other cannot be allocated the same register.
• Graph coloring:
  – classic problem: how to color the nodes of a graph with the lowest possible number of colors
  – bad news: the problem is NP-complete (even to approximate)
  – good news: there are pretty good heuristic approaches

Page 7

Heuristic Graph Coloring

• Idea: color the nodes one by one, coloring the "easiest" node last.
• The "easiest" nodes are the ones with the lowest degree: fewer conflicts.
• The algorithm at a high level (see the sketch below):
  – find the least connected node
  – remove the least connected node from the graph
  – color the reduced graph recursively
  – re-attach the least connected node
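A minimal Python sketch of this heuristic, assuming the interference graph is given as an adjacency dict; the function name color_graph and the representation are illustrative, not from the slides:

def color_graph(graph):
    # graph: {node: set of neighbors}; returns {node: color index}.
    if not graph:
        return {}
    # Find the least connected node.
    node = min(graph, key=lambda v: len(graph[v]))
    # Remove it, with all of its edges, from the graph.
    reduced = {v: graph[v] - {node} for v in graph if v != node}
    # Color the reduced graph recursively.
    coloring = color_graph(reduced)
    # Re-attach the node: give it the smallest color unused by its neighbors.
    used = {coloring[u] for u in graph[node] if u in coloring}
    c = 0
    while c in used:
        c += 1
    coloring[node] = c
    return coloring

# A triangle needs 3 colors:
print(color_graph({'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b'}}))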

Page 8

Heuristic Graph Coloring

[Figure: the interference graph on f, a, b, d, e, c is reduced step by step; each removed node is pushed on a stack: first b (stack: b), then c (stack: cb), then a (stack: acb).]

Page 9

Heuristic Graph Coloring

[Figure: the reduction continues, pushing e, d, and f (stack: fdeacb). The nodes are then popped and colored one by one: f gets color 1, d color 2, e color 1, and a color 2 (the stack shrinks back to cb).]

Page 10

Heuristic Graph Coloring

[Figure: popping the remaining nodes, c gets color 1 and b color 3, yielding the full coloring f=1, a=2, b=3, d=2, e=1, c=1.]

Result: 3 registers for 6 variables.

Can we do with 2 registers?

Page 11

Heuristic Graph Coloring

• Two sources of non-determinism in the algorithm:
  – choosing which of the (possibly many) nodes of lowest degree should be detached
  – choosing a free color from the available colors

Page 12

Heuristic Graph Coloring

• The above heuristic gives a coloring of the graph.
• But what we really need is to color the graph with a given number of colors = the number of available registers.
• Many times this is not possible.
• We would like to find the maximum sub-graph that can be colored.
• Vertices that cannot be colored will represent variables that will not be assigned a register.

Page 13

Similar Heuristic

1. Iteratively remove any vertex whose degree < k (with all of its edges).
2. Note: no matter how we color the other vertices, this one can be colored legitimately!

[Figure: the vertex-removal process on the graph over V1..V8.]

Page 14

Similar Heuristic

1. Iteratively remove any vertex whose degree < k (with all of its edges).
2. Note: no matter how we color the other vertices, this one can be colored legitimately!
3. Now all vertices are of degree >= k (or the graph is empty).
4. If the graph is empty: color the vertices one by one as on the previous slides. Otherwise,
5. Choose any vertex and remove it from the graph. Implication: this variable will not be assigned a register. Repeat this step until some vertex has degree < k, and go back to (1).

[Figure: the same process on the graph over V1..V8.]

Page 15

Similar Heuristic

[The same algorithm as on the previous slide, with one more remark:]

Source of non-determinism: choosing which vertex to remove in step (5). This decision determines the number of spills. A sketch of the full procedure follows.
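A minimal Python sketch of this simplify-and-spill procedure, assuming k available registers and the same adjacency-dict representation as in the earlier sketch; the function name and the spill choice (highest degree) are illustrative assumptions, since the slides leave the choice open:

def k_color_with_spills(graph, k):
    # Returns (coloring, spilled): nodes in `spilled` get no register.
    work = {v: set(ns) for v, ns in graph.items()}
    stack, spilled = [], set()
    while work:
        # Step (1): remove a vertex of degree < k, if one exists.
        low = [v for v in work if len(work[v]) < k]
        if low:
            v = low[0]
            stack.append(v)
        else:
            # All degrees >= k: choose a spill candidate (here: max degree).
            v = max(work, key=lambda u: len(work[u]))
            spilled.add(v)
        for u in work[v]:
            work[u].discard(v)
        del work[v]
    coloring = {}
    # Pop and color; every popped vertex had < k neighbors left when pushed,
    # so a free color in range(k) is guaranteed. Spilled vertices are skipped.
    for v in reversed(stack):
        used = {coloring[u] for u in graph[v] if u in coloring}
        coloring[v] = min(c for c in range(k) if c not in used)
    return coloring, spilled

coloring, spilled = k_color_with_spills(
    {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b'}}, 2)
# A triangle with k=2: one node spills, the other two share the 2 colors.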

Page 16

Summary: Code Generation

• Depends on the target language and platform:
  – GNU Assembly
  – the IA-32 platform.
• Basic blocks and the control-flow graph show the program's execution paths.
• Determining variable liveness in a basic block:
  – useful for many optimizations
  – most important use: register allocation.
• Simple code generation.
• Better register allocation via the graph-coloring heuristic.

Page 17

Optimizations

Page 18

Optimization

• Improve performance.
• Must maintain program semantics: the optimized program must be equivalent.
• In contrast to the name, we seldom obtain an optimum.
• We do not improve an inefficient algorithm, and we do not fix bugs.
• Classical question: how much time should we spend on compile-time optimizations to save on running time? (With the parameters unknown.)
• Optimize running time (most popular), optimize size of code, optimize memory usage, optimize energy consumption.

Page 19

Where does inefficiency come from?

• Redundancy in the original program:
  – Sometimes the programmer uses redundancy to make programming easier, knowing that the compiler will remove it.
  – Sometimes the programmer is not very good.
• Redundancy because of the high-level language:
  – E.g., accessing an array element means computing i*4 inside a loop repeatedly.
• Redundancy due to translation:
  – The initial compilation process is automatic and not very clever.

Page 20

Running Time Optimization

• Need to understand the run characteristics (which are often unknown).
  – Usually the program spends most of its time in a small part of the code; if we optimize that part, we gain a lot.
  – Thus, we invest more in inner loops.
  – Example: place functions with high coupling together.
• Need to know the operating system and the architecture.
• We will survey a few simple methods first, starting with building a DAG.

Page 21

Representing a basic block computation with a DAG

• Leaves are variables or constants, marked by their names or values.
• Inner vertices are marked by their operators.
• We also associate variable names with the inner vertices as the computation advances.

(1) t1 := 4 * i
    t2 := a[t1]
    t3 := 4 * i
    t4 := b[t3]
    t5 := t2 * t4
    t6 := prod + t5
    prod := t6
    t7 := i + 1
    i := t7
    if i <= 20 goto (1)

[Figure: the DAG for this block. Leaves: i0, 4, a, b, prod0, 1, 20. A single "*" node labeled t1, t3 is shared by both computations of 4*i; "[ ]" nodes t2 and t4 read a and b; a "*" node t5 multiplies them; a "+" node labeled t6, prod adds prod0; a "+" node labeled t7, i adds 1 to i0; a "<=" node compares i with 20 for the goto (1).]

Page 22

Building the DAG

For each instruction x := y + z:
• Find the current location of y and z.
• Build a new node marked "+" and connect it as a parent of both nodes (if such a parent does not already exist); associate this node with x.
• If x was previously associated with a different node, cancel the previous association (so that it is not used again).
• Do not create a new node for a copy assignment such as x := y. Instead, associate x with the node that y is associated with.
  – Such assignments are typically eliminated during the optimization.

A sketch of this construction appears below.
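A minimal Python sketch of the construction for binary operations and copies, assuming three-address tuples such as ('+', 'x', 'y', 'z') for x := y + z and ('copy', 'x', 'y') for x := y; the function and field names are illustrative:

def build_dag(instructions):
    # A node is ('leaf', name) or (op, left_id, right_id); ids index `nodes`.
    nodes, node_of = [], {}          # node_of: variable -> current node id
    cache = {}                       # (op, l, r) -> node id, reuses parents

    def loc(v):
        # Current location of v: an existing node, or a fresh leaf.
        if v not in node_of:
            node_of[v] = len(nodes)
            nodes.append(('leaf', v))
        return node_of[v]

    for ins in instructions:
        if ins[0] == 'copy':                 # x := y
            _, x, y = ins
            node_of[x] = loc(y)              # no new node; re-associate x
        else:                                # x := y op z
            op, x, y, z = ins
            key = (op, loc(y), loc(z))
            if key not in cache:             # build parent only if missing
                cache[key] = len(nodes)
                nodes.append(key)
            node_of[x] = cache[key]          # cancels x's old association
    return nodes, node_of

nodes, node_of = build_dag([('*', 't1', '4', 'i'), ('*', 't3', '4', 'i')])
# t1 and t3 map to the same node: node_of['t1'] == node_of['t3'].
# (Array reads/stores would need extra care: a store must invalidate cached
# a[...] nodes, as the aliasing slide below explains.)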

Page 23

Using the DAG

[The same block and DAG as on the previous slide, annotated with the rewrites the DAG enables: t3 is replaced by t1; "t6 := prod + t5; prod := t6" becomes "prod := prod + t5"; and "t7 := i + 1; i := t7" becomes "i := i + 1".]

Page 24

Uses of DAGs

• Automatic identification of common expressions.
• Identification of variables that are used in the block.
• Identification of values that are computed but not used.
• Identifying computation dependence (allowing code movements).
• Avoiding redundant copying instructions.

Page 25

Aliasing Problems

• What is wrong with the following optimization?

  x := a[i]              x := a[i]
  a[j] := y      →       a[j] := y
  z := a[i]              z := x

• The problem is a side effect due to aliasing: if i = j, the store to a[j] changes a[i], so reusing x gives z the wrong value.
• Typically, we conservatively assume aliasing: upon an assignment to an array element, we assume no knowledge about any array entry.
• The problem is when we do not know whether aliasing exists.
• Relevant to pointers as well.
• Relevant to routine calls when we cannot determine the routine's side effects.
• Aliasing is a major barrier in program optimizations.

Page 26

Optimization Methods

• In the following slides we review various optimization methods, stressing performance optimizations.
• The main goal: eliminate redundant computations.
• Some methods are platform dependent.
  – On most platforms addition is faster than multiplication.
• Some methods do not look useful on their own, but their combination is effective.

Page 27

Basic Optimizations

• Common expression elimination:
  – We have seen how to identify common expressions in a basic block and eliminate the repeated computation.
  – We will later present data-flow analysis, which helps find such expressions across basic blocks.
• Copy propagation:
  – Given an assignment x := y, we attempt to use y instead of x.
  – Possible outcome: x becomes dead and we can eliminate the assignment.

Page 28

Code motion

• Code motion is useful in various scenarios:
  – identify inner-loop code,
  – identify an expression whose sources do not change in the loop, and
  – move this code outside the loop!

Before:

while (x - 3 < y) {
  // ... instructions that do
  // not change x
}

After:

t1 = x - 3;
while (t1 < y) {
  // ... instructions that do
  // not change x or t1
}

Page 29

Induction Variables & Strength Reduction

• Identify the loop variables, and their relation to other variables.
• Eliminate dependence on induction variables as much as possible.

(1)  i = 0;
(2)  t1 = i * 4;      →  t1 = t1 + 4
(3)  t2 = a[t1]          (t1 must be initialized outside the loop)
(4)  if (t2 > 100) goto (19)
(5)  ...
(17) i = i + 1
(18) goto (2)
(19) ...

• Why is such code (including the multiplication by 4) so widespread? Array accesses inside a loop translate to exactly this pattern (see Page 19).
• On many platforms addition is faster than multiplication (strength reduction).
• Not just strength reduction: we have removed the dependence of t1 on i. Thus, instructions (1) and (17) become irrelevant. The sketch below shows the effect in executable form.
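The same transformation rendered as a minimal, runnable Python sketch (the loop bounds are illustrative; the point is that the multiplication disappears and the offset no longer depends on i):

# Before: the scaled offset is recomputed from i on every iteration.
before = []
i = 0
while i < 5:
    t1 = i * 4          # multiplication inside the loop
    before.append(t1)
    i = i + 1

# After: t1 is initialized outside the loop and updated incrementally;
# the offset computation no longer depends on i at all.
after = []
t1 = 0                  # t1 must be initialized outside the loop
while t1 < 5 * 4:
    after.append(t1)
    t1 = t1 + 4

assert before == after  # same sequence of offsets: 0, 4, 8, 12, 16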

Page 30

Peephole Optimization

• Going over long code is costly.
• A simple and effective alternative (though not optimal) is peephole optimization: check a "small window" of code and improve only this code section.
• Some instructions can be improved even without considering their neighboring instructions.
• For example:
  – x := x * 1;
  – a := a + 0;

Page 31

Peephole Optimizations

• Some optimizations that do not require a global view.
• Simplifying algebraic computations:
  – x := x ^ 2  →  x := x * x
  – x := x * 8  →  x := x << 3
• Code rearrangement:

  (1) if x == 1 goto (3)          (1) if x != 1 goto (19)
  (2) goto (19)             →     (2) ...
  (3) ...

A small matcher in this style is sketched below.
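A minimal Python sketch of a peephole pass over three-address strings; the instruction format and rule set are illustrative, covering only the algebraic identities from these slides:

import re

def peephole(lines):
    # Rewrite trivial algebraic identities inside a one-instruction window.
    # (A real pass must not delete instructions that are jump targets;
    # see the next slide.)
    out = []
    for line in lines:
        # x := x * 1  ->  drop (no-op)
        if re.fullmatch(r'(\w+) := \1 \* 1', line):
            continue
        # a := a + 0  ->  drop (no-op)
        if re.fullmatch(r'(\w+) := \1 \+ 0', line):
            continue
        # x := y * 8  ->  x := y << 3 (power-of-two strength reduction)
        m = re.fullmatch(r'(\w+) := (\w+) \* 8', line)
        if m:
            out.append(f'{m.group(1)} := {m.group(2)} << 3')
            continue
        out.append(line)
    return out

print(peephole(['x := x * 1', 'a := a + 0', 't := y * 8']))
# -> ['t := y << 3']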

Page 32

Peephole Optimizations

• Eliminate redundant instructions:

(1) a := x
(2) x := a        ← looks redundant, but it is a jump target
(3) a := someFunction(a);
(4) x := someOtherFunction(a, x);
(5) if a > x goto (2)

• Careful! If someone jumps to an instruction that we eliminated, a problem arises.
• Execute peephole optimizations within a basic block only, and do not elide the first instruction.

Page 33

void quicksort(m, n)
int m, n; {
  int i, j;
  int v, x;
  if (n <= m) return;
  /* code fragment */
  i = m - 1; j = n; v = a[n];
  while (1) {
    do i = i + 1; while (a[i] < v);
    do j = j - 1; while (a[j] > v);
    if (i >= j) break;
    x = a[i]; a[i] = a[j]; a[j] = x;
  }
  x = a[i]; a[i] = a[n]; a[n] = x;
  quicksort(m, j);
  quicksort(i + 1, n);
}

The code fragment as three-address code:

(1)  i := m - 1          (16) t7 := 4 * i
(2)  j := n              (17) t8 := 4 * j
(3)  t1 := 4 * n         (18) t9 := a[t8]
(4)  v := a[t1]          (19) a[t7] := t9
(5)  i := i + 1          (20) t10 := 4 * j
(6)  t2 := 4 * i         (21) a[t10] := x
(7)  t3 := a[t2]         (22) goto (5)
(8)  if t3 < v goto (5)  (23) t11 := 4 * i
(9)  j := j - 1          (24) x := a[t11]
(10) t4 := 4 * j         (25) t12 := 4 * i
(11) t5 := a[t4]         (26) t13 := 4 * n
(12) if t5 > v goto (9)  (27) t14 := a[t13]
(13) if i >= j goto (23) (28) a[t12] := t14
(14) t6 := 4 * i         (29) t15 := 4 * n
(15) x := a[t6]          (30) a[t15] := x

Page 34

[The same three-address code, partitioned into basic blocks:]

B1: (1)-(4)    the initialization, ending with v := a[t1]
B2: (5)-(8)    the first inner loop; (8) branches back to B2
B3: (9)-(12)   the second inner loop; (12) branches back to B3
B4: (13)       if i >= j goto B6
B5: (14)-(22)  the swap, ending with goto B2
B6: (23)-(30)  the final swap with a[n]

Page 35

Step A: eliminating common sub-expressions globally

B1: i := m - 1
    j := n
    t1 := 4 * n
    v := a[t1]
B2: i := i + 1
    t2 := 4 * i
    t3 := a[t2]
    if t3 < v goto B2
B3: j := j - 1
    t4 := 4 * j
    t5 := a[t4]
    if t5 > v goto B3
B4: if i >= j goto B6
B5: t6 := 4 * i
    x := a[t6]
    t7 := 4 * i
    t8 := 4 * j
    t9 := a[t8]
    a[t7] := t9
    t10 := 4 * j
    a[t10] := x
    goto B2
B6: t11 := 4 * i
    x := a[t11]
    t12 := 4 * i
    t13 := 4 * n
    t14 := a[t13]
    a[t12] := t14
    t15 := 4 * n
    a[t15] := x

Pages 36-46

[These slides repeat the control-flow graph of Page 35, highlighting the global common sub-expressions one at a time: t8 := 4*j and t10 := 4*j in B5 reuse t4; t9 := a[t8] reuses t5; t6 := 4*i, t7 := 4*i, t11 := 4*i, and t12 := 4*i reuse t2; x := a[t6] and x := a[t11] reuse t3; t13 := 4*n and t15 := 4*n reuse t1.]

Page 47

[After Step A, blocks B5 and B6 shrink:]

B5 becomes:
  x := t3
  a[t2] := t5
  a[t4] := x
  goto B2

B6 becomes:
  x := t3
  t14 := a[t1]
  a[t2] := t14
  a[t1] := x

Page 48

Step A was eliminating common expressions globally.
Step B: copy propagation: with f := g, we try to use g and get rid of f.

[In B5 and B6, the copy x := t3 lets us replace the uses of x with t3: a[t4] := x becomes a[t4] := t3, and a[t1] := x becomes a[t1] := t3.]

Page 49

Step A: global common-expression elimination.
Step B: copy propagation.
Step C: dead-code elimination: eliminate redundant code.

[After copy propagation, the assignments x := t3 in B5 and B6 are dead and are removed.]

Page 50

Step A: global common-expression elimination.
Step B: copy propagation.
Step C: dead-code elimination.
Step D: code motion: move expressions outside the loop.

Page 51

Step E: induction variables and strength reduction (identify the loop variables and their relation to other variables).

[In B3, j is an induction variable and t4 tracks 4*j, so

t4 := 4 * j  →  t4 := t4 - 4

with t4 initialized outside the loop.]

Page 52

[Similarly, in B2, i is an induction variable and t2 tracks 4*i, so

t2 := 4 * i  →  t2 := t2 + 4

with t2 := 4 * i initialized outside the loop.]

Page 53

[Common-expression elimination once more: since t2 = 4*i and t4 = 4*j, the test

if i >= j  →  if t2 >= t4

removing the last uses of i and j inside the loop.]

Page 54

[Dead-code elimination (again): with the loop rewritten to use t2 and t4, the updates i := i + 1 and j := j - 1 become dead and are removed.]

Page 55

The overall effect of all the steps (global common-expression elimination, copy propagation, dead-code elimination, code motion, induction variables and strength reduction):

Before:

B1: i := m - 1; j := n; t1 := 4 * n; v := a[t1]
B2: i := i + 1; t2 := 4 * i; t3 := a[t2]; if t3 < v goto B2
B3: j := j - 1; t4 := 4 * j; t5 := a[t4]; if t5 > v goto B3
B4: if i >= j goto B6
B5: t6 := 4 * i; x := a[t6]; t7 := 4 * i; t8 := 4 * j; t9 := a[t8];
    a[t7] := t9; t10 := 4 * j; a[t10] := x; goto B2
B6: t11 := 4 * i; x := a[t11]; t12 := 4 * i; t13 := 4 * n; t14 := a[t13];
    a[t12] := t14; t15 := 4 * n; a[t15] := x

After:

B1: i := m - 1; j := n; t1 := 4 * n; v := a[t1]; t4 := 4 * j; t2 := 4 * i
B2: t2 := t2 + 4; t3 := a[t2]; if t3 < v goto B2
B3: t4 := t4 - 4; t5 := a[t4]; if t5 > v goto B3
B4: if t2 >= t4 goto B6
B5: a[t2] := t5; a[t4] := t3; goto B2
B6: t14 := a[t1]; a[t2] := t14; a[t1] := t3

Page 56

Data Flow Analysis

Page 57

Data Flow Analysis

• Global optimizations.
• We need to understand the flow of data in the program to be able to change the code wisely and correctly.
• This understanding will come from an analysis called data-flow analysis, or DFA.
• It is a set of algorithms, all having the same generic frame; their specifics are determined by the information we are after.
• Used for optimizations, but also for verification.
• Beware: the acronyms have double meanings:
  – In the front-end, DFA = Deterministic Finite Automaton and CFG = Context-Free Grammar.
  – In the back-end, DFA = Data Flow Analysis and CFG = Control Flow Graph.

Page 58

The Idea

• We are given a graph of "program constructs": single instructions, basic blocks, etc.
• The algorithm works in iterations.
• In each iteration we update the information for each node in the graph according to the information in its neighbors.
  – A global view is never necessary.
• The algorithm terminates when no node gets updated.
  – Typically termination is guaranteed, since knowledge only increases in each iteration and there is a limit on the knowledge size.

Page 59

DFA: The Generic Algorithm

• The general structure of any DFA algorithm is as follows. N1…Nn hold the information about the n program nodes (which could be variables, basic blocks, etc.):

for i in {1…n}: initialize Ni

bool change = true;
while (change) {
  change = false;
  for i in {1…n}:
    M = new value for Ni (a function of the neighbors of Ni)
    if (M != Ni) then
      change = true;
      Ni = M
}

Specific instantiations of the generic DFA structure differ in the initialization of the Ni and in the computation of the new values. A runnable version of the skeleton is sketched below.
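A minimal, runnable Python rendering of this skeleton, assuming the per-node information is a set and the specific analysis supplies the init and update functions; the names are illustrative:

def dfa_fixpoint(nodes, init, update):
    # nodes: iterable of node ids.
    # init(v): initial information for node v (a set).
    # update(v, info): new information for v, a function of its neighbors.
    info = {v: init(v) for v in nodes}
    change = True
    while change:
        change = False
        for v in nodes:
            m = update(v, info)      # recompute from the neighbors
            if m != info[v]:
                change = True
                info[v] = m
    return info

The instantiations on the following slides (reaching definitions, liveness, available expressions) plug different init and update functions into this loop.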

Page 60

Reaching Definitions

• A definition is an assignment to a variable v.
• A definition d reaches point p in the program if there is at least one possible execution path from the definition d to the point p such that there is no new definition of the same variable along the path.
• The influence of the assignment a = b + c:
  – it uses the variables b and c,
  – it kills any previous definition of a,
  – it generates a new definition for a.
• Similarly, the instruction "if (a < 3)" uses a but neither kills nor generates any definition.

Page 61

Reformulating "Reaching Definitions"

• A definition: an assignment that gives a value to a variable v.
• Reaching: a definition d reaches a point p in the program if there is at least one path from the definition d to p such that d is not killed on the path.
• Finding reaching definitions: find all the definitions that reach each point in the program.
• This information can be used for optimizations.
• This seems to require going over all paths and all definitions in the entire program.
• (We usually think of the program as a single method (or routine). Inter-procedural analysis examines full modules, classes, or even whole programs.)

Page 62

Computing Reaching Definitions with DFA

• Let Ni:A be the set of all reaching definitions of variable A at line i.
  – There is a DFA variable for each program variable and each code line.
• We start with a subset and gradually enlarge it until it contains all the reaching definitions.
• When there are no more possible enlargements, we know we are done.
• Usually, we do not consider a copy (A = B) as a definition, because A and B are usually united during the optimization.

Page 63

Computing Reaching Definitions with DFA

• Recall that Ni:A is the set of all reaching definitions of variable A at line i.
• Initialization: if line i assigns a constant, an expression, or a function result to variable A (i.e., a non-copying assignment), then Ni:A = {i}. (In this case this is the final value.) Otherwise, Ni:A = ∅ (we need to compute this value in iterations).
• Iteration step:
  – If line i does not update variable A, then
    Ni:A = Nx:A ∪ Ny:A ∪ Nz:A ∪ …
    where x, y, z, etc. are all the lines from which we can go directly to line i.
  – If line i contains a copy A = B, we set Ni:A = Ni:B.

Page 64

Reaching Definitions: an Example

(1) if (b == 4) goto (4)
(2) a = 5
(3) goto (5)
(4) a = 3
(5) if (a > 4) goto (4)
(6) c = a

Initialization: N2:a = {2} and N4:a = {4}; all other sets are empty. The iterations then converge to:

line i   Ni:a     Ni:c
(1)      ∅        ∅
(2)      {2}      ∅
(3)      {2}      ∅
(4)      {4}      ∅
(5)      {2,4}    ∅
(6)      {2,4}    {2,4}

(Line 6 is the copy c = a, so N6:c = N6:a = {2,4}.) The small computation below reproduces these sets.
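Plugging this example into the generic skeleton gives a checkable computation; a sketch, assuming the dfa_fixpoint helper from the Page 59 sketch and encoding only the sets for variable a:

# Reaching definitions of variable `a` for the six-line example.
preds = {1: [], 2: [1], 3: [2], 4: [1, 5], 5: [3, 4], 6: [5]}
defs_a = {2, 4}                      # lines with a non-copy assignment to a

def init(i):
    return {i} if i in defs_a else set()

def update(i, info):
    if i in defs_a:                  # a defining line keeps its own value
        return {i}
    # Otherwise: union over the lines that lead directly to line i.
    return set().union(*(info[p] for p in preds[i])) if preds[i] else set()

result = dfa_fixpoint(preds, init, update)
print(result)  # {1: set(), 2: {2}, 3: {2}, 4: {4}, 5: {2, 4}, 6: {2, 4}}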

Page 65

Correctness Idea

• If x gets updated at some line i, then after k iterations, a line that can be executed k steps after i has been updated and "knows" that i is a definition for x (if there is no closer definition of x on the path from i to it).
• Proof by induction on the "distance" of the definition from the set being updated.
• As the program is finite, the longest (non-cyclic) path in it is finite as well.
• Note that if there is an iteration with no updates, then there will not be any updates in subsequent iterations.
• We do not provide a full proof.

Page 66

A Standard Saving

• Running the iterations over all instructions is costly for large programs.
• A standard solution: run the iterations on basic blocks rather than on each single instruction.
• Instead of working with the graph of program instructions, we work with the control-flow graph of the basic blocks.
• We obtain a smaller graph and a (much) faster algorithm.
• Sometimes the operations inside a basic block cancel each other, and then the computation becomes easier.
• The output is the reaching definitions for each block rather than for each code line.
  – Good enough for optimizations.
  – Can be easily extended to each line inside any given block.

Page 67

How Does it Look Inside a Basic Block?

• We have seen earlier the impact of a single instruction like a = b + c.
• The impact of a basic block is the sum of all the influences in the block.

Page 68

How Does it Look Inside a Basic Block?

• A block uses a variable v if there exists an instruction p1 that uses v and there is no point p0 prior to p1 that defines v locally.
  – An exposed use of v.
  – Simply put: p1 uses v's value that was set before the block started.
• A block kills a definition d of variable v if there is an instruction in the block that defines v.
• A definition d of variable v is generated in a block if the definition d is at location p1 and there is no instruction p2 subsequent to p1 that defines v as well.
  – A locally generated definition.
  – Simply put: the generated definitions are the definitions of B that do not get killed inside B.

Page 69

Basic Block Reaching Definitions

• Use DFA to find the reaching definitions of all basic blocks.
• Data structure:
  – IN[B]: all definitions reaching the beginning of B
  – OUT[B]: all definitions reaching the end of B
• Each assignment gets a name di, and we compute ahead of time the two sets GEN[B] and KILL[B] for each block B:
  – GEN[B]: the set of all definitions generated in B, e.g., GEN[B] = {d3, d7, d8}.
  – KILL[B]: the set of all definitions killed in B; in fact, the set of all program definitions that set a value to a variable v that is also assigned in B.

Page 70

Computing Reaching Definitions with DFA

• DFA initialization: for each block B,
  – IN[B] = ∅
  – OUT[B] = ∅
• DFA step:
  – For each block B, re-compute OUT[B] given IN[B], based only on the instructions of B:
    OUT[B] = ƒB(IN[B])

Page 71

The DFA Step (cont'd)

• We need to compute the reaching definitions at the end of the block given the reaching definitions at its beginning.
• End-of-block reaching definitions = B's generated definitions + (definitions that reach the beginning of B - B's killed definitions).
• In other words:
  OUT[B] = ƒB(IN[B]) = GEN[B] ∪ (IN[B] \ KILL[B])
• To obtain IN[B] we take
  IN[B] = OUT[b1] ∪ OUT[b2] ∪ … ∪ OUT[bk]
  where b1, b2, …, bk are the blocks that reach B directly.
• IN[B] is computed before OUT[B] (which depends on IN[B]).

Page 72

An Example

B1: i = 1        (d1)
    m = a[0]     (d2)
B2: t = a[i]     (d3)
    if (t > m)
B3: m = t        (d4)
B4: i = i + 1    (d5)
    if (i < 10)
B5: (exit)

[Edges: B1→B2; B2→B3 and B2→B4; B3→B4; B4→B2 and B4→B5.]

GEN and KILL:

block   GEN        KILL
B1      {d1,d2}    {d4,d5}
B2      {d3}       ∅
B3      {d4}       {d2}
B4      {d5}       {d1}
B5      ∅          ∅

First iteration:

IN[B1] = ∅               OUT[B1] = {d1,d2}
IN[B2] = {d1,d2}         OUT[B2] = {d1,d2,d3}
IN[B3] = {d1,d2,d3}      OUT[B3] = {d1,d3,d4}
IN[B4] = {d1,d2,d3,d4}   OUT[B4] = {d2,d3,d4,d5}
IN[B5] = OUT[B5] = {d2,d3,d4,d5}

Second iteration (the back edge from B4 into B2 is now taken into account):

IN[B2] = {d1,d2,d3,d4,d5}   OUT[B2] = {d1,d2,d3,d4,d5}
IN[B3] = {d1,d2,d3,d4,d5}   OUT[B3] = {d1,d3,d4,d5}
IN[B4] = {d1,d2,d3,d4,d5}

The OUT sets no longer change, so the algorithm terminates. The sketch below reproduces this computation.
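A checkable rendering of this example, assuming the dfa_fixpoint sketch from Page 59; GEN, KILL, and the predecessor edges are transcribed from the slide:

# Block-level reaching definitions for the example above.
GEN  = {'B1': {'d1','d2'}, 'B2': {'d3'}, 'B3': {'d4'}, 'B4': {'d5'}, 'B5': set()}
KILL = {'B1': {'d4','d5'}, 'B2': set(), 'B3': {'d2'}, 'B4': {'d1'}, 'B5': set()}
preds = {'B1': [], 'B2': ['B1','B4'], 'B3': ['B2'], 'B4': ['B2','B3'], 'B5': ['B4']}

def init(b):
    return set()                        # IN and OUT start empty

def update(b, OUT):
    # IN[B] = union of OUT over the direct predecessors of B.
    in_b = set().union(*(OUT[p] for p in preds[b])) if preds[b] else set()
    return GEN[b] | (in_b - KILL[b])    # OUT[B] = GEN[B] ∪ (IN[B] \ KILL[B])

OUT = dfa_fixpoint(preds, init, update)
print(OUT['B4'])  # {'d2', 'd3', 'd4', 'd5'}, as on the slide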

Page 73

Reaching Definitions with Basic Blocks

• As always, execution terminates when there are no modifications in an iteration.
• At the end, the reaching definitions of block B are IN[B].
• We will not prove the algorithm. Some properties:
  – The values of IN and OUT are always subsets of their real values.
  – Each iteration can only increase the sizes of the subsets.
  – The final size is bounded (by the number of definitions in the program), and hence termination is guaranteed.
• Correctness:
  – If definition d reaches block B along a path of k blocks, then after k iterations IN[B] includes d.
  – A definition that does not reach B will never enter IN[B].

Page 74

Blocks versus Instructions

• Work per iteration:
  – Instructions: each iteration goes over each pair of line i and variable v.
  – Basic blocks: each iteration goes over each block only.
• Number of iterations:
  – Instructions: the longest instruction path between a definition and its use.
  – Basic blocks: the path length is measured in blocks, and so is the number of iterations.
• Number of DFA variables:
  – Instructions: number of program variables * number of code lines.
  – Basic blocks: number of basic blocks.
• Computation:
  – Instructions: OUT[i,a] = GEN[i,a] (and if empty, then IN[i,a]); IN[i,a] = OUT[i1] ∪ OUT[i2] ∪ … ∪ OUT[ik].
  – Basic blocks: OUT[B] = GEN[B] ∪ (IN[B] \ KILL[B]); IN[B] = OUT[b1] ∪ OUT[b2] ∪ … ∪ OUT[bk].
• The output:
  – Instructions: accurate information for each code line and each variable.
  – Basic blocks: information for each block on entry and exit.

Page 75

Uses of Reaching Definitions

• Determine that a variable has a constant value at a given point.
• Identify a variable that might not be initialized:

int i;
if (…) i = 3;
x = i;   ← error: i might not have been initialized

• In OOP: identify an impossible downcast.
• And more…

Page 76

Using DFA for Liveness Analysis

• Definition: a variable v is live at program point p if there is an execution path starting at p in which there is a use of v before it is defined again.
• We have seen previously how to determine v's liveness inside a basic block, by going backwards line by line in the block.
• Now let us do the same computation for a full procedure (or program):
  – We decide which variables are live on entry to each basic block.
  – Go "backwards" on the CFG.

Page 77

DFA Initialization and Computation

• IN[B] and OUT[B] are now sets of variables (and not sets of definitions).
• Initializing the DFA: for all blocks B, IN[B] = OUT[B] = ∅.
• Compute in advance:
  – USE[B]: the set of variables that B uses (without redefining them before the use)
  – DEF[B]: the set of variables that B generates a definition for.
• Computation step:
  OUT[B] = IN[b1] ∪ IN[b2] ∪ … ∪ IN[bn], where b1, …, bn are all the blocks directly reachable from B,
  IN[B] = ƒB(OUT[B]) = USE[B] ∪ (OUT[B] \ DEF[B])

A sketch appears below.
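The backward direction plugs into the same skeleton; a minimal sketch, again assuming the dfa_fixpoint helper, with USE/DEF sets derived from the Page 72 example (treating the array a as a single variable, an illustrative simplification):

# Backward liveness over blocks: the information flows from the successors.
succs = {'B1': ['B2'], 'B2': ['B3', 'B4'], 'B3': ['B4'], 'B4': ['B2', 'B5'], 'B5': []}
USE = {'B1': {'a'}, 'B2': {'a', 'i', 'm'}, 'B3': {'t'}, 'B4': {'i'}, 'B5': set()}
DEF = {'B1': {'i', 'm'}, 'B2': {'t'}, 'B3': {'m'}, 'B4': {'i'}, 'B5': set()}

def init(b):
    return set()                         # IN[B] starts empty

def update(b, IN):
    # OUT[B] = union of IN over the direct successors of B.
    out_b = set().union(*(IN[s] for s in succs[b])) if succs[b] else set()
    return USE[b] | (out_b - DEF[b])     # IN[B] = USE[B] ∪ (OUT[B] \ DEF[B])

IN = dfa_fixpoint(succs, init, update)
print(IN['B2'])  # {'a', 'i', 'm'}: live on entry to the loop test block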

Page 78

Another Example: Available Expressions

• Typically, we assume an entry node B0 in the control-flow graph, from which the computation starts.
• Definition: an expression x OP y is available at point p if each path from the entry point to p has a computation of x OP y with no subsequent update of x or y before reaching p.
• Use for optimization: do not re-compute available expressions.
• In terms of basic blocks:
  – We say that a block kills the expression x OP y if the block assigns a value to x or y and does not re-compute x OP y after the assignment.
  – A block generates the expression x OP y if it computes x OP y and does not update x or y after the computation.

Page 79

Data, Initialization, Computation

• IN[B] and OUT[B] are sets of expressions.
• Initializing the DFA: IN[B] = OUT[B] = ∅ for all blocks B.
• Compute ahead of time:
  – eKILL[B]: the set of expressions that B kills by changing one of the variables in the expression.
  – eGEN[B]: the set of expressions that B generates.
• Computation step:
  IN[B] = OUT[b1] ∩ OUT[b2] ∩ … ∩ OUT[bn]
  where b1, …, bn are all the blocks from which B is (directly) reachable, and
  OUT[B] = ƒB(IN[B]) = eGEN[B] ∪ (IN[B] \ eKILL[B])

Page 80

Is Everything Fine?

• Not really… Consider the graph on the left [a block B1 with an edge into B2, and a back edge into B2].
• IN[B2] = OUT[B1] ∩ OUT[B2].
• Suppose x+y is computed in B1 but not in B2 (and B2 does not kill it).
• Then it is available in B2, but IN[B2] will never see that: OUT[B2] starts empty, so the intersection stays empty.
• The problem: the outputs of B1 are available to B2 and should not be eliminated because of B2's own (initially empty) output.

Page 81

Solution: Proper Initialization

• Create an empty entry block B0 and set OUT[B0] = ∅.
• But for all other blocks set OUT[Bi] = U, where U is the set of all expressions computed in any basic block.
• The computation step remains
  IN[B] = OUT[b1] ∩ OUT[b2] ∩ … ∩ OUT[bn],
  where b1, …, bn are all the blocks from which B is (directly) reachable, and
  OUT[B] = ƒB(IN[B]) = eGEN[B] ∪ (IN[B] \ eKILL[B])

A sketch with this initialization appears below.
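A minimal sketch of available expressions with this initialization, again assuming the dfa_fixpoint helper; the tiny CFG and the eGEN/eKILL sets are illustrative (they mirror the scenario of Page 80), not the slide's figure:

# Available expressions: forward analysis, intersection over the
# predecessors, all blocks except the entry initialized to U.
preds = {'B0': [], 'B1': ['B0'], 'B2': ['B1', 'B2']}   # B2 has a self-loop
eGEN  = {'B0': set(), 'B1': {'x+y'}, 'B2': set()}
eKILL = {'B0': set(), 'B1': set(),   'B2': set()}
U = {'x+y'}                       # all expressions computed in any block

def init(b):
    return set() if b == 'B0' else U

def update(b, OUT):
    ins = [OUT[p] for p in preds[b]]
    in_b = set.intersection(*ins) if ins else set()
    return eGEN[b] | (in_b - eKILL[b])

OUT = dfa_fixpoint(preds, init, update)
print(OUT['B2'])  # {'x+y'}: available in B2 thanks to the U initialization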

Page 82

Example

[Figure: a CFG with entry B0 and blocks B1..B4; one block computes Z = x + y, another computes Z = x + y followed by X = 7 (which kills x+y), and another computes W = x + y. Initialization: OUT[B0] = ∅ and OUT[B1] = OUT[B2] = OUT[B3] = OUT[B4] = U.]

Page 83

Example

[Figure: the same CFG, now with W = x + v. After the iterations, IN[B1] = ∅ and the IN/OUT sets of the other blocks shrink from U to the expressions actually available along every path, e.g. {x+y} and {x+y, x+v}, depending on the path and on the X = 7 kill.]

Page 84

Correctness

• Information flows from blocks to their neighbors during the DFA steps.
• The values in IN[B0] and OUT[B0] are always empty and correct.
• The values of OUT[B] are always a superset of the available expressions at exit from B.
• Induction: after n iterations, OUT[B] is correct for all blocks whose distance from B0 is less than n.
• Proof idea: if there is a path of length n from B0 to block B on which x+y is not computed, then after n iterations OUT[B] will not include x+y.
• We do not provide a full proof, but note that the initialization is crucial.

Page 85

Comparison

• DFA variables are sets of...
  – Reaching definitions: definitions
  – Liveness: program variables
  – Available expressions: expressions
• Computation direction:
  – Reaching definitions: forwards, OUT[B] = ƒB(IN[B])
  – Liveness: backwards, IN[B] = ƒB(OUT[B])
  – Available expressions: forwards, OUT[B] = ƒB(IN[B])
• Computation step ƒB(x):
  – Reaching definitions: GEN[B] ∪ (x \ KILL[B])
  – Liveness: USE[B] ∪ (x \ DEF[B])
  – Available expressions: eGEN[B] ∪ (x \ eKILL[B])
• Propagation from the neighbors:
  – Reaching definitions: union over the predecessors
  – Liveness: union over the successors
  – Available expressions: intersection over the predecessors
• Initialization:
  – Reaching definitions: ∅
  – Liveness: ∅
  – Available expressions: OUT[B] = U (and OUT[B0] = ∅)

Page 86

Optimizations Summary

• Improve performance while preserving semantics.
• We only discussed running-time optimization (which is the common one).
• Representing the program as a DAG helps identify common expressions, eliminate redundant copying, and support other analyses.
• Often, aliasing makes things tougher.
• Basic optimizations: common sub-expression elimination, copy propagation, code motion, strength reduction, dead-code elimination.
• Local optimization framework: peephole optimization.
• A generic algorithm for computing global information: Data Flow Analysis.
• DFA examples: reaching definitions (at the instruction and at the basic-block level), liveness analysis, available expressions.

Page 87

Course Summary

• Lexical analysis: find tokens using DFAs.
• Parsing: analyze structure using context-free grammars.
  – Top-down (LL), bottom-up (LR), lookahead…
• Semantic analysis computes attributes of grammar variables.
  – Many times it can be done during parsing.
  – Check types and create intermediate code.
• Runtime: runtime stacks, memory management.
• Code generation: simple code generation and register allocation.
• Optimizations, and the analyses they employ.

Page 88

Administration

• Exam on Thursday, February 7th.
• 20% for "don't know" (not for a missing answer).
• Material: everything that appeared in the lectures, exercises, and homework.
• During the last lecture (Thursday 24/1) Adi will run a rehearsal exercise here in Taub 2.
• Help in solving previous years' tests in the TAs' reception hours.
• Moed C (third exam date): only for reservists. Please notify both the lecturer and the TA in charge about a potential need for Moed C up to two weeks before the Moed B exam.

Page 89

Typical Test Questions

• Short questions:
  – What happens during compile time and what during execution?
  – When are errors discovered?
• Parsing:
  – Build an LR grammar for the language.
  – Is the following grammar in LR(0), SLR(1), LALR(1), LR(1)?
  – Something else…
• Runtime
• Backpatching
• DFA

Page 90

A question about reference counting

• Q: What is the number of references that can refer to a specific given location?
• A: The entire virtual memory. (Implication: RC size = pointer size.)
• Q: Suppose the RC has only 3 bits. How can an overflow happen?
• A: 9 pointers reference an object.
• Q: Suppose we consider an RC that has reached "111" as "stuck" and never change it anymore. How does that influence the execution?
  – Will the program run correctly?
  – Will it consume more memory?
• A: It will run correctly and consume more memory.

Page 91

A question about reference counting

• Q: Propose a manner to fix all stuck counts of live objects.
• A: Run tracing (as in mark-sweep). Upon checking an edge, increment the RC of the child. (Of course, an RC may get stuck again, but those that should not will not…)