Code Optimization, P K Singh, M M M Engg. College, Gorakhpur
Page 1: Optimization

Code Optimization

P K Singh, M M M Engg. College, Gorakhpur

Page 2: Optimization

Code Optimization and Phases

source program -> front end -> intermediate code -> code optimizer -> intermediate code -> code generator -> target code

symbol table: shared by the front end, the code optimizer, and the code generator

Page 3: Optimization

Optimization

• Code produced by standard algorithms can often be made to run faster, take less space, or both
• These improvements are achieved through transformations called optimizations
• Compilers that apply these transformations are called optimizing compilers
• It is especially important to optimize frequently executed parts of a program
  – The 90/10 rule: most programs spend about 90% of their execution time in 10% of the code
  – Profiling can be helpful, but compilers do not typically have sample input data
  – Inner loops tend to be good candidates for optimization

Page 4: Optimization

Criteria for Transformations

• A transformation must preserve the meaning of a program
  – Cannot change the output produced for any input
  – Cannot introduce an error
• Transformations should, on average, speed up programs
• Transformations should be worth the effort

Page 5: Optimization

Beyond Optimizing Compilers

• Real improvements can be made at various phases
• Source code:
  – Algorithmic transformations can produce spectacular improvements
  – Profiling can be helpful to focus a programmer's attention on important code
• Intermediate code:
  – Compiler can improve loops, procedure calls, and address calculations
  – Typically only optimizing compilers include this phase
• Target code:
  – Compilers can use registers efficiently
  – Peephole transformations can be applied

Page 6: Optimization

Peephole Optimizations

• A simple technique for locally improving target code (can also be applied to intermediate code)
• The peephole is a small, moving window on the target program
• Each improvement replaces the instructions of the peephole with a shorter or faster sequence
• Each improvement may create opportunities for additional improvements
• Repeated passes may be necessary

Page 7: Optimization

Redundant-Instruction Elimination

• Redundant loads and stores:
    (1) MOV R0, a
    (2) MOV a, R0
  – Instruction (2) can be deleted if it has no label: whenever (2) executes, (1) has already ensured that R0 holds the value of a
• Unreachable code:
    #define debug 0
    if (debug) {
        /* print debugging information */
    }
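The redundant load/store rule above can be sketched as a single peephole pass. This is a minimal illustration under stated assumptions, not a production pass: the `Mov` struct, its `has_label` flag, and the function name `remove_redundant_moves` are hypothetical, and operands are compared as plain strings in the slide's `MOV source, destination` order.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical instruction form: MOV src, dst (source first, as on the slide). */
typedef struct {
    const char *src;
    const char *dst;
    int has_label;   /* nonzero if some jump may target this instruction */
} Mov;

/* Removes each unlabeled MOV that merely reverses the last MOV that was kept.
   Compacts code[] in place and returns the new instruction count. */
size_t remove_redundant_moves(Mov *code, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 && !code[i].has_label) {
            const Mov *prev = &code[out - 1];
            if (strcmp(prev->src, code[i].dst) == 0 &&
                strcmp(prev->dst, code[i].src) == 0) {
                continue;   /* the value is already where this MOV would put it */
            }
        }
        code[out++] = code[i];
    }
    return out;
}
```

A labeled instruction is kept because control may reach it from a jump, in which case the preceding MOV has not necessarily executed.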

Page 8: Optimization

Flow-of-Control Optimizations

• Jumps to jumps, jumps to conditional jumps, and conditional jumps to jumps are not necessary
• Jumps to jumps example:
  – The following replacement is valid:

    Before:                After:
        goto L1                goto L2
        ...                    ...
    L1: goto L2            L1: goto L2

  – If there are no other jumps to L1 and L1 is preceded by an unconditional jump, the statement at L1 can be eliminated
• Jumps to conditional jumps and conditional jumps to jumps lead to similar transformations
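The jump-to-jump replacement can be sketched as follows. The encoding is an assumption made for illustration: `jump_to[i]` holds the index of the instruction targeted by an unconditional `goto` at position `i`, or -1 for any other instruction, and `thread_jumps` is a hypothetical name.

```c
#include <stddef.h>

/* jump_to[i] is the target index if instruction i is "goto target",
   or -1 for any other instruction (hypothetical encoding).
   All targets are assumed to be valid indices into the array. */
void thread_jumps(int *jump_to, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (jump_to[i] < 0) continue;            /* not a goto */
        int target = jump_to[i];
        size_t hops = 0;
        /* Follow goto -> goto chains; the hop bound guards against cycles. */
        while (jump_to[target] >= 0 && hops < n) {
            target = jump_to[target];
            hops++;
        }
        jump_to[i] = target;                     /* jump straight to the final target */
    }
}
```

After this pass, a `goto` that is no longer targeted and follows an unconditional jump is unreachable and can be removed by a separate step.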

Page 9: Optimization

Other Peephole Optimizations

• A few algebraic identities that occur frequently (such as x := x + 0 or x := x * 1) can be eliminated
• Reduction in strength replaces expensive operations with cheaper ones
  – Calculating x * x is likely much cheaper than x^2 using an exponentiation routine
  – It may be cheaper to implement x * 5 as x * 4 + x
• Some machines may have hardware instructions to implement certain specific operations efficiently
  – For example, auto-increment may be cheaper than a straightforward x := x + 1
  – Auto-increment and auto-decrement are also useful when pushing onto or popping off of a stack
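As a small illustration of the x * 5 case, the two functions below compute the same value. The function names are hypothetical, and unsigned arithmetic is an assumption chosen so that the shift is well defined for every input. Whether the shifted form is actually faster depends on the target machine.

```c
/* Straightforward version: one multiplication. */
unsigned times5_mul(unsigned x) {
    return x * 5u;
}

/* Strength-reduced version: x * 4 + x, with the multiply by 4 done as a left shift. */
unsigned times5_shift(unsigned x) {
    return (x << 2) + x;
}
```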

Page 10: Optimization

Optimizing Intermediate Code

• This phase is generally only included in optimizing compilers
• Offers the following advantages:
  – Operations needed to implement high-level constructs are made explicit (e.g., address calculations)
  – Intermediate code is independent of the target machine; the code generator can be replaced for a different machine
• We are assuming intermediate code uses three-address instructions

Page 11: Optimization

QuickSort in C

/* Sorts a[m..n]; a is assumed to be a global array of int */
void quicksort(int m, int n) {
    int i, j, v, x;

    if (n <= m) return;

    /* Start of partition code */
    i = m-1; j = n; v = a[n];
    while (1) {
        do i = i+1; while (a[i] < v);
        do j = j-1; while (a[j] > v);
        if (i >= j) break;
        x = a[i]; a[i] = a[j]; a[j] = x;
    }
    x = a[i]; a[i] = a[n]; a[n] = x;
    /* End of partition code */

    quicksort(m, j); quicksort(i+1, n);
}

Page 12: Optimization

Partition in Three-Address Code

(1)  i := m-1                   (16) t7 := 4*i
(2)  j := n                     (17) t8 := 4*j
(3)  t1 := 4*n                  (18) t9 := a[t8]
(4)  v := a[t1]                 (19) a[t7] := t9
(5)  i := i+1                   (20) t10 := 4*j
(6)  t2 := 4*i                  (21) a[t10] := x
(7)  t3 := a[t2]                (22) goto (5)
(8)  if t3 < v goto (5)         (23) t11 := 4*i
(9)  j := j-1                   (24) x := a[t11]
(10) t4 := 4*j                  (25) t12 := 4*i
(11) t5 := a[t4]                (26) t13 := 4*n
(12) if t5 > v goto (9)         (27) t14 := a[t13]
(13) if i >= j goto (23)        (28) a[t12] := t14
(14) t6 := 4*i                  (29) t15 := 4*n
(15) x := a[t6]                 (30) a[t15] := x

Page 13: Optimization

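Partition Flow Graph

Basic blocks of the three-address partition code:
B1: (1)-(4)
B2: (5)-(8)
B3: (9)-(12)
B4: (13)
B5: (14)-(22)
B6: (23)-(30)

Edges: B1 -> B2, B2 -> B2, B2 -> B3, B3 -> B3, B3 -> B4, B4 -> B5, B4 -> B6, B5 -> B2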

Page 14: Optimization

Local vs. Global Transformations

• Local transformations involve statements within a single basic block
• All other transformations are called global transformations
• Local transformations are generally performed first
• Many types of transformations can be performed either locally or globally

Page 15: Optimization

Code Optimizations

• Local/global common subexpression elimination
• Dead-code elimination
• Instruction reordering
• Constant folding
• Algebraic transformations
• Copy propagation
• Loop optimizations

Page 16: Optimization

Loop Optimizations

• Code motion
• Induction variable elimination
• Reduction in strength
• … lots more

Page 17: Optimization

Code Motion

Before:                         After:
B1: i := 0                      B1: i := 0
                                    t1 := n-2
B2: t2 := 4*i                   B2: t2 := 4*i
    A[t2] := 0                      A[t2] := 0
    i := i+1                        i := i+1
    t1 := n-2
    if i < t1 goto B2               if i < t1 goto B2
B3:                             B3:

Move loop-invariant computations before the loop

Page 18: Optimization

Strength Reduction

Before:                         After:
B1: i := 0                      B1: i := 0
    t1 := n-2                       t1 := n-2
                                    t2 := 4*i
B2: t2 := 4*i                   B2: A[t2] := 0
    A[t2] := 0                      i := i+1
    i := i+1                        t2 := t2+4
    if i < t1 goto B2               if i < t1 goto B2
B3:                             B3:

Replace expensive computations with induction variables

Page 19: Optimization

Induction Variable Elimination

Before:                         After:
B1: i := 0                      B1: t1 := 4*n
    t1 := n-2                       t1 := t1-8
    t2 := 4*i                       t2 := 0    (4*i with i = 0)
B2: A[t2] := 0                  B2: A[t2] := 0
    i := i+1                        t2 := t2+4
    t2 := t2+4
    if i<t1 goto B2                 if t2<t1 goto B2
B3:                             B3:

Replace induction variable in expressions with another
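The effect of strength reduction followed by induction-variable elimination on this loop can be sketched in C. Two assumptions are made for illustration: `A` is modelled as a byte array so that `t2` acts as a byte offset, and the function names are hypothetical. The loops use `do ... while` to match the flow graph, where the test sits at the bottom of B2.

```c
/* Original loop: 4*i is recomputed on every iteration. */
void zero_original(char *A, int n) {
    int i = 0;
    int t1 = n - 2;
    do {
        int t2 = 4 * i;
        A[t2] = 0;
        i = i + 1;
    } while (i < t1);
}

/* After strength reduction (t2 := t2+4 replaces t2 := 4*i) and
   induction-variable elimination (the test on i becomes a test on t2).
   The tests agree because i < n-2 holds exactly when 4*i < 4*n-8. */
void zero_optimized(char *A, int n) {
    int t1 = 4 * n;
    t1 = t1 - 8;
    int t2 = 0;              /* 4*i with i = 0 */
    do {
        A[t2] = 0;
        t2 = t2 + 4;
    } while (t2 < t1);
}
```

Both versions write zero at byte offsets 0, 4, 8, ... and stop after the same number of iterations. The optimized loop has no multiplication and no variable `i`.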

Page 20: Optimization

Common Subexpressions

• E is a common subexpression if:
  – E was previously computed
  – Variables in E have not changed since the previous computation
• Can avoid recomputing E if the previously computed value is still available
• DAGs are useful to detect common subexpressions

Page 21: Optimization

Local Common Subexpressions

Before:                         After:
    t6 := 4*i                       t6 := 4*i
    x := a[t6]                      x := a[t6]
    t7 := 4*i                       t8 := 4*j
    t8 := 4*j                       t9 := a[t8]
    t9 := a[t8]                     a[t6] := t9
    a[t7] := t9                     a[t8] := x
    t10 := 4*j                      goto B2
    a[t10] := x
    goto B2
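A minimal sketch of local common-subexpression elimination within one basic block follows. It is not the DAG-based method mentioned earlier: it uses a simple pairwise scan instead, handles only instructions of the form `dst := lhs op rhs`, and ignores commutativity and array accesses. The `Quad` struct and the name `local_cse` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical three-address form: dst := lhs op rhs.
   op == '=' marks a copy (dst := lhs); rhs is then unused (""). */
typedef struct {
    const char *dst;
    const char *lhs;
    char op;
    const char *rhs;
} Quad;

/* True if instruction q assigns to the name v. */
static int defines(const Quad *q, const char *v) {
    return strcmp(q->dst, v) == 0;
}

/* Rewrites each recomputation of an available expression into a copy from
   the name that already holds the value. Returns the number of rewrites. */
size_t local_cse(Quad *block, size_t n) {
    size_t rewritten = 0;
    for (size_t i = 0; i < n; i++) {
        if (block[i].op == '=') continue;            /* copies are not candidates */
        for (size_t j = 0; j < i; j++) {
            if (block[j].op != block[i].op ||
                strcmp(block[j].lhs, block[i].lhs) != 0 ||
                strcmp(block[j].rhs, block[i].rhs) != 0) continue;
            /* The earlier value is available only if neither operand nor the
               earlier destination has been reassigned in between. */
            int available = 1;
            for (size_t k = j; k < i && available; k++) {
                if (defines(&block[k], block[i].lhs) ||
                    defines(&block[k], block[i].rhs)) available = 0;
                if (k > j && defines(&block[k], block[j].dst)) available = 0;
            }
            if (available) {
                block[i].lhs = block[j].dst;         /* dst := earlier result */
                block[i].op = '=';
                block[i].rhs = "";
                rewritten++;
                break;
            }
        }
    }
    return rewritten;
}
```

Applied to the four multiplications of block B5, this turns `t7 := 4*i` into `t7 := t6` and `t10 := 4*j` into `t10 := t8`. Copy propagation and dead-code elimination would then remove the copies, giving the "after" column shown on the slide.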

Page 22: Optimization

Global Common Subexpressions

Page 23: Optimization

Copy Propagation

• Assignments of the form f := g are called copy statements (or copies)
• The idea behind copy propagation is to use g for f whenever possible after such a statement
• For example, applied to block B5 of the previous flow graph, we obtain:
    x := t3
    a[t2] := t5
    a[t4] := t3
    goto B2
• Copy propagation often turns the copy statement into "dead code"
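Copy propagation inside a single basic block can be sketched as below. The representation is an assumption made for brevity: variables are numbered with small non-negative integers, -1 stands for a constant or an unused operand, `op == 0` marks a copy, and the names `Tac` and `propagate_copies` are hypothetical.

```c
#include <stddef.h>

/* dst := src1 op src2; op == 0 marks a copy dst := src1.
   Variables are non-negative ids; -1 means constant or unused operand. */
typedef struct {
    int dst, src1, src2, op;
} Tac;

/* After each copy f := g, uses g in place of f in later instructions of the
   block until f or g is reassigned. */
void propagate_copies(Tac *block, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (block[i].op != 0) continue;              /* not a copy statement */
        int f = block[i].dst;
        int g = block[i].src1;
        if (g < 0 || f == g) continue;               /* only variable-to-variable copies */
        for (size_t k = i + 1; k < n; k++) {
            /* Operands are read before the destination is written. */
            if (block[k].src1 == f) block[k].src1 = g;
            if (block[k].src2 == f) block[k].src2 = g;
            if (block[k].dst == f || block[k].dst == g) break;   /* copy no longer holds */
        }
    }
}
```

The copy statement itself is left in place. If nothing reads `f` afterwards, it has become dead code.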

Page 24: Optimization

Dead-Code Elimination

• Dead code includes code that can never be reached and code that computes a value that never gets used
• Consider: if (debug) print …
  – It can sometimes be deduced at compile time that the value of an expression is constant
  – Then the constant can be used in place of the expression (constant folding)
  – Let's assume a previous statement assigns debug := false and the value never changes
  – Then the print statement becomes unreachable and can be eliminated
• Consider the example from the previous slide
  – The value of x computed by the copy statement never gets used after the copy propagation
  – The copy statement is now dead code and can be eliminated
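Detecting assignments whose values are never used can be sketched as a backward scan over one basic block. The sketch assumes that variables are numbered 0 to 31 so that a live set fits in a 32-bit mask, that assignments have no side effects, and that the set of variables live on exit from the block is already known. `Assign` and `mark_dead` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* dst := src1 op src2, with variables numbered 0..31;
   -1 means constant or unused operand. */
typedef struct {
    int dst, src1, src2;
} Assign;

/* live_out has bit v set if variable v is used after the block.
   Sets dead[i] = 1 for each assignment whose value is never used,
   0 otherwise, and returns the number of dead assignments. */
size_t mark_dead(const Assign *block, size_t n, uint32_t live_out, int *dead) {
    uint32_t live = live_out;
    size_t count = 0;
    for (size_t i = n; i-- > 0; ) {                  /* scan from last to first */
        uint32_t def = UINT32_C(1) << block[i].dst;
        if (!(live & def)) {
            dead[i] = 1;                             /* result is never read */
            count++;
            continue;                                /* its operands stay as they were */
        }
        dead[i] = 0;
        live &= ~def;                                /* dst is redefined here */
        if (block[i].src1 >= 0) live |= UINT32_C(1) << block[i].src1;
        if (block[i].src2 >= 0) live |= UINT32_C(1) << block[i].src2;
    }
    return count;
}
```

With `x := t3` followed only by instructions that read `t3` directly, the scan marks the copy as dead, which matches the copy-propagation example.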

Page 25: Optimization

Loop Optimizations (1)

• The running time of a program may be improved if we decrease the number of statements in an inner loop, even if that increases the number of statements outside the loop
• Code motion moves code outside of a loop
  – For example, consider:
        while (i <= limit-2) …
  – The result of code motion would be:
        t = limit - 2
        while (i <= t) …
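The `limit-2` example can be written out as two complete C functions. The loop body (incrementing `i` and counting iterations) is an assumption added only to make the sketch runnable. The transformation is valid here because `limit` is not modified inside the loop.

```c
/* Before: limit - 2 is evaluated on every test of the loop condition. */
int count_before(int i, int limit) {
    int iterations = 0;
    while (i <= limit - 2) {
        i++;
        iterations++;
    }
    return iterations;
}

/* After code motion: the loop-invariant expression is computed once. */
int count_after(int i, int limit) {
    int iterations = 0;
    int t = limit - 2;
    while (i <= t) {
        i++;
        iterations++;
    }
    return iterations;
}
```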

Page 26: Optimization

Loop Optimizations (2)

• Induction variables: variables that remain in "lock-step"
  – For example, in block B3 of the previous flow graph, j and t4 are induction variables
  – Induction-variable elimination can sometimes eliminate all but one of a set of induction variables
• Reduction in strength replaces a more expensive operation with a less expensive one
  – For example, in block B3, t4 decreases by four with every iteration
  – If initialized correctly, can replace multiplication with subtraction
  – Often application of reduction in strength leads to induction-variable elimination
• Methods exist to recognize induction variables and apply appropriate transformations automatically

Page 27: Optimization

Loop Optimization Example

Page 28: Optimization

Example

• Given the following code segment, obtain:
  – The 3AC statements for this computation
  – The basic blocks and the flow graph

    prod = 0; i = 1;
    do {
        prod = prod + a[i] * b[i];
        i = i + 1;
    } while ( i <= 20 );

Page 29: Optimization

Example …contd

• 3AC for the given code segment
    (1)  prod := 0
    (2)  i := 1
    (3)  t1 := 4 * i
    (4)  t2 := a [t1]
    (5)  t3 := 4 * i
    (6)  t4 := b [t3]
    (7)  t5 := t2 * t4
    (8)  t6 := prod + t5
    (9)  prod := t6
    (10) t7 := i + 1
    (11) i := t7
    (12) if i <= 20 goto (3)

Page 30: Optimization

Example …contd

2 Basic blocks

B1: (1)  prod := 0
    (2)  i := 1

B2: (3)  t1 := 4 * i
    (4)  t2 := a [t1]
    (5)  t3 := 4 * i
    (6)  t4 := b [t3]
    (7)  t5 := t2 * t4
    (8)  t6 := prod + t5
    (9)  prod := t6
    (10) t7 := i + 1
    (11) i := t7
    (12) if i <= 20 goto (3)

Page 31: Optimization

Example …contd

Flow Graph

B1: prod := 0
    i := 1

B2: t1 := 4 * i
    t2 := a [t1]
    t3 := 4 * i
    t4 := b [t3]
    t5 := t2 * t4
    t6 := prod + t5
    prod := t6
    t7 := i + 1
    i := t7
    if i <= 20 goto B2

Edges: B1 -> B2, B2 -> B2
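The partition into basic blocks used in this example follows the standard leader rules: the first statement is a leader, any jump target is a leader, and any statement immediately after a jump is a leader. A minimal sketch of that rule, assuming a hypothetical encoding in which `target[i]` is the 0-based index that statement `i` jumps to, or -1 if it is not a jump:

```c
#include <stddef.h>

/* Marks leader[i] = 1 for every statement that starts a basic block and
   returns the number of basic blocks. Targets are assumed to be valid indices. */
size_t mark_leaders(const int *target, size_t n, int *leader) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) leader[i] = 0;
    if (n > 0) leader[0] = 1;                        /* rule 1: first statement */
    for (size_t i = 0; i < n; i++) {
        if (target[i] < 0) continue;
        leader[target[i]] = 1;                       /* rule 2: target of a jump */
        if (i + 1 < n) leader[i + 1] = 1;            /* rule 3: statement after a jump */
    }
    for (size_t i = 0; i < n; i++) count += (size_t)leader[i];
    return count;
}
```

For the twelve statements above, the only jump is statement (12), which targets statement (3). The leaders are therefore (1) and (3), giving the two blocks B1 and B2.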

Page 32: Optimization

Implementing a Code Optimizer

• Organization consists of control-flow analysis, then data-flow analysis, and then transformations
• The code generator is applied to the transformed intermediate code
• Details of code optimization are beyond the scope of this course