Data Flow Analysis 3 15-411 Compiler Design Nov. 8, 2005

Data Flow Analysis 3

15-411 Compiler Design

Nov. 8, 2005

Key Reference on Global Optimization

Gary A. Kildall, A Unified Approach to Global Program Optimization, ACM Symposium on Principles of Programming Languages, 1973, pages 194-206.

From the abstract:

“A technique is presented for global analysis of object code generated for expressions. The global expression optimization presented includes constant propagation, common sub-expression elimination, elimination of redundant register load operations and live expression analysis. A general purpose program flow analysis algorithm is developed which depends on an optimizing function. The algorithm is defined formally using a directed graph model of program flow structure and is shown to be correct. …”

Kildall’s Contribution

• A number of techniques had been developed for compile-time optimization to
    - locate redundant computations,
    - perform constant computations,
    - reduce the number of store-load sequences, etc.

• Some provided analysis of only straight-line sequences of instructions; others tried to take program branching into account.

• Kildall gave a single unified flow analysis algorithm which extended all the straight-line techniques to include branching.

• He stated the algorithm formally and proved it correct in his POPL paper.

Constant Propagation – Example program

begin
    integer i, a, b, c, d, e;
    a := 1; c := 0; …
    for i := 1 step 1 until 10 do
    begin
        b := 2; …
        d := a + b; …
        e := b + c; …
        c := 4; …
    end
end

Directed Graph Representation

Nodes represent sequences of instructions with no branches. Edges represent control flow between nodes.
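As a minimal sketch, the example program can be written down as such a graph in Python. The node labels A to F, the one-assignment-per-node split, and the edge set (with F looping back to C) are assumptions reconstructed from the example; the original figure is not reproduced in this transcript.

```python
# Assumed node labels: each node holds one branch-free assignment
# taken from the example program.
NODES = {
    "A": "a := 1",
    "B": "c := 0",
    "C": "b := 2",
    "D": "d := a + b",
    "E": "e := b + c",
    "F": "c := 4",
}

# Assumed control-flow edges (source, target); F -> C is the loop back edge.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")]

def well_formed(nodes, edges):
    """Check that every edge joins two nodes of the graph."""
    return all(src in nodes and dst in nodes for src, dst in edges)

print(well_formed(NODES, EDGES))  # True
```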

Constant Propagation

It is convenient to associate a pool of propagated constants with each node in the graph.

A pool is a set of ordered pairs (variable, constant) indicating which variables have known constant values when the node is encountered.

The pool at node B, denoted P_B, consists of the single element (a, 1), since the assignment a := 1 must occur before B.

Constant Propagation (cont.)

The fundamental problem of constant propagation is to determine the pool of constants for each node in an arbitrary program graph.

By inspection of the program graph for the example, the pool of constants at each node is

P_A = ∅
P_B = {(a, 1)}
P_C = {(a, 1)}
P_D = {(a, 1), (b, 2)}
P_E = {(a, 1), (b, 2), (d, 3)}
P_F = {(a, 1), (b, 2), (d, 3)}

Constant Propagation (cont.)

P_N may be determined for each node N in the graph as follows:

Consider each path (A, p_1, p_2, …, p_n, N) from the entry node A to N. Apply constant propagation along the path to obtain the set of constants at node N.

The intersection of these sets over all paths to N is the set of constants which can be assumed for optimization.

(It is unknown which path will be taken at execution time, so intersection is the conservative choice.)

Global Analysis Algorithm--Informal

• Start with an entry node in the program graph, along with a given entry pool corresponding to this entry node.

• Process the entry node and produce optimization information for all immediate successors of the entry node.

• Intersect incoming optimizing pools with already established pools at the successor nodes.

(The first time a node is encountered, take the incoming pool as its first approximation and continue processing.)

• For each successor, if the amount of optimizing information is reduced by this intersection, then process the successor like the initial entry node.

Global Analysis Algorithm (cont.)

It is useful to define an optimizing function f which maps an input pool, together with a particular node, to a new output pool.

Given a set of propagated constants, it is possible to examine the operation of a particular node and determine the set of constants that can be assumed after the node is executed.

In the case of constant propagation, let V be a set of variables, C be a set of constants, and N be the set of nodes in the graph.

The set U = V × C represents the ordered pairs which may appear in any constant pool.

In fact, all constant pools are elements of the power set of U, denoted P(U).

Thus, f: N × P(U) → P(U), where (v, c) ∈ f(N, P) if and only if either of the following holds:

(cont.)

Global Analysis Algorithm (cont.)

1. (v, c) ∈ P and the operation at node N does not assign a new value to the variable v; or

2. The operation at N assigns an expression to the variable v, and the expression evaluates to the constant c (given the constants in P).
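The two conditions can be sketched in Python as follows. This is only an illustration under stated assumptions: the function takes the node's statement directly rather than the node, the `(target, expr)` encoding is hypothetical, and only integer constants and sums of two variables are handled.

```python
def f(stmt, pool):
    """Constant-propagation optimizing function for one assignment node.

    stmt is (target, expr), where expr is an int constant or a tuple
    ("+", x, y) naming two variables; this encoding is an assumption.
    pool is a frozenset of (variable, constant) pairs.
    """
    target, expr = stmt
    known = dict(pool)
    # Condition 1: keep pairs whose variable is not assigned by this node.
    out = {(v, c) for (v, c) in pool if v != target}
    # Condition 2: add (target, c) when the expression evaluates to c.
    if isinstance(expr, int):
        out.add((target, expr))
    else:
        _, x, y = expr
        if x in known and y in known:
            out.add((target, known[x] + known[y]))
    return frozenset(out)

print(sorted(f(("a", 1), frozenset())))  # [('a', 1)]
print(sorted(f(("d", ("+", "a", "b")), frozenset({("a", 1), ("b", 2)}))))
# [('a', 1), ('b', 2), ('d', 3)]
```

If an operand of the sum is not in the pool, condition 2 does not apply and any old pair for the target is dropped by condition 1.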

Constant Propagation (cont.)

Successively longer paths from A to D can be evaluated, resulting in P_{D,3}, P_{D,4}, …, P_{D,n} for arbitrarily large n.

The pool of constants that can be assumed no matter what flow of control occurs is the set of constants common to all P_{D,i}, i.e.

⋂_i P_{D,i}

This procedure is not effective, since the number of such paths may have no finite bound, and the procedure would not halt.

Optimization Function for Example

The optimizing function can be applied to node A with an empty constant pool resulting in

f(A, ∅) = {(a, 1)}.

The function can be applied to B with {(a, 1)} as the constant pool yielding

f(B, {(a, 1)}) = {(a, 1), (c, 0)}.

Extending f to Paths in the Graph

Given a path from entry node A to an arbitrary node N, the optimizing pool for the path is determined by composing the function f.

For example, f(C, f(B, f(A, ∅))) = {(a, 1), (c, 0), (b, 2)} is the constant pool for D for this path.

Constant Propagation (cont.)

The pool of propagated constants at node D can be determined as follows:

A path from entry node A to the node D is (A, B, C, D). For this path the first approximation to the pool for D is

P_{D,1} = {(a, 1), (b, 2), (c, 0)}.

A longer path from A to D is (A, B, C, D, E, F, C, D), which results in the pool

P_{D,2} = {(a, 1), (b, 2), (c, 4), (d, 3), (e, 2)}.
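The two path pools can be reproduced by composing f along each path, as in the sketch below. The per-node statements in `STMTS` and the helper name `pool_at_end` are assumptions made for illustration; the pool at the last node of a path is taken to be the result of applying f to every earlier node in order.

```python
from functools import reduce

# Assumed statement for each node, matching the example program.
STMTS = {
    "A": ("a", 1), "B": ("c", 0), "C": ("b", 2),
    "D": ("d", ("+", "a", "b")), "E": ("e", ("+", "b", "c")), "F": ("c", 4),
}

def f(node, pool):
    """Constant-propagation optimizing function for the node's assignment."""
    target, expr = STMTS[node]
    known = dict(pool)
    out = {(v, c) for (v, c) in pool if v != target}
    if isinstance(expr, int):
        out.add((target, expr))
    elif expr[1] in known and expr[2] in known:
        out.add((target, known[expr[1]] + known[expr[2]]))
    return frozenset(out)

def pool_at_end(path, entry_pool=frozenset()):
    """Compose f over every node of the path except the last one."""
    return reduce(lambda pool, node: f(node, pool), path[:-1], entry_pool)

p_d1 = pool_at_end("ABCD")
p_d2 = pool_at_end("ABCDEFCD")
print(sorted(p_d1))         # [('a', 1), ('b', 2), ('c', 0)]
print(sorted(p_d2))         # [('a', 1), ('b', 2), ('c', 4), ('d', 3), ('e', 2)]
print(sorted(p_d1 & p_d2))  # [('a', 1), ('b', 2)]
```

Intersecting the two pools already leaves only (a, 1) and (b, 2), the pairs on which both paths agree.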

Computing the Pool of Optimizing Information.

The pool of optimizing information which can be assumed at node N in the graph, independent of the path taken at execution time, is

P_N = ⋂ { x | x ∈ F_N }.

Here F_N = { f(p_n, f(p_{n-1}, …, f(p_1, P)) … ) | (p_1, p_2, …, p_n, N) is a path from an entry node p_1 with corresponding entry pool P to node N }.

Directed Graphs and Paths

A finite directed graph G = ⟨N, E⟩ is an arbitrary finite set of nodes N and edges E ⊆ N × N.

A path from node A to node B in G is a sequence (p_1, p_2, …, p_k) such that p_1 = A and p_k = B, where (p_i, p_{i+1}) ∈ E for 1 ≤ i < k.

The length of the path is k – 1.
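A minimal sketch of these definitions in Python, assuming the edge set of the running example (the set `EDGES` and the function names are illustrative, not part of the slides):

```python
# Assumed edge set of the example program graph.
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")}

def is_path(seq, edges):
    """True if seq is non-empty and each consecutive pair is an edge."""
    return len(seq) > 0 and all(pair in edges for pair in zip(seq, seq[1:]))

def path_length(seq):
    """A path with k nodes has length k - 1."""
    return len(seq) - 1

print(is_path("ABCDEFCD", EDGES), path_length("ABCDEFCD"))  # True 7
print(is_path("ABD", EDGES))                                # False
```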

Program Graphs

A program graph is a finite directed graph G with a non-empty set of entry nodes I ⊆ N.

For every node N ∈ N we assume there exists a path (p_1, p_2, …, p_n) such that p_1 ∈ I and p_n = N.

(i.e., there is a path to every node in the graph from an entry node.)

Successors and Predecessors of a Node

The set of immediate successors of a node N is given by

I(N) = { N′ ∈ N | (N, N′) ∈ E }.

The set of immediate predecessors of N is given by

I⁻¹(N) = { N′ ∈ N | (N′, N) ∈ E }.
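Both sets can be read directly off the edge set. The sketch below assumes the edges of the running example; `succ` and `pred` are hypothetical names for I and I⁻¹.

```python
# Assumed edge set of the example program graph.
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")}

def succ(n, edges):
    """I(N): all N' such that (N, N') is an edge."""
    return {dst for src, dst in edges if src == n}

def pred(n, edges):
    """I^-1(N): all N' such that (N', N) is an edge."""
    return {src for src, dst in edges if dst == n}

print(sorted(succ("F", EDGES)))  # ['C']
print(sorted(pred("C", EDGES)))  # ['B', 'F']
```

Node C has two predecessors because it is both the loop entry and the target of the back edge from F.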

Meet-Semilattices

Let the finite set L be the set of all possible optimizing pools for a given application.

Let ∧ be a meet operation with the properties:

∧ : L × L → L

x ∧ y = y ∧ x (commutative)

x ∧ (y ∧ z) = (x ∧ y) ∧ z (associative)

where x, y, z ∈ L. The set L and the ∧ operation define a finite meet-semilattice. (A meet is also idempotent, x ∧ x = x, which makes the ordering it induces reflexive.)

Ordering on Meet-Semilattices

The ∧ operation defines a partial ordering on L by

x ≤ y if and only if x ∧ y = x.

Similarly,

x < y if and only if x ≤ y and x ≠ y.
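For constant pools, a natural reading (assumed here, in line with the use of intersection in the informal algorithm) is that ∧ is set intersection. Under that assumption the ordering coincides with the subset relation, as this small sketch illustrates:

```python
def meet(x, y):
    """Meet for constant pools, assumed to be set intersection."""
    return x & y

def leq(x, y):
    """x <= y if and only if meet(x, y) == x."""
    return meet(x, y) == x

def lt(x, y):
    """x < y if and only if x <= y and x != y."""
    return leq(x, y) and x != y

small = frozenset({("a", 1)})
big = frozenset({("a", 1), ("b", 2)})
other = frozenset({("c", 0)})

print(leq(small, big), lt(small, big))       # True True
print(leq(small, other), leq(other, small))  # False False
```

The second line shows two incomparable pools, which is why the ordering is only partial.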

Generalized Meet Operation

If X ⊆ L, the generalized meet operation ∧X is defined as the pairwise application of ∧ to the elements of X.

L is assumed to have a “zero element” 0 such that 0 ≤ x for all x ∈ L.

An augmented set L′ is constructed from L by adding a “unit element” 1 such that 1 is not in L and 1 ∧ x = x for all x in L.

The set L′ = L ∪ {1}. It follows that x < 1 for all x in L.
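A hedged sketch of the generalized meet with a unit element, again assuming pools are sets and ∧ is intersection. The sentinel `TOP` stands in for the unit element 1, and returning it for an empty collection is an assumption of this sketch rather than something the slides state.

```python
from functools import reduce

TOP = object()  # stands in for the unit element 1, which is not a pool

def meet(x, y):
    """Meet on pools extended with TOP: TOP ∧ x = x, otherwise intersection."""
    if x is TOP:
        return y
    if y is TOP:
        return x
    return x & y

def meet_all(pools):
    """Generalized meet: fold the binary meet over a collection of pools."""
    return reduce(meet, pools, TOP)

X = [
    frozenset({("a", 1), ("b", 2), ("c", 0)}),
    frozenset({("a", 1), ("b", 2), ("c", 4), ("d", 3), ("e", 2)}),
]
print(sorted(meet_all(X)))  # [('a', 1), ('b', 2)]
print(meet_all([]) is TOP)  # True
```

With intersection as the meet, the empty pool plays the role of the zero element, since it is below every other pool.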

Optimizing Function

An “optimizing function” f is defined

f: N × L → L.

It must have the homomorphism property:

f(N, x ∧ y) = f(N, x) ∧ f(N, y) for all N ∈ N and x, y ∈ L.

Note that f(N, x) < 1 for all N ∈ N and x ∈ L.
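The homomorphism property can be checked by brute force for a single node over a small universe. The sketch below does this for a node of the form c := 4, with ∧ taken to be intersection; the universe `U`, the helper names, and the restriction to a constant assignment are all assumptions, and the check says nothing about other kinds of nodes.

```python
from itertools import combinations

# Assumed tiny universe U = V × C with V = {a, c} and C = {0, 4}.
U = [(v, c) for v in ("a", "c") for c in (0, 4)]

def powerset(items):
    """All subsets of items, as frozensets."""
    return [frozenset(s) for r in range(len(items) + 1)
            for s in combinations(items, r)]

def f_assign_const(target, value, pool):
    """Optimizing function for a node 'target := value' (value a constant)."""
    kept = {(v, c) for (v, c) in pool if v != target}
    return frozenset(kept | {(target, value)})

def is_homomorphic(f, pools):
    """Check f(x ∧ y) == f(x) ∧ f(y) for all x, y, with ∧ as intersection."""
    return all(f(x & y) == f(x) & f(y) for x in pools for y in pools)

pools = powerset(U)
print(len(pools))                                                  # 16
print(is_homomorphic(lambda p: f_assign_const("c", 4, p), pools))  # True
```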

Global Analysis Algorithm

Global analysis starts with an entry pool set EP ⊆ I × L, where (e, x) ∈ EP if e ∈ I is an entry node with optimizing pool x ∈ L. In the steps below, L denotes the work list of (node, pool) pairs, not the semilattice of pools.

A1 [initialize] L := EP.

A2 [terminate?] If L = ∅ then halt.

A3 [select node] Let L′ ∈ L, L′ = (N, P_i) for some node N ∈ N and some pool P_i.

Then L := L − {L′}.

A4 [traverse] Let P_N be the current approximate pool for node N

(initially P_N = 1). If P_N ≤ P_i then go to step A2.

A5 [set pool] P_N := P_N ∧ P_i, L := L ∪ {(N′, f(N, P_N)) | N′ ∈ I(N)}.

A6 [loop] Go to step A2.
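Steps A1 to A6 can be sketched in Python for the constant-propagation example. This is an illustration under stated assumptions, not Kildall's own code: the statements in `STMTS`, the successor map `SUCC`, the use of `None` for the unit element 1, intersection as the meet, and a Python list as the work list are all choices made here.

```python
TOP = None  # unit element 1: "no pool established yet" (assumption)

# Assumed statements and successors for the example program graph.
STMTS = {
    "A": ("a", 1), "B": ("c", 0), "C": ("b", 2),
    "D": ("d", ("+", "a", "b")), "E": ("e", ("+", "b", "c")), "F": ("c", 4),
}
SUCC = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["F"], "F": ["C"]}

def f(node, pool):
    """Constant-propagation optimizing function for the node's assignment."""
    target, expr = STMTS[node]
    known = dict(pool)
    out = {(v, c) for (v, c) in pool if v != target}
    if isinstance(expr, int):
        out.add((target, expr))
    elif expr[1] in known and expr[2] in known:
        out.add((target, known[expr[1]] + known[expr[2]]))
    return frozenset(out)

def meet(x, y):
    """TOP ∧ x = x; otherwise set intersection."""
    if x is TOP:
        return y
    if y is TOP:
        return x
    return x & y

def analyze(entry_pools):
    pools = {n: TOP for n in STMTS}       # initially P_N = 1
    work = list(entry_pools)              # A1
    while work:                           # A2
        node, incoming = work.pop()       # A3
        if meet(pools[node], incoming) == pools[node]:  # A4: P_N <= P_i
            continue
        pools[node] = meet(pools[node], incoming)       # A5
        work.extend((s, f(node, pools[node])) for s in SUCC[node])
    return pools

result = analyze([("A", frozenset())])
for n in sorted(result):
    print(n, sorted(result[n]))
```

On this graph the sketch ends with an empty pool at A, {(a, 1)} at B and C, {(a, 1), (b, 2)} at D, and {(a, 1), (b, 2), (d, 3)} at E and F.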