A Framework for Unrestricted Whole-Program Optimization

Spyridon Triantafyllis, Matthew J. Bridges, Easwaran Raman, Guilherme Ottoni, David I. August

Post on 26-Mar-2015


The Liberty Research Group
Department of Computer Science
Princeton University


Velocity Compiler Research

Procedure-Based Compilation

[Figure: control-flow diagram of an example program. Labels: g(), h(), f(a,b,c); z=x*y, f(x,y,5), f(1,2,3), …=z; d=a*b, if (EOB), fill B, ret.]

Pros:
• Well known
Cons:
• Cannot exploit opportunities that cross procedures


Interprocedural Analysis

Interprocedural Analysis [Sharir’78] [Morel’78] [Reps’95]
Pros:
• Increases available information
• Enables some optimization across procedure boundaries
Cons:
• Has to analyze the entire program
• Optimizations need to respect the procedure boundary

[Figure: control-flow diagram of the example program. Labels: g(), h(), f(a,b,c); z=x*y, f(x,y,5), f(1,2,3), …=z; d=a*b, if (EOB), fill B, ret.]


Interprocedural Analysis & Interprocedural Opti

Interprocedural Analysis [Sharir’78] [Morel’78] [Reps’95]
Pros:
• Increases available information
• Enables some optimization across procedure boundaries
Cons:
• Has to analyze the entire program
• Optimizations need to respect the procedure boundary
• Most optimizations will still be intraprocedural

[Figure: the example program after interprocedural optimization. Labels: g(), h(), f(a,b,c,z); z=x*y, f(x,y,5,z), f(1,2,3,2), …=z; d=z, if (EOB), fill B, ret.]


Inlining

Inlining [Scheifler’77] [Hwu’89] [Chang’92]
Pros:
• Increases optimization scope
• Enables specialization
• Does not require optimizations to understand interprocedural concerns
Cons:
• Hard limit on procedure size
• Unnecessary code growth

[Figure: the example program after inlining. Labels for the inlined copy: g’(), z=x*y, if (EOB), d=z, fill B, jump, …=z. Labels for the remaining procedures: h(), f(1,2,3), f(a,b,c), if (EOB), d=a*b, fill B, ret.]


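Partial Inlining

Partial Inlining [Suganuma’03] [Way’00]
Pros:
• Can alleviate some code growth
Cons:
• Gains are limited

[Figure: the example program after partial inlining. Labels for the caller: g(), z=x*y, if (EOB), d=z, f’(), jump, …=z. Labels for the split-off procedure: f’(), fill B, return. Labels for the remaining procedures: h(), f(1,2,3), f(a,b,c), if (EOB), d=a*b, fill B, ret.]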


Why Procedures?

Procedures:
• Calling convention boundary
• Single-entry, single-exit
Pros:
• Implicit correlated edges (context sensitivity)
• Natural unit for divide & conquer compilation
Cons:
• Optimized for software engineering
• Restricts optimization

We don’t have to use procedures!

[Figure: control-flow diagram of the example program. Labels: g(), h(), f(a,b,c); z=x*y (twice), f(x,y,5), f(1,2,3), …=z; d=a*b, d=z, if (EOB), fill B, ret.]


The Whole-Program CFG

Retain useful traits of procedures:
• Correlated edges
• Compilation unit
Goal: Obtain an optimizable whole-program representation
• Increase optimization scope
• Allow all optimizations to operate on the increased scope without change
• Targeted code growth

[Figure: the example program as a single control-flow graph. Labels: z=x*y, if (EOB), d=a*b, fill B, jump, …=z.]


The Whole-Program CFG

[Figure: whole-program CFG with blocks A, B, C, D, E, F, G, H; call arcs labeled (1 and (2; return arcs labeled )1 and )2.]

Represent calls and returns as special control-flow transitions [Sharir’78]
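The representation described above can be sketched as follows. This is a minimal illustration, not the Velocity compiler's data structure: the class, field, and method names are hypothetical, and the block topology in the usage lines is assumed (the slide gives only the block names and the (1, (2, )1, )2 labels). Calls and returns are stored as ordinary CFG edges that carry the id of the call site they belong to, so that "(1" can later be paired with ")1".

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical edge kinds; the slides only say that calls and returns are
# "special control-flow transitions".
FLOW, CALL, RETURN = "flow", "call", "return"


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    kind: str = FLOW
    site: Optional[int] = None  # call-site id, e.g. 1 for "(1" and ")1"


@dataclass
class WholeProgramCFG:
    edges: List[Edge] = field(default_factory=list)

    def add_flow(self, src, dst):
        """Add an ordinary intraprocedural control-flow edge."""
        self.edges.append(Edge(src, dst))

    def add_call(self, site, caller, entry, exit_, resume):
        """Model one call site as a call edge plus its matching return edge."""
        self.edges.append(Edge(caller, entry, CALL, site))
        self.edges.append(Edge(exit_, resume, RETURN, site))

    def successors(self, block):
        """Return every edge leaving `block`, whatever its kind."""
        return [e for e in self.edges if e.src == block]


# Assumed topology: a callee made of blocks B, C, D, E that is called
# from A (site 1, resuming at F) and from G (site 2, resuming at H).
wcfg = WholeProgramCFG()
for src, dst in [("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")]:
    wcfg.add_flow(src, dst)
wcfg.add_call(1, caller="A", entry="B", exit_="E", resume="F")
wcfg.add_call(2, caller="G", entry="B", exit_="E", resume="H")
```

Because call and return arcs are plain edges, an optimization that only walks `successors` sees one large graph, while an analysis that cares about context can still read the `site` tag.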


Whole-Program Optimizations

[Figure: whole-program CFG after optimization. Block labels: A, B (twice), C, C’, D, E, E’, F, F’, G, H; call arcs (1, (2; return arcs )1 (twice), )2.]

Optimization destroys the program’s procedural structure!
• Example: Superblock Formation [Hwu’92]

Unconventional call structures!
• Many-to-many call <-> return relation
• Must rediscover structure for summary edges
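To see why such an optimization produces primed copies of blocks, here is a simplified sketch of tail duplication, the step of superblock formation that removes side entrances from a hot trace. It is an illustration under stated assumptions, not IMPACT's or Velocity's implementation: the function name and the successor-map representation are hypothetical, and the trace is assumed to be acyclic.

```python
def tail_duplicate(succs, trace):
    """Remove side entrances into `trace` by duplicating its tail.

    succs: {block: [successor blocks]}; trace: blocks on the hot path, in order.
    Returns a new successor map in which only trace[0] is entered from
    outside the trace. Cloned blocks get a trailing apostrophe.
    """
    succs = {b: list(s) for b, s in succs.items()}  # do not mutate the input
    # Find the first trace block (after the head) with a side entrance.
    for i in range(1, len(trace)):
        side_preds = [
            p for p, ss in succs.items()
            if trace[i] in ss and p != trace[i - 1]
        ]
        if side_preds:
            break
    else:
        return succs  # no side entrances: the trace is already a superblock
    tail = trace[i:]
    copy = {b: b + "'" for b in tail}
    # Clone the tail; a clone flows into clones wherever the original
    # flowed into another tail block.
    for b in tail:
        succs[copy[b]] = [copy.get(s, s) for s in succs[b]]
    # Redirect every edge into the tail that is not a trace edge.
    for p in list(succs):
        if p in copy.values():
            continue
        for j, s in enumerate(succs[p]):
            on_trace = p in trace and trace.index(p) + 1 == trace.index(s) if s in copy else True
            if s in copy and not on_trace:
                succs[p][j] = copy[s]
    return succs


# Assumed example: a diamond B -> {C, D} -> E -> F with hot trace B, C, E, F.
cfg = {"B": ["C", "D"], "C": ["E"], "D": ["E"], "E": ["F"], "F": []}
superblock_cfg = tail_duplicate(cfg, ["B", "C", "E", "F"])
```

In this example the cold path through D is redirected to the copies E' and F'. When the duplicated tail contains a procedure's exit block, the copy carries its own return arc, which is one way the many-to-many call <-> return relation mentioned above can arise.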


Analyzing the Whole-Program CFG

Context-Sensitive Interprocedural Analysis [Sharir’78]: meet over all realizable paths

Identify Entry-Exit Pairs (EEP):
• Correlated call & return arcs
• Allows use of summary edges
• Blocks may belong to more than one EEP

[Figure: whole-program CFG after optimization. Block labels: A, B, B’, C, C’, D, E, E’, F, F’, G, H; call arcs (1, (2; return arcs )1 (twice), )2.]

Entry-exit pairs and the blocks listed with each in the figure:
(B, E): C’, B’, E’
(B, F’): C, B’, E, D
(B’, F’): C, B, E, D
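The notion of a realizable path can be made concrete with a short sketch. It assumes that each edge on a path is described by a (kind, call-site id) pair and that paths start at the program entry, so a return with no open call is rejected; the function name and the encoding are hypothetical. A path is realizable when every return arc matches the most recently taken call arc that has not yet returned, which is the usual matched-parenthesis condition.

```python
def is_realizable(path_edges):
    """Return True if every return arc matches the most recent open call arc.

    path_edges: sequence of (kind, site) pairs, where kind is "call",
    "return", or "flow", and site is the call-site id (None for flow edges).
    """
    open_calls = []  # stack of call sites entered but not yet returned from
    for kind, site in path_edges:
        if kind == "call":
            open_calls.append(site)
        elif kind == "return":
            if not open_calls or open_calls.pop() != site:
                return False  # returns to a site other than the caller
    return True  # calls still open at the end of the path are allowed


# Enter through call arc (1 and leave through return arc )1: realizable.
good_path = [("call", 1), ("flow", None), ("flow", None), ("return", 1)]
# Enter through (1 but leave through )2: not a path any execution can take.
bad_path = [("call", 1), ("flow", None), ("flow", None), ("return", 2)]
```

A meet over realizable paths ignores paths like `bad_path`, which is what makes the analysis context-sensitive. Pairing entries with exits serves the same purpose at the granularity of summary edges.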


Determining a Compilation Unit: Region Formation

[Figure: whole-program CFG after optimization. Block labels: A, B, B’, C, C’, D, E, E’, F, F’, G, H; call arcs (1, (2; return arcs )1 (twice), )2.]

Region Formation [Hank’95]
• Arbitrarily shaped, compiler-selected compilation unit

Region Selection
• Select seed & add neighbors (profile, structure, dataflow …)

Success Criteria
• Optimizability vs. compile time
• Few too small or too big regions
• Intra-region transitions >> inter-region transitions

Encapsulation
• Make regions independently optimizable

Compiler is free to select its own optimization units!
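The "select seed & add neighbors" step can be sketched as a greedy, profile-driven loop. This is a simplified illustration, not the PBE heuristic or the algorithm of [Hank’95]: the function name, the `max_size` and `min_ratio` parameters, and the profile counts in the usage lines are all hypothetical, and only profile weights are considered (no structure or dataflow criteria).

```python
def form_regions(block_weight, edge_weight, max_size=4, min_ratio=0.5):
    """Greedy seed-and-grow partition of CFG blocks into regions.

    block_weight: {block: profile count}
    edge_weight:  {(src, dst): profile count}
    A neighbor joins a region only if the connecting edge is at least
    `min_ratio` times as hot as the region's seed block.
    """
    unassigned = set(block_weight)
    regions = []
    while unassigned:
        # Seed: hottest block not yet placed in a region (ties broken by name).
        seed = max(sorted(unassigned), key=lambda b: block_weight[b])
        region = [seed]
        unassigned.remove(seed)
        while len(region) < max_size:
            # Candidate edges link the region to one unassigned neighbor.
            candidates = [
                (w, src, dst)
                for (src, dst), w in edge_weight.items()
                if (src in region) != (dst in region)
                and ({src, dst} - set(region)) <= unassigned
            ]
            if not candidates:
                break
            w, src, dst = max(candidates)
            if w < min_ratio * block_weight[seed]:
                break  # the hottest remaining neighbor is too cold
            new_block = dst if src in region else src
            region.append(new_block)
            unassigned.remove(new_block)
        regions.append(region)
    return regions


# Hypothetical profile: a hot path A -> B -> C -> E with a cold detour via D.
weights = {"A": 100, "B": 90, "C": 80, "D": 10, "E": 85}
edges = {("A", "B"): 90, ("B", "C"): 80, ("B", "D"): 10,
         ("C", "E"): 80, ("D", "E"): 5}
regions = form_regions(weights, edges)
```

With these counts the hot path ends up in one region and the cold block D in its own, so most executed transitions stay inside a region, in line with the success criteria listed above.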


Evaluation Framework: The Velocity Compiler

[Figure: three compilation pipelines, labeled Baseline, Inlining, and PBE, each running from Frontend to Executable through two phases, "Determine Compilation Unit" and "Optimize Compilation Unit". Stage labels: Procedures, Inlining, WCFG, Region Form., Regions, Superblock, Classical & ILP Optimizer, Scheduling.]

Evaluation:
• Inliner & Opti. ported from IMPACT
• Targeting Itanium 2


Code Growth

[Figure: bar chart of code size (axis from 0.90 to 1.90) for Inlining and PBE on 124.m88ksim, 129.compress, 164.gzip, 179.art, 181.mcf, 183.equake, 188.ammp, 256.bzip2, and the geometric mean. Labeled values: 1.45 and 1.23.]


Speedup - Train Input

[Figure: bar chart of speedup (axis from 0.95 to 1.30) for Inlining and PBE on 124.m88ksim, 129.compress, 164.gzip, 179.art, 181.mcf, 183.equake, 188.ammp, 256.bzip2, and the geometric mean. Labeled value: 1.07.]


Conclusion

Procedure boundaries restrict optimization!

Ways to deal with procedures exist, but they are limited:
• Interprocedural analysis & optimization: scales badly, not always possible
• Inlining: unnecessary code growth
• Procedures are not the right compilation unit

PBE offers unrestricted and practical whole-program optimization:
• An expanded form of interprocedural analysis
• New region formation framework and heuristics
• An interprocedural region encapsulation method
