A Framework for Unrestricted Whole-Program Optimization
Spyridon Triantafyllis, Matthew J. Bridges, Easwaran Raman, Guilherme Ottoni, David I. August
The Liberty Research Group, Department of Computer Science, Princeton University

Mar 26, 2015


Nicholas Rowe

2 Princeton University

Velocity Compiler Research

Procedure-Based Compilation

[Figure: control-flow graphs of the example procedures g(), h() and f(a,b,c). Node labels: z=x*y; f(x,y,5); f(1,2,3); if (EOB); d=a*b; fill B; ret; …=z]

Pros:
• Well known
Cons:
• Cannot exploit opportunities that cross procedures

Interprocedural Analysis

[Sharir’78] [Morel’78] [Reps’95]

Pros:
• Increases available information
• Enables some optimization across procedure boundaries
Cons:
• Has to analyze the entire program
• Optimizations need to respect the procedure boundary

[Figure: the same example CFGs for g(), h() and f(a,b,c). Node labels: z=x*y; f(x,y,5); f(1,2,3); if (EOB); d=a*b; fill B; ret; …=z]

Interprocedural Analysis & Interprocedural Optimization

[Sharir’78] [Morel’78] [Reps’95]

Pros:
• Increases available information
• Enables some optimization across procedure boundaries
Cons:
• Has to analyze the entire program
• Optimizations need to respect the procedure boundary
• Most optimizations will still be intraprocedural

[Figure: the example CFGs with z passed to f as an extra argument. Node labels: z=x*y; f(x,y,5,z); f(1,2,3,2); f(a,b,c,z); if (EOB); d=z; fill B; ret; …=z]

Inlining

[Scheifler’77] [Hwu’89] [Chang’92]

Pros:
• Increases optimization scope
• Enables specialization
• Doesn’t require opti to understand interprocedural concerns
Cons:
• Hard limit on procedure size
• Unnecessary code growth

[Figure: g’() with the body of f inlined (z=x*y; if (EOB); d=z; fill B; jump; …=z), next to the original f(a,b,c) (if (EOB); d=a*b; fill B; ret) and h() calling f(1,2,3)]
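The effect shown in the inlining figure can be illustrated with a small sketch. The function bodies below are hypothetical (the slides only show z=x*y in g and d=a*b in f); the point is that once the callee body is visible in the caller, d=a*b with a=x, b=y is recognizably the same value as z and can be replaced by d=z.

```python
def f(a, b, c):
    # Callee: recomputes a*b on every call (hypothetical body).
    d = a * b
    return d + c


def g(x, y):
    # Caller before inlining: z and the callee's d are computed separately.
    z = x * y
    return z + f(x, y, 5)


def g_inlined(x, y):
    # Caller after inlining f(x, y, 5): the optimizer can see that
    # a*b == x*y == z, so the second multiplication becomes a copy.
    z = x * y
    d = z
    return z + (d + 5)


result_before = g(3, 4)
result_after = g_inlined(3, 4)
```

The cost is that every call site that is inlined receives its own copy of the callee body, which is the code growth listed under Cons.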

Partial Inlining

[Suganuma’03] [Way’00]

Pros:
• Can alleviate some code growth
Cons:
• Gains are limited

[Figure: g() with part of f inlined (z=x*y; if (EOB); d=z; jump; fill B; return; …=z) and a call to the residual procedure f’(), next to the original f(a,b,c) (if (EOB); d=a*b; fill B; ret) and h() calling f(1,2,3)]

Why Procedures?

Procedures
• Calling convention boundary
• Single-Entry, Single-Exit
Pros:
• Implicit correlated edges - context sensitivity
• Natural unit for divide & conquer compilation
Cons:
• Optimized for software engineering
• Restricts optimization

We don’t have to use procedures!

[Figure: the example CFGs for g(), h() and f(a,b,c). Node labels: z=x*y; f(x,y,5); f(1,2,3); if (EOB); d=a*b; d=z; fill B; ret; …=z]

The Whole-Program CFG

Retain useful traits of procedures
• Correlated edges
• Compilation unit
Goal: Obtain an optimizable whole-program representation
• Increase optimization scope
• Allow all opti to operate on increased scope without change
• Targeted code growth

Represent calls and returns as special control-flow transitions [Sharir’78]

[Figure: the example merged into a single CFG (z=x*y; if (EOB); d=a*b; fill B; jump; …=z), followed by an abstract whole-program CFG with blocks A to H whose call and return arcs are labeled (1, (2 and )1, )2]
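A minimal sketch of how calls and returns might be stored as labeled control-flow arcs in one graph, using the parenthesis labels from the figure. The `Arc` and `WCFG` classes and the block names in the example are hypothetical and are not taken from the Velocity compiler.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Arc:
    src: str
    dst: str
    kind: str = "flow"          # "flow", "call", or "return"
    site: Optional[int] = None  # call-site id that pairs a call arc with its return arc

    def label(self) -> str:
        if self.kind == "call":
            return f"({self.site}"
        if self.kind == "return":
            return f"){self.site}"
        return ""


@dataclass
class WCFG:
    arcs: List[Arc] = field(default_factory=list)

    def add_flow(self, src: str, dst: str) -> None:
        self.arcs.append(Arc(src, dst))

    def add_call(self, caller: str, entry: str, exit_block: str,
                 return_to: str, site: int) -> None:
        # A call becomes two correlated arcs instead of an opaque instruction.
        self.arcs.append(Arc(caller, entry, "call", site))
        self.arcs.append(Arc(exit_block, return_to, "return", site))

    def successors(self, block: str) -> List[Arc]:
        return [a for a in self.arcs if a.src == block]


g = WCFG()
g.add_flow("callee_entry", "callee_exit")
g.add_call("caller1", "callee_entry", "callee_exit", "after1", site=1)
g.add_call("caller2", "callee_entry", "callee_exit", "after2", site=2)
labels = sorted(a.label() for a in g.arcs if a.kind != "flow")
```

Keeping the call-site id on both arcs is what preserves the "correlated edges" trait: the graph is flat, but an analysis can still tell which return arc belongs to which call arc.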

Whole-Program Optimizations

Optimization destroys the program’s procedural structure!
• Example: Superblock Formation [Hwu’92]

Unconventional call structures!
• Many-to-many call <-> return relation
• Must rediscover structure for summary edges

[Figure: the whole-program CFG after optimization, with blocks A to H plus copies C’, E’ and F’; arcs labeled (1, (2, )1, )1, )2]

Analyzing the Whole-Program CFG

Context-Sensitive Interprocedural Analysis [Sharir’78]
• Meet over all realizable paths

Identify Entry-Exit Pairs (EEP):
• Correlated call & return arcs
• Allows use of summary edges
• Blocks may belong to more than one EEP

[Figure: the optimized whole-program CFG with blocks A to H plus copies B’, C’, E’ and F’; arcs labeled (1, (2, )1, )1, )2. Entry-exit pair annotations in the figure:
(BE) C’B’ E’
(BF’) CB’ E D
(B’F’) CB E D]
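The "realizable paths" condition can be sketched as a parenthesis-matching check over the arc labels of a path: a return arc )i is only allowed if the most recent unmatched call arc is (i. This is a simplified illustration, not the analysis used in the talk; the function name is hypothetical, and it assumes the path starts at program entry, so a return with no pending call is rejected.

```python
def is_realizable(labels):
    """Check that return arcs match call arcs along one path.

    labels: arc labels in path order, "(i" for the call arc of site i,
    ")i" for its return arc, and "" for ordinary control-flow arcs.
    Calls left open at the end are allowed (the path may stop inside a callee).
    """
    pending = []  # stack of call-site ids with no matching return yet
    for lab in labels:
        if lab.startswith("("):
            pending.append(lab[1:])
        elif lab.startswith(")"):
            if not pending or pending.pop() != lab[1:]:
                return False
    return True


matched = is_realizable(["(1", "", ")1"])      # call site 1, return to site 1
mismatched = is_realizable(["(1", "", ")2"])   # call site 1, return to site 2
```

A context-insensitive analysis would propagate facts along the mismatched path as well; restricting the meet to realizable paths is what makes the result context-sensitive.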

Determining a Compilation Unit: Region Formation

Region Formation [Hank’95]
• Arbitrarily shaped, compiler-selected compilation unit

Region Selection
• Select seed & add neighbors (profile, structure, dataflow …)

Success Criteria
• Optimizability vs. compile time
• Few regions that are too small or too big
• Intra-region transitions » inter-region transitions

Encapsulation
• Make regions independently optimizable

Compiler is free to select its own optimization units!

[Figure: the optimized whole-program CFG with blocks A to H plus copies B’, C’, E’ and F’; arcs labeled (1, (2, )1, )1, )2]
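The "select seed & add neighbors" step can be sketched as a greedy loop. This sketch uses profile weights only and is an assumption for illustration, not the heuristic from the talk; `form_regions`, `min_ratio` and `max_blocks` are hypothetical names, and the weights in the example are made up.

```python
def form_regions(block_weight, arc_weight, min_ratio=0.5, max_blocks=4):
    """Greedy seed-and-grow region selection.

    block_weight: {block: profile count}
    arc_weight:   {(src, dst): profile count}
    An arc may pull a neighbor into the region only if its weight is at
    least min_ratio times the seed's weight; regions hold at most max_blocks.
    """
    unassigned = set(block_weight)
    regions = []
    while unassigned:
        # Seed: the hottest block not yet in any region (ties broken by name).
        seed = max(sorted(unassigned), key=lambda b: block_weight[b])
        region = [seed]
        unassigned.remove(seed)
        grew = True
        while grew and len(region) < max_blocks:
            grew = False
            # Arcs with exactly one endpoint in the region and the other unassigned.
            candidates = [
                (w, s, d) for (s, d), w in arc_weight.items()
                if (s in region) != (d in region)
                and ({s, d} - set(region)) <= unassigned
                and w >= min_ratio * block_weight[seed]
            ]
            if candidates:
                _, s, d = max(candidates)
                new_block = d if s in region else s
                region.append(new_block)
                unassigned.remove(new_block)
                grew = True
        regions.append(region)
    return regions


blocks = {"A": 100, "B": 90, "C": 10, "D": 80}
arcs = {("A", "B"): 90, ("A", "C"): 10, ("B", "D"): 80, ("C", "D"): 10}
regions = form_regions(blocks, arcs)
```

In this example the hot path A, B, D ends up in one region and the cold block C in its own, so most profiled transitions stay inside a region, in line with the intra-region » inter-region criterion.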

Evaluation Framework: The Velocity Compiler

[Figure: compilation flows for three configurations (Baseline, Inlining, PBE), with phase labels "Determine Compilation Unit" and "Optimize Compilation Unit". Stage labels: Frontend, Procedures, Inlining, WCFG, Region Form., Regions, Superblock, Classical & ILP Optimizer, Scheduling, Executable]

Evaluation:
• Inliner & Opti. ported from IMPACT
• Targeting Itanium 2

Code Growth

[Figure: bar chart of code size (axis from 0.90 to 1.90) for Inlining and PBE on 124.m88ksim, 129.compress, 164.gzip, 179.art, 181.mcf, 183.equake, 188.ammp, 256.bzip2 and their geometric mean. Data labels: 1.45, 1.23]

Speedup - Train Input

[Figure: bar chart of speedup (axis from 0.95 to 1.30) for Inlining and PBE on 124.m88ksim, 129.compress, 164.gzip, 179.art, 181.mcf, 183.equake, 188.ammp, 256.bzip2 and their geometric mean. Data label: 1.07]

Conclusion

Procedure boundaries restrict optimization!

Ways to deal with procedures exist, but limited
• Interprocedural analysis & opti: Scales badly, not always possible
• Inlining: Unnecessary code growth
• Procedures are not the right compilation unit

PBE offers unrestricted and practical whole-program optimization
• An expanded form of interprocedural analysis
• New region formation framework and heuristics
• An interprocedural region encapsulation method