A Global Progressive Register Allocator
David Ryan Koes and Seth Copen Goldstein
School of Computer Science, Carnegie Mellon University
{dkoes,seth}@cs.cmu.edu

Dec 19, 2015

Page 1: A Global Progressive Register Allocator

Page 2: Register Allocation Problem

v = 1

w = v + 3

x = w + v

u = v

t = u + x

print(x);

print(w);

print(t);

print(u);

register allocator

unbounded number of program variables -> limited number of processor registers + slow memory

Registers: eax, ebx, ecx, edx, esi, edi, ebp, esp

Related concerns:
– spill code optimization
– memory operands
– register preferences
– rematerialization
– live range splitting
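To make the "unbounded variables, limited registers" tension concrete, here is a minimal sketch that computes how many variables are live at once in the example program above. It assumes the statements run in the order they appear in the transcript; the names PROGRAM, live_before_each, and max_pressure are illustrative and not from the slides.

```python
# Straight-line program from the slide, as (defined variable, used variables).
# The statement order is an assumption taken from the transcript order.
PROGRAM = [
    ("v", []),          # v = 1
    ("w", ["v"]),       # w = v + 3
    ("x", ["w", "v"]),  # x = w + v
    ("u", ["v"]),       # u = v
    ("t", ["u", "x"]),  # t = u + x
    (None, ["x"]),      # print(x)
    (None, ["w"]),      # print(w)
    (None, ["t"]),      # print(t)
    (None, ["u"]),      # print(u)
]


def live_before_each(program):
    """Backward liveness scan over straight-line code.

    Returns, for each statement, the set of variables live just before it.
    """
    live = set()
    result = []
    for defined, used in reversed(program):
        # A definition ends the live range (going backward); uses start it.
        live = (live - {defined}) | set(used)
        result.append(set(live))
    result.reverse()
    return result


def max_pressure(program):
    """Largest number of simultaneously live variables."""
    return max(len(live) for live in live_before_each(program))


if __name__ == "__main__":
    for (defined, used), live in zip(PROGRAM, live_before_each(PROGRAM)):
        print(defined, used, sorted(live))
    print("max pressure:", max_pressure(PROGRAM))
```

Under this ordering the peak is four simultaneously live variables, so a machine with fewer free registers than that would need spill code or one of the other techniques listed above.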

Page 3: A More Principled Register Allocator

– fully utilize machine description
• explicit and expressive model of costs of allocation for given architecture
– optimal solutions

[Figure: diagram labeled "reg alloc" and "machine description"]

Page 4: Multi-commodity Network Flow: An Expressive Model

Given network (directed graph) with
– cost and capacity on each edge
– sources & sinks for multiple commodities

Find lowest cost flow of commodities

NP-complete for integer flows

Example: edges have unit capacity

[Figure: example network carrying commodities a and b, with labels 0 and 1]
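For reference, the problem described on this slide is usually written as the following program. The slides do not give the notation, so the symbols are assumptions: N is the node set, E the edge set, K the set of commodities, x^k_ij the flow of commodity k on edge (i,j), c^k_ij its cost, u_ij the edge capacity, and b^k_i the supply of commodity k at node i (positive at its source, negative at its sink, zero elsewhere).

```latex
\begin{align*}
\min_{x}\quad & \sum_{k \in K} \sum_{(i,j) \in E} c^{k}_{ij}\, x^{k}_{ij} \\
\text{s.t.}\quad
& \sum_{j:(i,j) \in E} x^{k}_{ij} - \sum_{j:(j,i) \in E} x^{k}_{ji} = b^{k}_{i}
  && \forall i \in N,\ k \in K \\
& \sum_{k \in K} x^{k}_{ij} \le u_{ij} && \forall (i,j) \in E \\
& x^{k}_{ij} \ge 0 \ \text{(integer for integer flows)} && \forall (i,j) \in E,\ k \in K
\end{align*}
```

The second constraint is the one that couples the commodities: without it, each commodity could be routed independently along its own cheapest path.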

Page 5: Register Allocation as a MCNF

Variables ↔ Commodities
Variable Definition ↔ Source
Variable Last Use ↔ Sink
Nodes ↔ Allocation Classes (Reg/Mem/Const)
Register Limits ↔ Node Capacities
Spill Costs ↔ Edge Costs
Allocation ↔ Flow

[Figure: network for variable a over allocation classes r0, r1, and mem, with edge costs 1 and 3]

Also need anti-variables to model persistent memory

Page 6: Example

Source Code
int example(int a, int b)
{
  int d = 1;
  int c = a - b;
  return c+d;
}

Pre-alloc Assembly
MOVE 1 -> d
SUB a,b -> c
ADD c,d -> c
MOVE c -> r0

[Figure: network for the example, annotated with load cost, insn pref cost, and mem access cost]

Page 7: Control Flow

MCNF can only represent straight-line code
– need to link together networks from basic blocks

[Figure: linked basic blocks with variable a in %eax or in mem at the block boundaries]

New nodes to handle block entry/exit constraints

[Figure: Normal, Merge, and Split node types, with in/out labels indexed by i]

Page 8: A More Principled Register Allocator

– fully utilize machine description
• explicit and expressive model of costs of allocation for given architecture: Global MCNF
– optimal solutions
• NP-hard, so use progressive solution technique

[Figure: plot with axes Compile Time and Allocation Quality]

Technique: Lagrangian relaxation directed allocators

[Figure: diagram labeled "reg alloc" and "machine description"]

Page 9: Solution Procedure

Compute Lagrangian prices using iterative subgradient optimization
– guaranteed to converge to “optimal” prices
• for linear relaxation of the problem

Prices used by allocator to find solution
– solution improves as prices converge
– two allocators
• iterative heuristic allocator
• simultaneous heuristic allocator
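The price computation can be illustrated with one projected subgradient step. This is a sketch under assumptions, not the authors' code: it supposes that the capacity constraints are the ones being relaxed, that each constrained node carries one non-negative price, and that the step size is supplied by the caller. The function name update_prices and the node keys are hypothetical.

```python
def update_prices(prices, usage, capacity, step):
    """One projected subgradient step on relaxed capacity constraints.

    prices:   dict node -> current Lagrangian price (missing means 0)
    usage:    dict node -> flow through the node when every variable takes
              its cheapest path under the current prices (missing means 0)
    capacity: dict node -> capacity of the node
    step:     positive step size (the schedule is left to the caller)
    """
    new_prices = {}
    for node, cap in capacity.items():
        # Positive when the node is over-subscribed, negative when under-used.
        violation = usage.get(node, 0) - cap
        # Raise the price of over-used nodes, lower it otherwise,
        # and project back onto price >= 0.
        new_prices[node] = max(0.0, prices.get(node, 0.0) + step * violation)
    return new_prices


if __name__ == "__main__":
    capacity = {"r0@1": 1, "r0@2": 1}
    usage = {"r0@1": 3}  # three variables all want r0 at point 1
    print(update_prices({}, usage, capacity, step=0.5))
```

A diminishing step size, for example one proportional to 1/k at iteration k, is a common choice for subgradient methods; the slides do not say which schedule was used.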

Page 10: Solution Procedure

Advantages
+ iterative nature makes it progressive
+ Lagrangian relaxation theory provides means for computing a good lower bound
+ Can compute optimality bound

Disadvantages
– No guarantee of finding optimal solution
– Optimality bound poor if integrality gap large

99% of the time integrality gap = 0
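The optimality bound mentioned above amounts to comparing the cost of a feasible allocation with a valid lower bound. The sketch below assumes all costs are integers (as they would be for code size in bytes); optimality_bound and its parameters are illustrative names, not from the slides.

```python
import math


def optimality_bound(solution_cost, lower_bound, integral_costs=True):
    """Return (gap, proven_optimal) for a feasible allocation.

    lower_bound is any valid lower bound on the optimal cost, for example
    the value of the Lagrangian relaxation at the current prices.
    """
    if integral_costs:
        # With integer costs the optimum is an integer, so the bound can be
        # rounded up (the small tolerance guards against float noise).
        lower_bound = math.ceil(lower_bound - 1e-9)
    gap = solution_cost - lower_bound
    return gap, gap <= 0


if __name__ == "__main__":
    print(optimality_bound(2, 1.4))  # bound rounds up to 2, so gap is 0
    print(optimality_bound(5, 2.0))  # at most 3 away from optimal
```

When the gap is zero the allocation is provably optimal; otherwise the gap is the proven maximum distance from optimal, which is only as tight as the lower bound itself.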

Page 11: Iterative Heuristic Allocator

Allocation order: a, b, c, d

Cost:
a: 0
b: 4
c: 0
d: -2
Total: 2

Edges to/from memory cost 3
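The one-variable-at-a-time idea can be sketched as a cheapest-path search per variable through a layered network, with earlier variables blocking the registers they took. This is a simplified illustration, not the authors' allocator. Assumptions: every variable is live at every program point, register-to-register edges are free, memory is unbounded, and a variable is defined into and consumed from a register-like source and sink, so starting or ending in memory pays the slide's cost of 3. All names are hypothetical.

```python
MEM_EDGE_COST = 3  # from the slide: edges to/from memory cost 3


def move_cost(src, dst):
    """Edge cost between allocation classes (register moves assumed free)."""
    return MEM_EDGE_COST if (src == "mem") != (dst == "mem") else 0


def allocate_one(num_points, registers, occupied, prices):
    """Cheapest path for one variable; returns (cost, [class per point])."""
    classes = list(registers) + ["mem"]

    def usable(point, cls):
        return cls == "mem" or cls not in occupied[point]

    # best[cls] = (cost, path) of the cheapest path ending in cls so far.
    best = {}
    for cls in classes:
        if usable(0, cls):
            best[cls] = (move_cost("src", cls) + prices.get((0, cls), 0), [cls])
    for point in range(1, num_points):
        nxt = {}
        for cls in classes:
            if not usable(point, cls):
                continue
            cost, path = min(
                ((c + move_cost(prev, cls), p) for prev, (c, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[cls] = (cost + prices.get((point, cls), 0), path + [cls])
        best = nxt
    # Close the path at the sink; memory is always usable, so best is non-empty.
    return min(
        ((c + move_cost(cls, "sink"), p) for cls, (c, p) in best.items()),
        key=lambda t: t[0],
    )


def allocate_all(order, num_points, registers, prices=None):
    """Allocate variables one at a time in the given order."""
    prices = prices or {}
    occupied = [set() for _ in range(num_points)]
    result = {}
    for var in order:
        cost, path = allocate_one(num_points, registers, occupied, prices)
        for point, cls in enumerate(path):
            if cls != "mem":
                occupied[point].add(cls)  # block this register for later variables
        result[var] = (cost, path)
    return result


if __name__ == "__main__":
    print(allocate_all(["a", "b"], num_points=2, registers=["r0"]))
```

The prices argument is where Lagrangian prices would enter: a high price on a contended register at some point steers an early variable away from it, leaving it for a later variable that benefits more.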

Page 12: Simultaneous Heuristic Allocator

[Figure: allocation network with entries marked X]

Current cost: -1, -3, -2

Edges to/from memory cost 3

Page 13: Evaluation

Implemented in gcc 3.4.3 targeting x86

Optimize for code size
– perfect static evaluation
– important metric in its own right

MediaBench, MiBench, Spec95, Spec2000
– over 10,000 functions

Page 14: Progressiveness

[Figure: progressiveness plot with a CPLEX reference]

default allocator: 1121
graph allocator: 1422

Page 15: Progressiveness

[Figure: plot comparing graph allocator, default allocator, and CPLEX]

Page 16: Code Size

Progressive!

Page 17: Optimality

Proven optimality

Proven maximum distance from optimal

Page 18: Compile Time Slowdown :-(

10x slower

Page 19: A More Principled Register Allocator

– fully utilize machine description
• explicit and expressive model of costs of allocation for given architecture: Global MCNF
– optimal solutions
• approach optimality using progressive solution technique: Lagrangian directed allocators

[Figure: diagram labeled "reg alloc" and "machine description"]

Page 20: Questions?


Page 22: Accuracy of the Model

Global MCNF model correctly predicts costs of register allocation within 2% for 71% of functions compiled

[Figure: histogram. x-axis: Percent predicted size larger than actual size; y-axis: Percent of functions (0% to 60%)]

Page 23: Code Size

Page 24: Compile Time Asymptotic Complexity

one iteration: O(nv)

Page 25: Code Performance

Page 26: Compile Time Slowdown :-(

10x slower