A Dynamic Programming Approach to Optimal Integrated Code Generation Christoph Keßler Andrzej Bednarski Linköping University (Sweden)
Dec 22, 2015
Outline
- Code generation
- Our integrated approach
- Implementation and results
- Current and future work
- Conclusion
Code Generation
[Figure: phases of code generation between IR and target code. Labels: instruction selection; IR-level instruction scheduling; target-level instruction scheduling; IR-level register allocation; target-level register allocation.]
Related Work
- Heuristics
- Optimal approaches
  - ILP
  - Dynamic programming
  - Branch-and-bound
  - Enumeration
  - Constraint logic programming
Integrated Code Generation
[Figure: the same diagram of phases between IR and target code (instruction selection; IR-level and target-level instruction scheduling; IR-level and target-level register allocation), with integrated code generation marked as the direct step from IR to target code.]
Integrated Approach
- Christoph Keßler’s previous work
  - Scheduling by topological sorting
  - Dynamic programming
  - Selection DAG
- Time profile
- Extended selection DAG
- Basic block scope of code generation
Topological Sorting
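[Figure: a DAG with nodes u and v, shown twice: once with zero-indegree set z and the already scheduled part scheduled(z), and once after a selection step with zero-indegree set z’ and scheduled(z’).]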
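Scheduling by topological sorting can be sketched as follows: a schedule is built by repeatedly selecting one node from the current zero-indegree set z, and the nodes selected so far form scheduled(z). The DAG below is a hypothetical example, and the names `PREDS`, `zero_indegree`, and `schedules` are assumptions made for this sketch, not taken from the authors' implementation.

```python
from typing import Dict, FrozenSet, Iterator, Set, Tuple

# Hypothetical DAG (an assumption for illustration): node -> its predecessors.
PREDS: Dict[str, Set[str]] = {
    "a": set(), "b": set(), "c": set(),
    "d": {"a", "b"}, "e": {"b", "c"},
    "f": {"d"}, "g": {"e"}, "h": {"f", "g"},
}


def zero_indegree(scheduled: FrozenSet[str]) -> FrozenSet[str]:
    """Return z: the unscheduled nodes whose predecessors are all scheduled."""
    return frozenset(v for v, preds in PREDS.items()
                     if v not in scheduled and preds <= scheduled)


def schedules(scheduled: FrozenSet[str] = frozenset(),
              prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every topological order by selecting each node of z in turn."""
    z = zero_indegree(scheduled)
    if not z:  # nothing left to select: the schedule is complete
        yield prefix
        return
    for v in sorted(z):
        yield from schedules(scheduled | {v}, prefix + (v,))
```

Calling `next(schedules())` returns one complete schedule; iterating over `schedules()` enumerates all of them, which is exactly the exhaustive search that the later slides try to compress.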
Selection Tree
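[Figure: example DAG with nodes drawn in rows h / f g / d e / a b c, and its selection tree.
Root: {a,b,c}.
Level 1: selecting a, b, or c leads to {b,c}, {a,c}, {a,b}.
Level 2: from {b,c}, selecting b leads to {c,d} and selecting c leads to {b}; from {a,c}, selecting a leads to {c,d} and selecting c leads to {a,e}; from {a,b}, selecting a leads to {b} and selecting b leads to {a,e}.
Deeper levels are elided (…).]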
Selection DAG
- Merge multiple instances of the same zero-indegree set z into one selection node
Selection DAG
- Selection DAG is leveled in n+1 levels
- Each schedule S corresponds to one path in the selection DAG
Selection DAG
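[Figure: the same example DAG (rows h / f g / d e / a b c) and its selection DAG.
Level 0: {a,b,c}.
Level 1: {b,c}, {a,c}, {a,b}, reached by selecting a, b, c.
Level 2: {c,d} (reached by b or a), {b} (reached by c or a), {a,e} (reached by c or b).
Deeper levels are elided (…).]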
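The effect of merging equal zero-indegree sets can be sketched by building the levels once without merging (selection tree) and once with merging (selection DAG). The DAG and all names below are assumptions made for this sketch; each selection node is represented by its scheduled set, from which z follows.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical DAG (an assumption for illustration): node -> its predecessors.
PREDS: Dict[str, Set[str]] = {
    "a": set(), "b": set(), "c": set(),
    "d": {"a", "b"}, "e": {"b", "c"},
    "f": {"d"}, "g": {"e"}, "h": {"f", "g"},
}


def zero_indegree(scheduled: FrozenSet[str]) -> FrozenSet[str]:
    """Return z: the unscheduled nodes whose predecessors are all scheduled."""
    return frozenset(v for v, preds in PREDS.items()
                     if v not in scheduled and preds <= scheduled)


def build_levels(merge: bool) -> List[List[FrozenSet[str]]]:
    """Return levels 0..n; each entry is the scheduled set of one selection node.

    With merge=False every selection step creates a new node (selection tree).
    With merge=True nodes with the same scheduled set, and hence the same z,
    are kept only once per level (selection DAG).
    """
    levels: List[List[FrozenSet[str]]] = [[frozenset()]]
    for _ in range(len(PREDS)):  # one level per selected node: n+1 levels in total
        nxt = [scheduled | {v}
               for scheduled in levels[-1]
               for v in sorted(zero_indegree(scheduled))]
        if merge:
            nxt = list(dict.fromkeys(nxt))  # drop duplicates, keep first occurrence
        levels.append(nxt)
    return levels
```

For this example DAG, `build_levels(False)` has 1, 3, and 6 nodes on levels 0 to 2, while `build_levels(True)` has 1, 3, and 3, matching the merged nodes {c,d}, {b}, and {a,e}.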
Towards Time Optimization
- Machine model
  - Generic superscalar/VLIW architecture
  - Single/multiple issue
- From IR level to target level
  - Instruction selection
  - Register allocation (homogeneous)
  - Imitate instruction dispatcher behaviour
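Time Profile
- Window of the instructions scheduled last for each unit that may still influence future scheduling decisions
[Figure: example time profile over units u1, u2, u3 along a time axis ending at reference time t, holding instructions a to f with empty slots marked "-".]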
Extended Selection Node
- An extended selection node (z, t, P) summarizes all schedules of scheduled(z) that end with the time profile (t, P).
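Pruning (formal proof in the paper)
[Figure: three example time profiles over units u1, u2, u3 along a time axis, one ending at reference time t and two ending at reference time t’, each holding instructions a to f with empty slots marked "-".]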
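A minimal sketch of how schedules collapse into extended selection nodes: partial schedules that share the same triple (z, t, P) are represented by a single node, so only one of them needs to be kept. The encoding of P as rows of per-unit slots, the two-unit machine, and the names `ExtendedNode` and `merge_candidates` are assumptions for illustration; the slides do not specify a data layout.

```python
from typing import Dict, FrozenSet, Iterable, NamedTuple, Optional, Tuple

Row = Tuple[Optional[str], ...]  # one slot per unit; None marks an empty slot


class ExtendedNode(NamedTuple):
    z: FrozenSet[str]        # zero-indegree set
    t: int                   # reference time of the time profile
    P: Tuple[Row, ...]       # window of the instructions scheduled last per unit


def merge_candidates(
    candidates: Iterable[Tuple[ExtendedNode, Tuple[str, ...]]]
) -> Dict[ExtendedNode, Tuple[str, ...]]:
    """Keep one representative schedule per extended selection node (z, t, P)."""
    kept: Dict[ExtendedNode, Tuple[str, ...]] = {}
    for node, schedule in candidates:
        kept.setdefault(node, schedule)  # later schedules with the same key are dropped
    return kept


# Hypothetical candidates on an assumed two-unit machine: the orders (a, b) and
# (b, a) issue both instructions in the same cycle and end with the same profile;
# the third candidate issues them in consecutive cycles on one unit.
z = frozenset({"c", "d"})
same_profile = ExtendedNode(z, 1, (("a", "b"),))
other_profile = ExtendedNode(z, 2, (("a", None), ("b", None)))
kept = merge_candidates([
    (same_profile, ("a", "b")),
    (same_profile, ("b", "a")),
    (other_profile, ("a", "b")),
])
```

Here `kept` holds two entries instead of three. Using the whole triple as the dictionary key is what makes the merge safe in this sketch: two schedules are only identified when z, t, and P all agree.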
Extended Selection DAG
[Figure: extended selection DAG drawn by levels: Level 0, Level 1, Level 2, ...]
Solution Space
- Group the extended selection nodes in each level according to execution time
- Construct solution space in order of increasing time
- Postpones the combinatorial explosion
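The idea of constructing the solution space in order of increasing time can be illustrated with a best-first (uniform-cost) search over a priority queue. This is a stand-in technique, not the authors' level-by-time construction, and it assumes a hypothetical single-issue, in-order machine with made-up latencies. The state keeps only the finish times that an unscheduled successor may still wait for, a simplified analogue of a time profile.

```python
import heapq
from itertools import count
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical DAG and latencies (assumptions for illustration only).
PREDS: Dict[str, Set[str]] = {
    "a": set(), "b": set(), "c": set(),
    "d": {"a", "b"}, "e": {"b", "c"},
    "f": {"d"}, "g": {"e"}, "h": {"f", "g"},
}
LATENCY: Dict[str, int] = {"a": 2, "b": 1, "c": 3, "d": 1,
                           "e": 2, "f": 1, "g": 1, "h": 1}
SUCCS: Dict[str, Set[str]] = {
    v: {w for w, ps in PREDS.items() if v in ps} for v in PREDS
}

# State: (scheduled set, last issue cycle, pending (node, finish) pairs, time so far)
State = Tuple[FrozenSet[str], int, FrozenSet[Tuple[str, int]], int]


def ready(scheduled: FrozenSet[str]) -> List[str]:
    """Zero-indegree set: unscheduled nodes whose predecessors are all scheduled."""
    return [v for v in sorted(PREDS)
            if v not in scheduled and PREDS[v] <= scheduled]


def extend(state: State, v: str) -> State:
    """Issue v after the partial schedule described by state."""
    scheduled, last_issue, pending, time = state
    finish = dict(pending)
    # In-order single issue: at least one cycle after the previous instruction,
    # and not before all operands of v are available.
    issue = max([last_issue + 1] + [finish[p] for p in PREDS[v]])
    finish[v] = issue + LATENCY[v]
    new_scheduled = scheduled | {v}
    # Keep only results that an unscheduled successor may still wait for.
    new_pending = frozenset((u, f) for u, f in finish.items()
                            if SUCCS[u] - new_scheduled)
    return (new_scheduled, issue, new_pending, max(time, finish[v]))


def optimal_schedule() -> Tuple[int, Tuple[str, ...]]:
    """Expand states in order of increasing time; the first complete one is optimal."""
    start: State = (frozenset(), -1, frozenset(), 0)
    tie = count()  # tie-breaker so the heap never compares states
    heap = [(0, next(tie), start, ())]
    seen = {start}
    while heap:
        time, _, state, order = heapq.heappop(heap)
        if len(state[0]) == len(PREDS):
            return time, order
        for v in ready(state[0]):
            nxt = extend(state, v)
            if nxt not in seen:  # identical states have identical futures
                seen.add(nxt)
                heapq.heappush(heap, (nxt[3], next(tie), nxt, order + (v,)))
    raise ValueError("DAG has a cycle")
```

Because the time of a partial schedule never decreases when it is extended, the first complete schedule taken from the queue has minimal execution time, and states with larger times are expanded only if needed. That is the sense in which ordering by time postpones the combinatorial explosion in this sketch.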
Implementation
- C++
- LEDA
- XML-based architecture description language
- LCC as C front end
Results – Random DAGs
Results – FIR Filter
Basic block   DAG #nodes   Time (archi. 1)   Time (archi. 2)
BB1           16           3.5s              4.0s
BB2           16           8.0s              9.5s
BB3           30           3:21:50.2s        4:40:44.9s
Results – Matrix Multiplication
Basic block      DAG #nodes   Time (archi. 1)   Time (archi. 2)
BB2              30           1:05.0s           1:41.8s
BB2 (unrolled)   40           6:08.5s           9:47.2s
Results – Jacobi Grid Relax.
Basic block     DAG #nodes   Time (archi. 1)   Time (archi. 2)
Loop body (5)   40           1:15.8s           1:31.8s
Loop body (9)   53           1:36:13.2s        2:00:51.5s
Current and Future Work
- Time-space profile for irregular register sets
- Speculative instruction selection
- Extensions of architecture description language
- Beyond basic block level
  - Time-space profiles as connector descriptions
Conclusion
- Goal: fully integrated code generation
- Dynamic programming approach
- Time profiles to compress the solution space
- Improved order of solution space construction
- Feasible for medium-sized basic blocks
- Potential for extensions
- Alternative to ILP
Home page: www.ida.liu.se/~chrke/optimist