06/16/22 \course\cpeg421-08s\Topic-7a.ppt 1 “Rate-Optimal” Resource-Constrained Software Pipelining
Feb 25, 2016
Objectives
• Establish good bounds
- Help compiler writers
- Help architects
• Study pragmatic issues when implemented as a compiler option
• Study its payoffs
A Motivating Example
for i = 0 to n
do
    0: a[i] = X + d[i - 2];
    1: b[i] = a[i] * F + f[i - 2] + e[i - 2];
    2: c[i] = Y - b[i];
    3: d[i] = 2 * c[i];
    4: e[i] = X - b[i];
    5: f[i] = Y + b[i];
end
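Program Representation
L: for (i = 0; i < n; i++) {
     S0: a[i] = X + d[i - 2];
     S1: b[i] = a[i] * F + f[i - 2] + e[i - 2];
     S2: c[i] = Y - b[i];
     S3: d[i] = 2 * c[i];
     S4: e[i] = X - b[i];
     S5: f[i] = Y + b[i];
   }
Data Dependence Graph
[Figure: nodes 0 to 5, one per statement. Edges within an iteration: 0 -> 1, 1 -> 2, 1 -> 4, 1 -> 5, 2 -> 3. Loop-carried edges, each labeled with distance 2: 3 -> 0, 4 -> 1, 5 -> 1.]
Assume all statements have a delay = 1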
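The edges of the data dependence graph can be read off the array subscripts of the loop. A minimal Python sketch, assuming only flow dependences matter and that each array is written by exactly one statement; the `STATEMENTS` encoding and the `build_ddg` helper are illustrative names, not part of the course material:

```python
# Each statement: (array written, [(array read, iteration offset), ...]).
# Offsets come from the subscripts, e.g. d[i - 2] is encoded as ("d", 2).
STATEMENTS = [
    ("a", [("d", 2)]),                      # S0: a[i] = X + d[i-2]
    ("b", [("a", 0), ("f", 2), ("e", 2)]),  # S1: b[i] = a[i]*F + f[i-2] + e[i-2]
    ("c", [("b", 0)]),                      # S2: c[i] = Y - b[i]
    ("d", [("c", 0)]),                      # S3: d[i] = 2 * c[i]
    ("e", [("b", 0)]),                      # S4: e[i] = X - b[i]
    ("f", [("b", 0)]),                      # S5: f[i] = Y + b[i]
]


def build_ddg(statements):
    """Return flow-dependence edges as (producer, consumer, distance)."""
    # Assumption: every array has a single writer inside the loop body.
    writer = {array: idx for idx, (array, _) in enumerate(statements)}
    edges = []
    for consumer, (_, reads) in enumerate(statements):
        for array, distance in reads:
            if array in writer:  # X, Y, F are loop invariants, not arrays
                edges.append((writer[array], consumer, distance))
    return sorted(edges)


DDG = build_ddg(STATEMENTS)
for src, dst, dist in DDG:
    print(f"S{src} -> S{dst}  distance {dist}")
```

The three edges with distance 2 are the loop-carried dependences; the remaining five connect statements of the same iteration.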
Question
• How fast can L run (i.e., what is the optimal computation rate) without resource constraints?
• What is the minimum number of FUs needed to achieve the optimal rate?
Schedule A: t(i, Sj) = 2i + tsj
where: ts0 = 0, ts1 = 1, ts2 = ts4 = ts5 = 2, ts3 = 3

time   iter #1       iter #2       iter #3       . . .
0      S0
1      S1
2      S2, S4, S5    S0
3      S3            S1
4                    S2, S4, S5    S0
5                    S3            S1
6                                  S2, S4, S5
7                                  S3
. . .
Schedule A: # of FUs = 4
Can we do better?
Schedule B: t(i, Sj) = 2i + tsj
where: ts0 = 0, ts1 = 1, ts2 = ts4 = 2, ts3 = ts5 = 3

time   iter #1    iter #2    iter #3    . . .
0      S0
1      S1
2      S2, S4     S0
3      S3, S5     S1
4                 S2, S4     S0
5                 S3, S5     S1
6                            S2, S4
7                            S3, S5
. . .
Schedule B: # of FUs = 3
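Both schedules can be checked mechanically. The sketch below is an illustration under the slides' assumptions (unit delays, II = 2, homogeneous FUs); the edge list is read off the loop body, and the helper names `is_legal` and `fus_needed` are made up for this example. It tests every dependence and counts how many statements share a time slot modulo II:

```python
from collections import Counter

# DDG edges (producer, consumer, distance), read off the loop body.
EDGES = [(0, 1, 0), (1, 2, 0), (1, 4, 0), (1, 5, 0), (2, 3, 0),
         (3, 0, 2), (4, 1, 2), (5, 1, 2)]
II = 2      # initiation interval of schedules A and B
DELAY = 1   # all statements have a delay of 1


def is_legal(offsets, edges=EDGES, ii=II, delay=DELAY):
    """Check t_j - t_i >= delay - II * distance for every edge (i, j)."""
    return all(offsets[j] - offsets[i] >= delay - ii * m for i, j, m in edges)


def fus_needed(offsets, ii=II):
    """Largest number of statements that share a time slot modulo II."""
    slots = Counter(t % ii for t in offsets)
    return max(slots.values())


SCHEDULE_A = [0, 1, 2, 3, 2, 2]  # ts0 .. ts5
SCHEDULE_B = [0, 1, 2, 3, 2, 3]

for name, sched in (("A", SCHEDULE_A), ("B", SCHEDULE_B)):
    print(name, is_legal(sched), fus_needed(sched))
# prints:
# A True 4
# B True 3
```

Moving S5 from offset 2 to offset 3 balances the two slots of the kernel, which is why Schedule B saves one FU at the same rate.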
Problem Statements
Problem 0: Given a loop L, determine a “rate-optimal” schedule for L which uses the minimum # of FUs.
Problem I (FIxed Rate SofTware Pipelining with minimum Resource - FIRST):
Given a loop L and a fixed initiation rate, determine a schedule for L which uses the minimum # of FUs.
Problem II (REsource Constrained SofTware Pipelining - REST):
Given a loop L and the # of FUs, determine an optimal schedule which can run under the given resource constraints.
Assumptions: homogeneous function units
Problem Formulation of FIRST
How to formulate resource constraints into a linear form?
The SWP kernel: the “frustum”
R = Max (width)
Min R
Subject to:
• all dependence constraints
• R is the maximum width
• for a fixed rate (or period)
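Frustum A
Now $R = \max_{r \in [0, II-1]} \sum_{i=0}^{N-1} a_{ri}$
where $a_{ri}$ is the contribution of node i to step r's resource requirement.
Objective: $\min \left( \max_{r} \sum_{i} a_{ri} \right)$, i.e., Min R
[Figure: 0-1 matrix A with one column per node i = 0, 1, 2, ..., N-1 and one row per time step r = 0, 1, ..., II-1; entries of 1 mark the step at which a node starts.]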
FIRST: A Linear Programming Formulation Example
So for node i (with II = 2):
$t_i = k_i \cdot II + (a_{0i}, a_{1i}) \cdot (0, 1)^T$
where $a_{ri} = 1$ means node i starts at step r, and $a_{ri} = 0$ otherwise.

node i:   0     1     2     3     4     5
r = 0:    a00   a01   a02   a03   a04   a05
r = 1:    a10   a11   a12   a13   a14   a15
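To make the decomposition concrete, the sketch below splits the start times of Schedule B into the k_i values and the 0-1 matrix A, then rebuilds each t_i from them. It is an illustration only; the function name `decompose` is an assumption of this example:

```python
def decompose(times, ii):
    """Split each start time t_i into k_i and a one-hot column of A.

    Returns (k, a) with t_i == k[i] * ii + sum(r * a[r][i] for r in range(ii)).
    """
    k = [t // ii for t in times]
    a = [[1 if t % ii == r else 0 for t in times] for r in range(ii)]
    return k, a


II = 2
T = [0, 1, 2, 3, 2, 3]  # first-iteration start times of Schedule B
K, A = decompose(T, II)

# Rebuild t_i from k_i and column i of A to confirm the identity.
REBUILT = [K[i] * II + sum(r * A[r][i] for r in range(II)) for i in range(len(T))]
ROW_SUMS = [sum(row) for row in A]  # FUs needed at each kernel step r

print("K =", K)
for r, row in enumerate(A):
    print(f"a[{r}] =", row)
print("R =", max(ROW_SUMS))
# prints:
# K = [0, 0, 1, 1, 1, 1]
# a[0] = [1, 0, 1, 0, 1, 0]
# a[1] = [0, 1, 0, 1, 0, 1]
# R = 3
```

Each column of A holds exactly one 1, and the largest row sum is the number of FUs the kernel needs.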
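FIRST: A Linear Programming Formulation
Minimize R
Subject to:
R: Max
$R - \sum_{i=0}^{N-1} a_{0i} \ge 0,\ R - \sum_{i=0}^{N-1} a_{1i} \ge 0,\ \ldots,\ R - \sum_{i=0}^{N-1} a_{(II-1)i} \ge 0$
Periodic
$T = II \cdot K + A^T \times [0, 1, \ldots, II-1]^T$
$\sum_{r=0}^{II-1} a_{ri} = 1, \quad \forall i \in [0, N-1]$
Dependence Constraints
$t_j - t_i \ge 1 - II \cdot m_{ij}, \quad \forall (i, j) \in E$
$t_i \ge 0,\ k_i \ge 0,\ a_{ri} \ge 0, \quad \forall i \in [0, N-1],\ r \in [0, II-1]$, and are integers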
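Example Revisited
Min R
Subject to:
$R - (a_{00} + a_{01} + a_{02} + a_{03} + a_{04} + a_{05}) \ge 0$
$R - (a_{10} + a_{11} + a_{12} + a_{13} + a_{14} + a_{15}) \ge 0$
$t_0 = 2 k_0 + 0 \cdot a_{00} + 1 \cdot a_{10}$
$t_1 = 2 k_1 + 0 \cdot a_{01} + 1 \cdot a_{11}$
...
$t_5 = 2 k_5 + 0 \cdot a_{05} + 1 \cdot a_{15}$
$t_1 - t_4 \ge -3$
$t_1 - t_5 \ge -3$
$t_1 - t_0 \ge 1$
...
$a_{00} + a_{10} = 1$
$a_{01} + a_{11} = 1$
...
$a_{05} + a_{15} = 1$
$t_i \ge 0,\ k_i \ge 0,\ a_{ri} \ge 0$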
Solution
Schedule C: t(i, Sj) = 2i + tj
where t0 = 0, t1 = 1, t2 = 2, t3 = 3, t4 = 3, t5 = 2
so, # of FUs = 3!
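For a loop this small, the answer can be cross-checked without an ILP solver. The sketch below swaps in plain exhaustive enumeration of start times for the integer program; it assumes unit delays and that searching offsets below a hypothetical `horizon` of 6 is enough for this example. The name `solve_first` is illustrative:

```python
from itertools import product

# DDG edges (producer, consumer, distance), read off the loop body.
EDGES = [(0, 1, 0), (1, 2, 0), (1, 4, 0), (1, 5, 0), (2, 3, 0),
         (3, 0, 2), (4, 1, 2), (5, 1, 2)]


def solve_first(n, edges, ii, horizon):
    """Try every start-time vector below `horizon`; return (min R, schedule)."""
    best_r, best_times = None, None
    for times in product(range(horizon), repeat=n):
        # Dependence constraints: t_j - t_i >= 1 - II * m_ij.
        if any(times[j] - times[i] < 1 - ii * m for i, j, m in edges):
            continue
        # Row sums of A: number of nodes starting at each step r.
        usage = [0] * ii
        for t in times:
            usage[t % ii] += 1
        if best_r is None or max(usage) < best_r:
            best_r, best_times = max(usage), list(times)
    return best_r, best_times


R, TIMES = solve_first(n=6, edges=EDGES, ii=2, horizon=6)
print("min # of FUs:", R)
print("one optimal schedule:", TIMES)
```

The search confirms that 3 FUs is the minimum at II = 2. Several schedules reach it, including Schedule C, so the schedule printed here need not be the one an ILP solver returns.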
Linear Programming
• A Linear Program is a problem that can be expressed in the following form:
    minimize cx
    subject to
        Ax = b
        x >= 0
  where
    x: vector of variables to be solved
    A: matrix of known coefficients
    c, b: vectors of known coefficients
    cx is called the objective function
    Ax = b is called the constraints
Linear Programming
• “programming” actually means “planning” here.
• Importance of LP
- many applications
- the existence of good general-purpose
techniques for finding optimal solutions
Integer Linear Programming (ILP)
• Integer programming has proved valuable for modeling many diverse types of problems in:
- planning,
- routing,
- scheduling,
- assignment, and
- design.
• Industry applications:
- transportation, energy, telecommunications, and manufacturing
Integer Linear Programming
• Classification:
- mixed integer (only part of the variables are integers)
- pure integer (all variables are integers)
- zero-one (variables only take the value 0 or 1)
• Much harder to solve
• Many existing tools and products
- solve the LP/ILP problem and get an optimal solution
- help to model the problem, and then solve it
Quiz
How to handle cases where d_i >= 1 for node i?
The “Trick”
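At time step r, the # of FUs required by node i:
$\sum_{l=0}^{d_i - 1} a_{((r - l) \% II), i}$
Note: this sum is 1 if actor i requires a FU at time r, or 0 otherwise.
The term with index l is contributed by node i when it started l steps earlier, at step (r - l) % II.
So the objective function becomes:
$\min \max_{r \in [0, II-1]} \sum_{i=0}^{N-1} \sum_{l=0}^{d_i - 1} a_{((r - l) \% II), i}$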
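The sum over the last d_i start steps can be evaluated directly. The following sketch assumes non-pipelined FUs and d_i <= II, so that a node occupies one FU for d_i consecutive steps of the kernel; the three-node example data and the name `fu_usage` are hypothetical:

```python
def fu_usage(offsets, delays, ii):
    """usage[r] = sum over i of sum over l < d_i of a[(r - l) % II][i]."""
    # a[r][i] = 1 iff node i starts at step r of the kernel.
    a = [[1 if t % ii == r else 0 for t in offsets] for r in range(ii)]
    usage = []
    for r in range(ii):
        total = 0
        for i, d in enumerate(delays):
            # Node i is busy at step r if it started at any of the
            # last d steps, wrapping around the kernel boundary.
            total += sum(a[(r - l) % ii][i] for l in range(d))
        usage.append(total)
    return usage


# Hypothetical example: three nodes, II = 4, delays 2, 1 and 2.
USAGE = fu_usage(offsets=[0, 1, 3], delays=[2, 1, 2], ii=4)
print(USAGE)       # [2, 2, 0, 1]
print(max(USAGE))  # 2
```

The third node starts at step 3 and is still busy at step 0 of the next kernel instance, which is what the modulo index captures.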
How to extend the FIRST formulation to heterogeneous
function units?
Heterogeneous Function Units
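Solution Hints: “weighted sum”!
For FU of type k:
$M_k = \max_{r \in [0, II-1]} \sum_{i:\, FU(i) = k} \sum_{l=0}^{d_i - 1} a_{((r - l) \% II), i}$
and so, the objective should be
$\min \sum_{k=0}^{h-1} C_k M_k$
subject to
$M_k - \sum_{i:\, FU(i) = k} \sum_{l=0}^{d_i - 1} a_{((r - l) \% II), i} \ge 0, \quad \forall k \in [0, h-1],\ r \in [0, II-1]$
where h is the total # of FU types, C_k is the weight of FU type k, and FU(i) = k means node i executes on an FU of type k.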
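A small sketch of the weighted-sum objective, evaluated for one given schedule. The assignment of statements to FU types and the weights below are invented for illustration (S1 and S3 on a "multiplier" type 1 with weight 3, everything else on an "adder" type 0 with weight 1); `fu_cost` is likewise a hypothetical helper:

```python
def fu_cost(offsets, delays, fu_type, cost, ii):
    """Return (M, total): M[k] is the peak demand for FU type k,
    total is the weighted sum of C_k * M_k over all types."""
    demand = [[0] * ii for _ in cost]
    for t, d, k in zip(offsets, delays, fu_type):
        # A node started at t occupies its FU at steps t, ..., t + d - 1 (mod II).
        for l in range(d):
            demand[k][(t + l) % ii] += 1
    peaks = [max(row) for row in demand]
    total = sum(c * m for c, m in zip(cost, peaks))
    return peaks, total


# Schedule B of the motivating example, unit delays, II = 2.
PEAKS, TOTAL = fu_cost(
    offsets=[0, 1, 2, 3, 2, 3],
    delays=[1] * 6,
    fu_type=[0, 1, 0, 1, 0, 0],  # assumption: S1 and S3 use type 1
    cost=[1, 3],                 # assumption: C_0 = 1, C_1 = 3
    ii=2,
)
print(PEAKS, TOTAL)  # [3, 2] 9
```

An ILP solver would minimize this total over all legal schedules; the sketch only scores a single schedule.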
Solution of the REST
(Resource-Constrained Rate-Optimal Software Pipelining) Problem
Hints
Based on the scheme for “FIRST”, and play “trial-and-improve”
Solution Space for REST
MII = max {RecMII, ResMII}
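Note: The bounds of initiation interval
1. Loop-carried dependence constraints:
$RecMII = \max_{\text{cycles } C} \frac{d(C)}{m(C)}$
2. Resource constraints:
$ResMII = \frac{\#\text{ of nodes}}{\#\text{ of FUs}}$
As a result:
$MII = \max \{ RecMII, ResMII \}$
where d(C) is the total delay and m(C) is the total dependence distance around cycle C.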
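Both bounds are easy to compute for the motivating example. The sketch below enumerates the simple cycles of the DDG by depth-first search, which is fine for six nodes but not a scalable method, and rounds both bounds up because II is taken to be an integer here. The function names are illustrative:

```python
from fractions import Fraction
from math import ceil

# DDG edges (producer, consumer, distance) of the motivating example.
EDGES = [(0, 1, 0), (1, 2, 0), (1, 4, 0), (1, 5, 0), (2, 3, 0),
         (3, 0, 2), (4, 1, 2), (5, 1, 2)]


def rec_mii(n, edges, delays):
    """Max over simple cycles C of d(C) / m(C), found by DFS enumeration."""
    succ = {i: [] for i in range(n)}
    for i, j, m in edges:
        succ[i].append((j, m))
    best = Fraction(0)

    def walk(start, node, delay_sum, dist_sum, visited):
        nonlocal best
        for nxt, m in succ[node]:
            d, dist = delay_sum + delays[node], dist_sum + m
            if nxt == start:
                if dist > 0:  # a legal loop has no zero-distance cycle
                    best = max(best, Fraction(d, dist))
            elif nxt > start and nxt not in visited:
                # Only visit nodes above `start`, so each cycle is
                # enumerated once, from its smallest node.
                walk(start, nxt, d, dist, visited | {nxt})

    for s in range(n):
        walk(s, s, 0, 0, {s})
    return best


def mii(n, edges, delays, num_fus):
    """MII = max(RecMII, ResMII), both rounded up to integers."""
    return max(ceil(rec_mii(n, edges, delays)), ceil(n / num_fus))


print(rec_mii(6, EDGES, [1] * 6))  # 2
print(mii(6, EDGES, [1] * 6, 3))   # 2
print(mii(6, EDGES, [1] * 6, 2))   # 3
```

The critical cycle is 0 -> 1 -> 2 -> 3 -> 0, with total delay 4 and total distance 2, so RecMII = 2; the two short cycles through S4 and S5 only give a ratio of 1.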
Other Work
• Extend the technique to “more benchmarks” and establish “bounds”
• Extend the framework to include register constraints
• Extend the model for pipelined architectures (with structural “hazards”)
• Extend the model to consider superscalar and multithreaded architectures with dynamic scheduling support