Partitioning Loops with Variable Dependence Distances

Yijun Yu and Erik D’Hollander
Department of Electronics and Information Systems
University of Ghent, Belgium

Introduction

1. Overview
2. Dependence analysis: pseudo distance matrix (PDM)
3. Loop transformations: unimodular and partitioning
4. Results
5. Conclusion

1. Overview

• Loop with linear array subscripts
• Solve dependence equation
• Find all non-constant distances
• Create maximally covering grid and base-vectors
• Create the pseudo distance matrix, PDM, containing all base-vectors of the covering grid
• Find independent loops or independent partitions, based on the rank of PDM

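Approach

[Figure: flowchart of the approach. Dependence analysis: the linear dependence equation yields either a uniform (constant) distance or a variable (non-constant) distance. Loop transformation: if rank(H) < loop depth (non-full rank), a unimodular transformation is applied; if H has full rank and det(H) > 1, a partitioning transformation is applied. Both paths end in loop parallelization.]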

2. Dependence Analysis

L1: do I1= -N,N
L2:   do I2= -N,N
        A(4I1-I2+3,2I1+I2-2)=…
        …=A(I1+I2-1,I1-I2+2)
      enddo
    enddo

The loop body has the form A[f(I)]=… …=A[g(I)]. Iterations i=(I1,I2) and j=(J1,J2) are dependent when f(i)=g(j), that is, iA+a = jB+b:

4I1-I2+3=J1+J2-1
2I1+I2-2=J1-J2+2

i          j          A[f(i)]=A[g(j)]   d=|j-i|
(1, -5)    (3, 10)    A[12, -5]         (2, 15)
(3, 0)     (9, 7)     A[15, 4]          (6, 7)
(-3, -3)   (-9, 4)    A[-6, -11]        (6, -7)
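The rows of the table can be reproduced by brute force. The following is a minimal sketch, assuming a small bound N = 10 and hypothetical helper names (`f`, `g`, `dependent_pairs`); it reports the raw difference j - i, whereas the slide's d=|j-i| appears to orient each distance so that it is lexicographically positive.

```python
# Brute-force check of the example loop's dependence equation f(i) = g(j).
# f and g are the subscript functions of the written and the read reference.

def f(i1, i2):
    """Subscripts written by iteration (i1, i2): A(4*I1-I2+3, 2*I1+I2-2)."""
    return (4 * i1 - i2 + 3, 2 * i1 + i2 - 2)

def g(j1, j2):
    """Subscripts read by iteration (j1, j2): A(J1+J2-1, J1-J2+2)."""
    return (j1 + j2 - 1, j1 - j2 + 2)

def dependent_pairs(n):
    """Return all (i, j, j - i) with f(i) == g(j) inside the [-n, n]^2 space."""
    space = [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)]
    readers = {}
    for j in space:
        readers.setdefault(g(*j), []).append(j)
    pairs = []
    for i in space:
        for j in readers.get(f(*i), []):
            pairs.append((i, j, (j[0] - i[0], j[1] - i[1])))
    return pairs

pairs = dependent_pairs(10)  # N = 10 is an arbitrary illustrative bound
```

For the third table row the sketch yields j - i = (-6, 7), which corresponds to the tabulated (6, -7) after orientation.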

The dependence distance

1. The linear dependence equation:

$$iA + a = jB + b$$

2. Using Banerjee’s unimodular transformation U to obtain an echelon matrix S, the equation t S=(b-a) is solved, yielding:

$$(i, j) = tU, \qquad i = tU_l, \qquad j = tU_r$$

3. The distance between dependent iterations i, j is:

$$d = |j - i| = |t(U_r - U_l)| = |tF|$$

• Ul and Ur are the left and right halves of U
• t has a constant part t1 and an unknown part t2

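The distance set

1. From the dependence equation t S = (b-a), the solution vector t contains a constant and an arbitrary part:

$$t = (t_1, t_2), \qquad \text{where } t_1 = \text{const},\ t_2 = \text{variable}$$

2. Matrix F=Ur-Ul can be vertically separated into two sub-matrices:

$$F = \begin{pmatrix} F' \\ F'' \end{pmatrix}, \qquad d = tF = t_1 F' + t_2 F''$$

3. The distance set of the dependence equations is:

$$\{\, d = xR \mid d \succ 0,\ x_i \in \mathbb{Z} \,\}, \qquad \text{where } R = F'' \text{ or } R = \begin{pmatrix} t_1 F' \\ F'' \end{pmatrix}$$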

Distances in the iteration space

• Iteration space (i1,i2) of loop 1 with dependence equations:
  4I1-I2+3=J1+J2-1
  2I1+I2-2=J1-J2+2
• The arrows (I1,I2)→(J1,J2) represent the distance vectors between dependent iterations.

Distance base vectors

1. The dependence distance is non-constant for the reference pair, e.g. (2,15), (6,7), (6,-7), as highlighted.
2. However, the distance set is spanned by the grid generated by the base vectors (2,1) and (0,2).
3. For example:
   (2,15) = (2,1) + 7 (0,2)
   (6, 7) = 3 (2,1) + 2 (0,2)
   (6,-7) = 3 (2,1) - 5 (0,2)
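The decompositions above can be checked mechanically. This is a small sketch, assuming the two base vectors quoted on the slide; the helper name `lattice_coords` is hypothetical. It solves d = x1 (2,1) + x2 (0,2) exactly and accepts d only when both coefficients are integers.

```python
from fractions import Fraction

BASE = ((2, 1), (0, 2))  # base vectors quoted on the slide

def lattice_coords(d, base=BASE):
    """Solve d = x1*base[0] + x2*base[1]; return (x1, x2) or None if not integral."""
    (a, b), (c, e) = base
    det = a * e - b * c
    if det == 0:
        raise ValueError("base vectors must be linearly independent")
    # Cramer's rule for the row-vector system (x1, x2) * [[a, b], [c, e]] = d
    x1 = Fraction(d[0] * e - d[1] * c, det)
    x2 = Fraction(a * d[1] - b * d[0], det)
    if x1.denominator != 1 or x2.denominator != 1:
        return None  # d is not a grid point
    return (int(x1), int(x2))

coords = [lattice_coords(d) for d in [(2, 15), (6, 7), (6, -7)]]
```

A vector such as (1, 0) is rejected, which illustrates that the grid is strictly coarser than the full iteration space.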

The largest base vectors

The distance set is the linear combination of the row vectors in R:

$$\{\, d = xR \mid d \succ 0,\ x_i \in \mathbb{Z} \,\}$$

A lattice L(R) is a group of vectors generated by all the integer linear combinations of the independent row vectors of a matrix R:

$$L(R) = \{\, xR \mid x_i \in \mathbb{Z} \,\}$$

We look for the smallest lattice L(R) (generating the largest grid) which covers the whole distance set:

$$\{\, d = xR \mid d \succ 0,\ x_i \in \mathbb{Z} \,\} \subseteq L(R)$$

In this way, possible spurious dependencies introduced by replacing the distance set with a lattice are minimized.

Pseudo Distance Matrix (PDM)

• A Hermite normal form HNF(R) is a full row rank matrix reduced from the echelon form of R by unimodular transformation:

$$H = \mathrm{HNF}(R)$$

• Therefore H generates the same lattice as R does, that is, the smallest lattice. In addition, the HNF rows are base vectors:

$$L(H) = L(R)$$

• H is called the pseudo distance matrix (PDM), because it generates the distance set from its row vectors.
• Since the row vectors of H are constant, the techniques developed for uniform distance dependence matrices may apply.
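One way to compute such a matrix is integer row reduction. The sketch below is an illustration, not necessarily the authors' procedure: it uses only unimodular row operations (swap, negate, add an integer multiple of another row), so the lattice is preserved. The function name `hermite_normal_form` and the chosen convention (upper triangular, positive pivots, entries above a pivot reduced) are assumptions; HNF conventions vary between texts.

```python
def hermite_normal_form(rows):
    """Row-style HNF of an integer matrix via unimodular row operations.

    Returns the nonzero rows, which form a basis of the lattice generated
    by the input rows.
    """
    m = [list(r) for r in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        # Euclidean elimination: keep reducing by the smallest entry until
        # at most one nonzero entry remains in this column.
        while True:
            nonzero = [r for r in range(pivot_row, len(m)) if m[r][col] != 0]
            if len(nonzero) <= 1:
                break
            p = min(nonzero, key=lambda r: abs(m[r][col]))
            for r in nonzero:
                if r != p:
                    q = m[r][col] // m[p][col]
                    m[r] = [x - q * y for x, y in zip(m[r], m[p])]
        if not nonzero:
            continue  # no pivot in this column
        p = nonzero[0]
        m[pivot_row], m[p] = m[p], m[pivot_row]
        if m[pivot_row][col] < 0:
            m[pivot_row] = [-x for x in m[pivot_row]]
        # Reduce the entries above the pivot into the range [0, pivot).
        for r in range(pivot_row):
            q = m[r][col] // m[pivot_row][col]
            m[r] = [x - q * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m[:pivot_row]

H = hermite_normal_form([[0, 4], [2, 1], [0, 2]])
```

With the three generating rows used here, the result is the pair of base vectors (2,1) and (0,2) of the running example.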

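Calculating the PDM

1. Solving the linear dependence equations:

$$(i_1, i_2)\begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix} + (3, -2) = (j_1, j_2)\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} + (-1, 2)$$

2. Expressing the distance set:

$$R = \begin{pmatrix} 0 & 4 \\ 2 & 1 \\ 0 & 2 \end{pmatrix}$$

3. Finding the largest base vectors:

$$H = \mathrm{HNF}(R) = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} = \mathrm{PDM}$$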
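For this small example the distance can also be obtained by direct elimination, without the unimodular machinery. The derivation below is a worked check under that shortcut, not the general method of the slides; it shows why every distance lies on the grid generated by (2,1) and (0,2).

```latex
\begin{aligned}
\text{sum of the two equations:}\quad & 6I_1 + 1 = 2J_1 + 1 && \Rightarrow\ J_1 = 3I_1 \\
\text{difference of the two equations:}\quad & 2I_1 - 2I_2 + 5 = 2J_2 - 3 && \Rightarrow\ J_2 = I_1 - I_2 + 4 \\
j - i &= (2I_1,\ I_1 - 2I_2 + 4) \\
      &= (0,4) + I_1\,(2,1) - I_2\,(0,2) \\
      &= I_1\,(2,1) + (2 - I_2)\,(0,2)
\end{aligned}
```

Substituting (I1,I2) = (1,-5) gives (2,15), in agreement with the first row of the dependence table.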

3. Loop transformations: unimodular and partitioning

Legality
Any transformation should be legal, i.e. preserve the execution order of dependent iterations.

Transformations depending on rank(H):
3.1 Unimodular transformation: non-full rank PDM
3.2 Partitioning transformation: full rank PDM
3.3 Combined approach

3.1 Unimodular transformation

• Given a non-full rank (r < m) pseudo distance matrix H, a unimodular matrix T can be found such that the first m-r columns of HT are zero.

• As a result, m-r outermost loops can be parallelized.
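As an illustration of the statement above, the sketch below takes the rank-1 PDM H = (2,2) from the non-full rank result and a candidate T built from two elementary unimodular matrices. The specific factors are an assumption made for this example; the helper names `matmul` and `det2` are hypothetical.

```python
def matmul(a, b):
    """Plain integer matrix product of two lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def det2(t):
    """Determinant of a 2x2 matrix."""
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

H = [[2, 2]]                      # rank r = 1, loop depth m = 2
SKEW = [[1, -1], [0, 1]]          # assumed factor: maps (2,2) to (2,0)
SWAP = [[0, 1], [1, 0]]           # assumed factor: maps (2,0) to (0,2)
T = matmul(SKEW, SWAP)
HT = matmul(H, T)                 # first m - r = 1 column should be zero
```

Because det(T) is -1, T is unimodular, and the zero first column of HT means that no dependence is carried by the outermost transformed loop.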

3.2 Partitioning transformation

• Given a full rank pseudo distance matrix H, the loop nest can be partitioned such that det(H) partitions are found.

• The partitioned parallelism is det(H).
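A partition can be read as a coset of the lattice generated by the rows of H. The following sketch assumes the example PDM with rows (2,1) and (0,2) and uses a closed form for the dependent iteration, j = (3 I1, I1 - I2 + 4), obtained by solving the example's two dependence equations by hand; `partition_of` and `sink_of` are hypothetical names.

```python
def partition_of(i1, i2):
    """Coset of iteration (i1, i2) modulo the lattice generated by (2,1), (0,2)."""
    o1 = i1 % 2
    k = (i1 - o1) // 2      # number of (2,1) steps away from column o1
    o2 = (i2 - k) % 2
    return (o1, o2)

def sink_of(i1, i2):
    """Iteration j with f(i) = g(j) for the example loop (derived by elimination)."""
    return (3 * i1, i1 - i2 + 4)

N = 6  # arbitrary illustrative bound
space = [(a, b) for a in range(-N, N + 1) for b in range(-N, N + 1)]
partitions = {partition_of(*i) for i in space}
same_partition = all(partition_of(*i) == partition_of(*sink_of(*i)) for i in space)
```

The number of distinct partitions equals det(H) = 4, and each dependent pair stays inside one partition, so the partitions can run independently.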

3.3 Combined approach

• After a unimodular transformation on a non-full rank PDM, the transformed PDM matrix has a full rank sub-matrix, S.

• When det(S) > 1, additional parallelism can be found using the loop partitioning transformation.

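4. Results (1) Non-full rank PDM

L1: do I1=-N,N
L2:   do I2=-N,N
        A(3I1+1,2I1+I2-1)=…
        …=A(I1+3,I2+1)
      enddo
    enddo

PDM = (2,2) → (2,0) → (0,2), using the unimodular transformation

$$T = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 1 & 0 \end{pmatrix}$$

L’1: doall J1=-2N,2N
L’2:   do J2=max(-N,-N-J1),min(N,N-J1)
         I1=J2
         I2=J1+J2
         A(3I1+1,2I1+I2-1)=…
         …=A(I1+3,I2+1)
       enddo
     enddoall

[Figure: dependence graphs for the non-full rank case, with axis j1.]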
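The transformed nest L’1/L’2 can be emulated to check two properties: it visits exactly the original iterations, and both ends of every dependence share the same J1. This is a sketch under stated assumptions: N = 5 is arbitrary, I1 = J2 is taken from the combined version of the loop, and `sink_of` is a hand-derived solution of 3I1+1 = J1+3, 2I1+I2-1 = J2+1 in the original index space.

```python
def transformed_iterations(n):
    """List (J1, (I1, I2)) in the order produced by the transformed nest."""
    out = []
    for j1 in range(-2 * n, 2 * n + 1):                          # doall J1
        for j2 in range(max(-n, -n - j1), min(n, n - j1) + 1):   # do J2
            i1 = j2
            i2 = j1 + j2
            out.append((j1, (i1, i2)))
    return out

def sink_of(i1, i2):
    """Original-space iteration that reads the element written by (i1, i2)."""
    return (3 * i1 - 2, 2 * i1 + i2 - 2)

N = 5
visited = transformed_iterations(N)
owner = {it: j1 for j1, it in visited}   # outer-loop index of each iteration
original = {(a, b) for a in range(-N, N + 1) for b in range(-N, N + 1)}
carried_inside = all(owner[i] == owner[sink_of(*i)]
                     for i in original if sink_of(*i) in original)
```

Since J1 = I2 - I1 is unchanged along every dependence, all dependences stay inside one iteration of the outer loop, which is consistent with marking it doall.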

4. Results (2) Partitioning

L1: do I1=-N,N
L2:   do I2=-N,N
        A(4I1-I2+3,2I1+I2-2)=…
        …=A(I1+I2-1,I1-I2+2)
      enddo
    enddo

$$\mathrm{PDM} = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}$$

L’1: doall Io1=0,1
L’2:   doall Io2=0,1
L’3:     do I1=-N+mod(N+Io1,2),N-mod(N-Io1,2),2
           Io’2=Io2+(I1-Io1)/2
L’4:       do I2=-N+mod(N+Io’2,2),N-mod(N-Io’2,2),2
             A(4I1-I2+3,2I1+I2-2)=…
             …=A(I1+I2-1,I1-I2+2)
           enddo
         enddo
       enddoall
     enddoall

[Figure: dependence graphs for full rank partitioning.]
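The partitioned nest can be emulated to confirm that the four partitions tile the iteration space and that no dependence crosses a partition boundary. The sketch assumes N = 4 and uses Python's `%`, which always returns a non-negative remainder for a positive modulus; this may differ from a Fortran-style mod on negative arguments. The sink formula (3 I1, I1 - I2 + 4) is a hand-derived solution of the example's dependence equations.

```python
def partitioned_iterations(n):
    """Return {(Io1, Io2): [(I1, I2), ...]} following the partitioned nest."""
    parts = {}
    for o1 in (0, 1):                                   # doall Io1
        for o2 in (0, 1):                               # doall Io2
            body = []
            lo1 = -n + (n + o1) % 2
            hi1 = n - (n - o1) % 2
            for i1 in range(lo1, hi1 + 1, 2):           # do I1, step 2
                o2p = o2 + (i1 - o1) // 2               # Io'2
                lo2 = -n + (n + o2p) % 2
                hi2 = n - (n - o2p) % 2
                for i2 in range(lo2, hi2 + 1, 2):       # do I2, step 2
                    body.append((i1, i2))
            parts[(o1, o2)] = body
    return parts

N = 4
parts = partitioned_iterations(N)
visited = [it for body in parts.values() for it in body]
owner = {it: p for p, body in parts.items() for it in body}
no_crossing = all(owner[i] == owner[(3 * i[0], i[0] - i[1] + 4)]
                  for i in owner if (3 * i[0], i[0] - i[1] + 4) in owner)
```

Each iteration appears in exactly one partition, so the two doall loops expose a parallelism of 4 for this example.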

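4. Results (3) Combined

PDM = (2,2) → (0,2) after the unimodular transformation; the full rank sub-matrix (2) has det 2.

L’1: doall J1=-2N,2N
L’2:   do J2=max(-N,-N-J1),min(N,N-J1)
         I1=J2
         I2=J1+J2
         A(3I1+1,2I1+I2-1)=…
         …=A(I1+3,I2+1)
       enddo
     enddoall

After partitioning with the scaling factor 1/2: (0,2) → (0,1), det 2 → det 1.

L’’1: doall Jo2=0,1
L’’2:   doall J1=-2N,2N
          p2=max(-N,-N-J1)
          q2=min(N,N-J1)
L’’3:     do J2=p2+mod(Jo2-p2,2),q2-mod(q2-Jo2,2),2
            I1=J2
            I2=J1+J2
            A(3I1+1,2I1+I2-1)=…
            …=A(I1+3,I2+1)
          enddo
        enddoall
      enddoall

[Figure: dependence graph for the full rank sub-matrix, with axis j1.]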
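The combined nest L’’1 to L’’3 can be emulated in the same spirit. The sketch assumes N = 5, Python's non-negative `%` in place of the listing's mod, and a hand-derived sink (3I1-2, 2I1+I2-2) for the dependence of this loop; `combined_iterations` is a hypothetical name.

```python
def combined_iterations(n):
    """List ((Jo2, J1), (I1, I2)) following the combined nest."""
    out = []
    for jo2 in (0, 1):                                  # doall Jo2
        for j1 in range(-2 * n, 2 * n + 1):             # doall J1
            p2 = max(-n, -n - j1)
            q2 = min(n, n - j1)
            lo = p2 + (jo2 - p2) % 2
            hi = q2 - (q2 - jo2) % 2
            for j2 in range(lo, hi + 1, 2):             # do J2, step 2
                out.append(((jo2, j1), (j2, j1 + j2)))  # I1=J2, I2=J1+J2
    return out

N = 5
visited = combined_iterations(N)
owner = {it: key for key, it in visited}
original = {(a, b) for a in range(-N, N + 1) for b in range(-N, N + 1)}
independent = all(owner[i] == owner[(3 * i[0] - 2, 2 * i[0] + i[1] - 2)]
                  for i in original
                  if (3 * i[0] - 2, 2 * i[0] + i[1] - 2) in original)
```

Every dependence stays within one (Jo2, J1) pair, so the partitioning doubles the parallelism already obtained from the unimodular step.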

5. Conclusion

• The distances between dependent iterations can be non-constant when the array subscripts are linear.

• A pseudo distance matrix (PDM) with the largest base vectors of the distance space is computed from the linear dependence equations.

• Parallelism can still be exploited for these loops with variable distances by the unimodular and partitioning transformations that are derived from the PDM.
