CR18: Advanced Compilers
L02: Dependence Analysis
Tomofumi Yuki
Today’s Agenda
  Legality of Loop Transformations
    Dependences
    Legality of loop parallelization
    Legality of loop permutation
  Dependence Tests
    How to find dependences?
    Conservative tests
    Exact methods
  Polyhedral Representations
Loop Parallelism
  “Simple” transformation
    for (i=0; i<N; i++) S;
    forall (i=0; i<N; i++) S;
  Not so simple to reason about
    Legality
    Performance impacts
  More complicated cases
    Transform the loops to expose parallelism
Legality of Transformations
  First Rule of Compiler
    preserve original semantics
  Many complications
    loops, parameters, array accesses, branches, pointers, random numbers
    regular subset
Preserving Semantics
  Preserving the order of operations
    one “easy” way to ensure preservation
    dependence is a partial order
  Exceptions?
Dependences
  Express relations between statements
    flow (true) dependence, RAW:  a = ...    then  ... = a
    anti-dependence, WAR:         ... = a    then  a = ...
    output dependence, WAW:       a = ...    then  a = ...
    input dependence, RAR:        ... = a    then  ... = a
Flow vs Anti Dependence
  Why is flow the “true” dependence?
    Flow is value-based
    Anti is memory-based
  for i
    a[i] = ...
    ... = a[i]
  for i
    ... = a[i]
    a[i] = ...
  for i
    ... = a[i]
    b[i] = ...
Dependence Abstractions
  Distance Vector
    distance between write and read
    [i,j] + c, e.g., [0,1]
  Direction Vector
    direction of the instance that uses the value
    one of <, >, ≤, ≥, =, *
    e.g., [0,<]
    less precise, but sometimes sufficient
Direction Vector Example 1
  for (i=0; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] = A[i][0] + B[i][j];
  [Figure: iteration space with axes i and j]
  distance vectors: [0,1], [0,2], [0,3]
  direction vector: [0,<]
Direction Vector Example 2
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] = A[i-1][j+1] + B[i][j];
  [Figure: iteration space with axes i and j]
  distance vector: [1,-1]
  direction vector: [<,>]
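The distance vectors of the two examples can be cross-checked by brute force. The sketch below is only an illustration: the helper `flow_distances` and the small bounds N = M = 4 are assumptions, not part of the lecture. It replays the loop nest, remembers the last writer of every cell, and records reader iteration minus writer iteration for each read of a previously written cell.

```python
def flow_distances(i_start, n, m, read_cell):
    """Replay `A[i][j] = A[read_cell(i, j)] + ...` and collect flow distance vectors."""
    last_writer = {}   # array cell -> iteration that wrote it most recently
    distances = set()
    for i in range(i_start, n):
        for j in range(m):
            cell = read_cell(i, j)
            if cell in last_writer:              # value was produced inside the nest
                wi, wj = last_writer[cell]
                distances.add((i - wi, j - wj))  # reader iteration minus writer iteration
            last_writer[(i, j)] = (i, j)         # the statement then writes A[i][j]
    return distances


# Example 1 reads A[i][0]; Example 2 reads A[i-1][j+1] (hypothetical bounds N = M = 4).
example1 = flow_distances(0, 4, 4, lambda i, j: (i, 0))
example2 = flow_distances(1, 4, 4, lambda i, j: (i - 1, j + 1))
print(sorted(example1), sorted(example2))
```

Enumeration like this only works for fixed small bounds; the tests later in the lecture reason symbolically instead.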
So what do these vectors do?
  Parallelism is clear
    same for direction vectors
  Loop-carried dependence
    the loop at depth d carries a dependence if at least one of the distance/direction vectors has its first non-zero entry at d
  [0,0,1] [0,1,0] [1,1,0]
  [0,0,1] [0,1,1] [0,1,-1]
  [1,0,0] [1,1,0] [1,-1,0]
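A minimal sketch of how the three groups of vectors above can be read, assuming the usual convention that a dependence is carried by the loop of its leading non-zero entry and that a loop is parallel when it carries no dependence (with the enclosing loops kept sequential). The names `carrying_level` and `parallel_loops` are made up for this illustration.

```python
def carrying_level(vec):
    """Index of the first non-zero entry, or None for a loop-independent dependence."""
    for depth, entry in enumerate(vec):
        if entry != 0:
            return depth
    return None


def parallel_loops(vectors, depth):
    """Loops (by depth) that carry none of the given dependences."""
    carried = {carrying_level(v) for v in vectors}
    return [d for d in range(depth) if d not in carried]


groups = [
    [(0, 0, 1), (0, 1, 0), (1, 1, 0)],
    [(0, 0, 1), (0, 1, 1), (0, 1, -1)],
    [(1, 0, 0), (1, 1, 0), (1, -1, 0)],
]
for vectors in groups:
    print(parallel_loops(vectors, 3))
```

Under this convention the first group leaves no parallel loop, the second leaves the outermost loop parallel, and the third leaves the two inner loops parallel.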
Loop Carried Dependence
  Are any of the loops parallel?
  What are the distance vectors?
  for i
    for j
      A[j] = foo(A[j], A[j+1])
Legality of Loop Permutation
  Another application of distance vectors
  Which ones can you permute?
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i-1][j-1] + B[i][j];
  distance vector: [1,1]
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i][j-1] + B[i][j];
  distance vector: [0,1]
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i-1][j+1] + B[i][j];
  distance vector: [1,-1]
Legality of Loop Permutation
  Another application of distance vectors
  Which ones can you permute?
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i-1][j-1] + B[i][j];
  distance vector: [1,1]
  [Figure: iteration space with axes i and j]
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i+1][j-1] + B[i][j];
Legality of Loop Permutation
  Another application of distance vectors
  Which ones can you permute?
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i][j-1] + B[i][j];
  distance vector: [0,1]
  [Figure: iteration space with axes i and j]
Legality of Loop Permutation
  Another application of distance vectors
  Which ones can you permute?
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i-1][j+1] + B[i][j];
  distance vector: [1,-1]
  [Figure: iteration space with axes i and j]
  Fully permutable: [≤,...,≤]
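One way to phrase the permutation check, as a sketch under the assumption that a permutation is legal exactly when every permuted distance vector stays lexicographically non-negative. The function names are hypothetical.

```python
def is_lex_nonnegative(vec):
    """True if the first non-zero entry is positive (or the vector is all zeros)."""
    for entry in vec:
        if entry != 0:
            return entry > 0
    return True


def permutation_is_legal(vectors, perm):
    """perm[k] is the original depth of the loop placed at depth k after permutation."""
    return all(is_lex_nonnegative([vec[p] for p in perm]) for vec in vectors)


# Interchanging i and j for the three loop nests of the slides.
for vec in [(1, 1), (0, 1), (1, -1)]:
    print(vec, permutation_is_legal([vec], (1, 0)))
```

With distance vectors [1,1] and [0,1] the interchange keeps every vector non-negative; [1,-1] becomes [-1,1], so the interchange is rejected.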
Legality of Loop Reversal
  Is this transformation legal?
  for (i=1; i<N; i++)
    for (j=1; j<M; j++)
      A[i][j] = A[i-1][j+1] + B[i][j];
  distance vector: [1,-1]
  for (i=1; i<N; i++)
    for (j=M-1; j>0; j--)
      A[i][j] = A[i-1][j+1] + B[i][j];
  distance vector: [?,?]
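The question can be explored with a small sketch, assuming that reversing a loop negates the corresponding entry of every distance vector and that the result must remain lexicographically non-negative. `reversal_is_legal` is a name invented for this illustration.

```python
def is_lex_nonnegative(vec):
    """True if the first non-zero entry is positive (or the vector is all zeros)."""
    for entry in vec:
        if entry != 0:
            return entry > 0
    return True


def reversal_is_legal(vectors, depth):
    """Reversing the loop at `depth` negates that entry of every distance vector."""
    for vec in vectors:
        reversed_vec = list(vec)
        reversed_vec[depth] = -reversed_vec[depth]
        if not is_lex_nonnegative(reversed_vec):
            return False
    return True


print(reversal_is_legal([(1, -1)], depth=1))  # reverse the j loop
print(reversal_is_legal([(1, -1)], depth=0))  # reverse the i loop
```

For the nest above, reversing j turns [1,-1] into [1,1], while reversing i would give [-1,-1].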
Today’s Agenda
  Legality of Loop Transformations
    Dependences
    Legality of loop parallelization
    Legality of loop permutation
  Dependence Tests
    How to find dependences?
    Conservative tests
    Exact methods
How to Find the Vectors
  Easy case
  Not too easy
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i][j] = A[i-1][j+1] + B[i][j];
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i] = A[i] + B[i][j];
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[i] = A[2*i-j+3] + B[i][j];
How to Find the Vectors
  Really difficult
    for (i=1; i<N; i++)
      for (j=0; j<M; j++) {
        A[i*i+j*j-i*j] = A[i] + B[i][j];
        A[i*j*j-i*j*3] = A[i] + B[i][j];
      }
  No general solution
    polynomial case is undecidable
    can work for linear accesses
    wide range of precision even for the linear case
Dependence: Affine Case
  Given two accesses f(i,j) and g(x,y)
    the two accesses are in conflict if:
      same location: f(i,j) = g(x,y)
      at least one of them is a write
  Let f and g be affine
    a0 + a1*i + a2*j = b0 + b1*x + b2*y
  The last write to a conflicting location is the producer
It is just solving a linear system
  Theoretically it is not that “hard”
Two Directions
  Polyhedral: use PIP and get the exact solution
  Others: less expensive solutions work in practice
Exact Method: Polyhedral Model
  Array Dataflow Analysis [Feautrier 1991]
    Given read and write statement instances r, w
    Find w as a function of r such that
      r and w are in conflict
      w happens-before r
      w is the most recent write
    when everything is affine
  Main Engine
    Parametric Integer Linear Programming
Exact Dependence Analysis
  Who produced the value read at A[j]?
    for (i=0; i<N; i++)
      for (j=0; j<M; j++)
        S: A[i] = A[j] + B[i][j];
  Constraints, with (i’,j’) the writing instance and (i,j) the reading instance
    0≤i,i’<N
    0≤j,j’<M
    i’=j
    (i’,j’) << (i,j)
    obj: max i’*X+j’
  Result
    S<i,j> = if i>j          : S<j,M-1>;
             if i=j and i>0  : S<j,j-1>;
             if j>i or i=j=0 : A[j];
  Powerful but expensive
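A brute-force cross-check of the last-writer question for this loop nest, offered as a sketch rather than as how PIP works: it simply replays `S: A[i] = A[j] + B[i][j]` for small bounds and compares the observed producers with a closed form. The closed form used here is spelled out in `closed_form` (producer `S<j,M-1>` when i>j, `S<j,j-1>` when i=j>0, otherwise the input `A[j]`); both function names are assumptions.

```python
def last_writers(n, m):
    """Replay S: A[i] = A[j] + B[i][j] and record who produced each value read."""
    last = {}       # array cell -> iteration (i, j) of its most recent write
    producer = {}   # reading iteration -> producing iteration, or None for an input value
    for i in range(n):
        for j in range(m):
            producer[(i, j)] = last.get(j)   # S reads A[j] before writing
            last[i] = (i, j)                 # S writes A[i]
    return producer


def closed_form(i, j, m):
    """Producer of the value read by S<i,j>, or None when it is the input A[j]."""
    if i > j:
        return (j, m - 1)
    if i == j and j > 0:
        return (j, j - 1)
    return None


producers = last_writers(4, 3)
print(all(producers[(i, j)] == closed_form(i, j, 3) for (i, j) in producers))
```

The replay needs concrete N and M, whereas the parametric solution holds for all values of the parameters; that is what makes the exact method powerful and expensive.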
ADA Example 1
  What is the PIP problem?
  for (i = 0; i<=N; i++)
    for (j = i; j<=M; j++)
      A[j] = foo(A[j], A[j+1])
ADA Example 2
  What is the PIP problem?
  for (i = 0; i<=N; i++)
    B[j] = foo(...);
  for (j = i; j<=M; j++)
    A[j] = bar(B[j], B[j-1]);
Digression: Multiple Statements
  Within a domain, the order of execution is given by lexicographic order
  What do you do when you have multiple statements?
2d+1 Notation
  A convention to encode statement ordering
    Called by many different names
    the original ADA paper simply said to “use the textual order”
  For a d-dimensional loop nest, use d+1 constant dimensions
    for i
      for j
        S1<i,j>;
      for j
        S2<i,j>;
        S3<i,j>;
    dom(S1) = {0,i,0,j,0 | ...}
    dom(S2) = {0,i,1,j,0 | ...}
    dom(S3) = {0,i,1,j,1 | ...}
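To see why the interleaved constants are enough, the sketch below (with made-up helper names and small hypothetical bounds) builds the 2d+1 tuple of every statement instance of the nest above and checks that sorting the tuples lexicographically reproduces the execution order. Python compares tuples lexicographically, which is the order the notation relies on.

```python
def timestamp(instance):
    """2d+1 tuple of a statement instance of the example nest."""
    stmt, i, j = instance
    return {"S1": (0, i, 0, j, 0),
            "S2": (0, i, 1, j, 0),
            "S3": (0, i, 1, j, 1)}[stmt]


def executed_order(n, m):
    """Statement instances in the order the loop nest actually runs them."""
    order = []
    for i in range(n):
        for j in range(m):
            order.append(("S1", i, j))
        for j in range(m):
            order.append(("S2", i, j))
            order.append(("S3", i, j))
    return order


order = executed_order(2, 3)
print(sorted(order, key=timestamp) == order)
```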
ADA Example 2
  What is the PIP problem?
  for (i = 0; i<=N; i++)
    B[j] = foo(...);
  for (j = i; j<=M; j++)
    A[j] = bar(B[j], B[j-1]);
ADA Example 3
  What is the PIP problem?
  for (t=0; t<=T; t++) {
    for (i=0; i<=N; i++)
      A[i] = foo(B[j]);
    for (j=0; j<=M; j++)
      B[j] = foo(A[i]);
  }
The Omega Test
  Another Variant of ADA
    William Pugh (1991)
    based on Fourier-Motzkin for integers
    Presburger Arithmetic
  Two slightly different branches
    one in the US, the other in France
    we mostly talk about the French stuff, but similar evolution took place with Omega
So what is wrong?
  Can’t we just use this powerful method all the time?
Dependence Tests
  Same setting (conflicting memory accesses)
    f(i,j) = g(x,y)
  Let f and g be affine
    a0 + a1*i + a2*j = b0 + b1*x + b2*y
    linear Diophantine equation
    an integer solution exists if and only if gcd(a1,a2,b1,b2) divides (b0-a0)
GCD Test
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[3*i] = A[6*i-3*j+2] + B[i][j];
  3i = 6x-3y+2
  3i-6x+3y = 2
  gcd(3,6,3) = 3 does not divide 2: no dependence
  for (i=1; i<N; i++)
    for (j=0; j<M; j++)
      A[2*i] = A[4*i-2*j+2] + B[i][j];
  2i = 4x-2y+2
  2i-4x+2y = 2
  gcd(2,4,2) = 2 divides 2: there may be a dependence
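A minimal sketch of the GCD test applied to the two loops above. The equation is passed in the form `c1*x1 + ... + cn*xn = constant`; the function name `gcd_test` is an assumption, while `math.gcd` and `functools.reduce` are standard library calls.

```python
from functools import reduce
from math import gcd


def gcd_test(coeffs, constant):
    """True if c1*x1 + ... + cn*xn = constant may have an integer solution.

    Loop bounds are ignored, so True only means "a dependence cannot be ruled out".
    """
    g = reduce(gcd, (abs(c) for c in coeffs), 0)
    if g == 0:                      # every coefficient is zero
        return constant == 0
    return constant % g == 0


print(gcd_test([3, -6, 3], 2))  # 3i - 6x + 3y = 2
print(gcd_test([2, -4, 2], 2))  # 2i - 4x + 2y = 2
```

The first call rules the dependence out because 3 does not divide 2; the second cannot, because 2 divides 2.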
GCD vs ADA
  ADA is clearly much more precise (exact)
  What can ADA say for the following?
    for (i=1; i<N; i++)
      for (j=0; j<i*i; j++)
        A[i] = foo(A[i])...
Why is GCD Test Inexact?
  When does the GCD test give a false positive?
    What happens when GCD=1?
    for (i=0; i<N; i++)
      for (j=N; j<M; j++)
        A[i] = A[j] + B[i][j];
    GCD test: i = j, a trivial solution exists
  Main problem
    the space is completely unconstrained
Exact vs Exact
  Array Dataflow Analysis
    “exact” dependence analysis
  GCD Test
    inexact dependence test
  Exact Dependence Tests
    no false positives/negatives
    do not necessarily give the producer
Banerjee Test [Banerjee 1976]
  Making it slightly better
  There may be a dependence if
    min(f(i,j)-g(x,y)) ≤ 0, and
    0 ≤ max(f(i,j)-g(x,y))
  for (i=0; i<N; i++)
    for (j=N; j<M; j++) {
      A[i] = A[j] + B[i][j];
    }
  min(i-j) = 0-(M-1) = 1-M
  max(i-j) = (N-1)-N = -1
  max(i-j) < 0, so there is no dependence
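The interval reasoning can be sketched as follows, assuming rectangular loop bounds and an affine difference `f - g` given by its coefficients. The helper names and the concrete values N = 10, M = 20 are hypothetical.

```python
def affine_range(coeffs, constant, bounds):
    """Minimum and maximum of constant + sum(c * x) with each x in [lb, ub]."""
    lo = hi = constant
    for c, (lb, ub) in zip(coeffs, bounds):
        lo += min(c * lb, c * ub)
        hi += max(c * lb, c * ub)
    return lo, hi


def banerjee_may_depend(coeffs, constant, bounds):
    """True if 0 lies between the extremes of f - g, so a dependence cannot be ruled out."""
    lo, hi = affine_range(coeffs, constant, bounds)
    return lo <= 0 <= hi


# f - g = i - j with 0 <= i <= N-1 and N <= j <= M-1, for N = 10, M = 20.
N, M = 10, 20
print(affine_range([1, -1], 0, [(0, N - 1), (N, M - 1)]))
print(banerjee_may_depend([1, -1], 0, [(0, N - 1), (N, M - 1)]))
```

With these bounds the range is (1-M, -1), so zero is never reached and the test reports independence, unlike the GCD test on the same loop.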
Banerjee Test
  Intuition
    interval of 2 functions
Banerjee Test
  Exact or Inexact?
  Weakness?
What happens with 2D arrays?
  How to formulate?
    for (i=0; i<N; i++)
      A[i][j] = A[i+1][j+2];
    given write: A[i][j] and read: A[x+1][y+2]
  How to formulate?
    for (i=0; i<N; i++)
      A[i][i] = A[i+1][i+2];
    given write: A[i][i] and read: A[x+1][x+2]
Dimension-by-Dimension
  Simple extension
    also called subscript-by-subscript
  Given
    A[f1(ivec),f2(ivec),...,fn(ivec)]
    B[g1(jvec),g2(jvec),...,gn(jvec)]
  Check the feasibility of each equation separately:
    f1 = g1, f2 = g2, ..., fn = gn
    if any one of them is infeasible, there is no dependence
Limitations of Dim-by-Dim
  Is there parallelism in this loop nest?
    for (i=0; i<N; i++)
      for (j=0; j<M; j++) {
        A[i][j] = ...
        A[2*j][i] = ...
      }
  “coupled” subscript
  We need to check for feasibility of:
    f1 = g1 ∧ f2 = g2 ∧ ... ∧ fn = gn
Lambda Test [Li et al. 1989]
  Multi-dimensional Banerjee
  Given
    A[f1(ivec),f2(ivec),...,fn(ivec)]
    B[g1(jvec),g2(jvec),...,gn(jvec)]
  Check
How to get Direction Vectors
  Pick a direction vector and then test it!
    only test vectors relevant to the legality
    testing for lex. negative vectors can return true, but makes no sense
  What makes sense for the following?
    for (i=0; i<N; i++)
      for (j=0; j<M; j++) {
        A[i][j] = ...
        A[2*j][i] = ...
      }
Lambda Test
  Let’s try [=,<]
    for (i=0; i<N; i++)
      for (j=0; j<M; j++) {
        A[i][j] = ...
        A[2*j][i] = ...
      }
  [Figure: plot over axes (i,i’) and (j,j’) showing ψ1 and ψ2]
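What a direction-vector test has to decide can be illustrated by brute force. This is not the Lambda test itself, only an enumeration under assumed small bounds: for every pair of instances where `A[i][j]` (at iteration (i,j)) and `A[2*j][i]` (at iteration (i2,j2)) touch the same cell, it records the direction from the first instance to the second. The symbol `<` means the first instance has the smaller index.

```python
def direction_symbol(delta):
    """Direction of target minus source: '<' means the source index is smaller."""
    if delta > 0:
        return "<"
    if delta < 0:
        return ">"
    return "="


def direction_vectors(n, m):
    """Directions between conflicting instances of A[i][j] and A[2*j2][i2]."""
    found = set()
    for i in range(n):
        for j in range(m):
            for i2 in range(n):
                for j2 in range(m):
                    if (i, j) == (2 * j2, i2):   # both accesses hit the same cell
                        found.add((direction_symbol(i2 - i), direction_symbol(j2 - j)))
    return found


vectors = direction_vectors(6, 6)
print(("=", "<") in vectors, sorted(vectors))
```

For these bounds the direction [=,<] never shows up, which matches the algebra: i2 = i together with i = 2*j2 and j = i2 forces j = 2*j2, so j2 > j has no non-negative solution.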
Delta Test [Goff et al. 1991]
  Further extensions for multiple indices
  Pragmatic approach
    key observation: real programs are not that complicated when it comes to array accesses
  1st Step: classify array access (pairs)
    ZIV (Zero Index Variable) pair
    SIV (Single Index Variable) pair
    MIV (Multiple Index Variables) pair
Delta Test Classifications
  ZIV
    e.g., A[N], A[10], ...
    loop invariant
  SIV
    e.g., A[i], A[j], A[i+2], ...
    only one loop iterator
  MIV
    e.g., A[i+j], A[2*i-j], A[i*j], ...
    when two or more iterators are involved
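The classification step can be sketched for linear subscripts as follows. The representation (a dict from iterator name to coefficient, with constants and parameters left out) and the name `classify_pair` are assumptions for this illustration; a non-linear subscript such as A[i*j] does not fit this encoding.

```python
def classify_pair(first, second):
    """Classify a subscript pair by the number of distinct iterators it involves.

    Each subscript is a dict mapping iterator names to non-zero coefficients.
    """
    iterators = {name
                 for subscript in (first, second)
                 for name, coeff in subscript.items()
                 if coeff != 0}
    if not iterators:
        return "ZIV"
    return "SIV" if len(iterators) == 1 else "MIV"


print(classify_pair({}, {}))                      # A[N] vs A[10]
print(classify_pair({"i": 1}, {"i": 1}))          # A[i] vs A[i+2]
print(classify_pair({"i": 2, "j": -1}, {"i": 1})) # A[2*i-j] vs A[i]
```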
Array Access Patterns
  What do they look like in “real-life”?
    1D, 2D, 3D+ arrays
    coupled, separate
    ZIV, SIV, MIV
Delta Test Algorithm
  1. Classify accesses
  2. Solve the easy cases
     if a separable ZIV/SIV subscript proves independence, done
  3. Solve the harder cases
     BUT, some information from Step 2 is used
     constraint intersection/propagation
Constraint Intersection
  It is sometimes easy to show that multiple constraints cannot be satisfied at the same time
  If you have coupled SIV accesses
    e.g., A[i,i] = A[i+1, i+2]
  By analyzing each dim separately, you get
    i’ = i+1 and j’ = i+2
  But you also know that the valid space is
    i’ = j’, i.e., i’ = j’ = i+c
  Intersecting everything gives the empty set
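A minimal sketch of the intersection idea for coupled SIV subscripts that share one iterator and one coefficient per dimension. Each dimension pins down the difference between the writing and the reading iteration; if two dimensions demand different differences, the intersection is empty. Loop bounds are ignored, and the function names and the `(coeff, write_const, read_const)` encoding are assumptions.

```python
from fractions import Fraction


def siv_delta(coeff, write_const, read_const):
    """Solve coeff*w + write_const == coeff*r + read_const for w - r.

    Returns None when the difference is not an integer. coeff must be non-zero.
    """
    delta = Fraction(read_const - write_const, coeff)
    return int(delta) if delta.denominator == 1 else None


def coupled_siv_may_depend(dims):
    """dims holds one (coeff, write_const, read_const) triple per array dimension."""
    deltas = {siv_delta(*dim) for dim in dims}
    return None not in deltas and len(deltas) == 1


# A[i,i] = A[i+1,i+2]: dimension 1 requires a difference of 1, dimension 2 requires 2.
print(coupled_siv_may_depend([(1, 0, 1), (1, 0, 2)]))
```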
Constraint Propagation
  Like intersection, SIV gives partial information
    e.g., A[i,i+j] = A[i+1, i+j]
  i’ = i+1 is derived from the 1st dim
  you then substitute this into the 2nd dim
    A[i’,i’+j’] = A[i+1, i+1+j’]
  Reformulating the 2nd dim gives
    i+1+j’ = i+j
  which yields
    j’ = j-1
Putting it All Together
  Delta-Test aims to take advantage of various properties of how the code is written
    a collection of many small tricks
  It is probably closer to what is in actual compilers than the polyhedral model
Back to Array Dataflow Result
  Another view: PRDG
    Polyhedral Reduced Dependence Graph
    reduced vs extended (recall L01)
  Node: Statement domain
  Edge: Dependence (domain + function)
    for (i=0; i<N; i++) {
      for (j=0; j<P; j++)
        S0: A[j] = B[j]+B[j+1];
      for (j=0; j<Q; j++)
        S1: B[j] = A[j];
    }
  Nodes
    S0: 0≤i<N, 0≤j<P
    S1: 0≤i<N, 0≤j<Q
Polyhedral Objects
  We will usually use ISL syntax
  Set
    [<params>] -> { [<indices>] : <constraints> }
    [N,M] -> { [i,j] : 0<=i<N and 0<=j<M }
  Relation
    [<params>] -> { [<in>] -> [<out>] : <constraints> }
    [N,M] -> { [i,j] -> [x,y] : x=i+1 }
  Function is a special case of relation
    I often use (<indices> → <expression>)
Additional Conventions
  You can name each tuple
    The following are NOT equivalent
      [N,M] -> { S0[i,j] : 0<=i<N and 0<=j<M }
      [N,M] -> { S1[i,j] : 0<=i<N and 0<=j<M }
  Index names DO NOT matter
    The following are equivalent
      [N,M] -> { [i,j] : 0<=i<N and 0<=j<M }
      [N,M] -> { [x,y] : 0<=x<N and 0<=y<M }
  Names of parameters DO matter
Set vs Relations
  They are not really different
    [N] -> { [i,j] -> [x,y] : i=x and j=y }
    [N] -> { [i,j,x,y] : i=x and j=y }
  Mostly for convenience when representing program information
    Ex1. Dependence: S0[i,j] -> S1[i’,j’]
    Ex2. Array access: S0[i,j] -> A[i]
Matrix Representation
  Polyhedral objects are often encoded as matrices
    Ax + b ≥ 0
      A: linear part (matrix)
      x: indices (symbolic vector)
      b: constant (constant vector)
    Pp + Ax + b ≥ 0 to explicitly separate params (p: parameter vector)
    Simply Ax + b for functions
  Algebraic properties of A are often used
Matrix Form Example
  { [i,j] : 0≤i<10 and 0≤j<i }
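One possible encoding of this set in the form Ax + b ≥ 0, given as a sketch: the row order is an arbitrary choice, and the strict inequalities are turned into non-strict ones over the integers (i<10 becomes -i+9 ≥ 0, j<i becomes i-j-1 ≥ 0). The helper names are made up for the illustration.

```python
# Rows of A and entries of b, for x = (i, j):
#   i         >= 0
#  -i     + 9 >= 0
#       j     >= 0
#   i - j - 1 >= 0
A = [[1, 0], [-1, 0], [0, 1], [1, -1]]
b = [0, 9, 0, -1]


def in_polyhedron(point):
    """True if A*point + b >= 0 holds row by row."""
    return all(sum(a * x for a, x in zip(row, point)) + c >= 0
               for row, c in zip(A, b))


def integer_points(lo, hi):
    """Integer points of the polyhedron inside the box [lo, hi) x [lo, hi)."""
    return [(i, j) for i in range(lo, hi) for j in range(lo, hi) if in_polyhedron((i, j))]


points = integer_points(-3, 15)
print(len(points), points[:3])
```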
Integer Set Library
  Tool for manipulating sets and relations
    mostly by Sven Verdoolaege
  Kind of does everything now
    manipulating sets/relations
    scheduling
    code generation
    PIP
    counting integer points
    ...
ISL Demo
  Online interface
    http://www2.cs.kuleuven.be/cgi-bin/dtai/barvinok.cgi
PRDG Example (Dataflow Edge)
  for (i=0; i<N; i++) {
    for (j=0; j<P; j++)
      S0: A[j] = foo(A[j]);
    for (j=0; j<Q; j++)
      S1: A[j] = bar(A[j]);
  }
  Edges
    S0[i,j]→S0[i+1,j] : j≥Q
    S0[i,j]→S1[i,j] : j<Q
    S1[i,j]→S0[i+1,j] : j<P
    S1[i,j]→S1[i+1,j] : j≥P
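The dataflow edges of this example can be checked by replaying the program for small parameter values. In the sketch below (function names and parameter values are assumptions), `dataflow_edges` records producer-to-consumer pairs from a last-writer table, and `expected_edges` enumerates four edge families: S0 to S0 at the next i when j≥Q, S0 to S1 at the same i when j<Q, S1 to S0 at the next i when j<P, and S1 to S1 at the next i when j≥P.

```python
def dataflow_edges(n, p, q):
    """Replay the two inner loops and record (producer, consumer) instance pairs."""
    last = {}       # cell j of A -> statement instance that wrote it most recently
    edges = set()
    for i in range(n):
        for j in range(p):                       # S0: A[j] = foo(A[j])
            if j in last:
                edges.add((last[j], ("S0", i, j)))
            last[j] = ("S0", i, j)
        for j in range(q):                       # S1: A[j] = bar(A[j])
            if j in last:
                edges.add((last[j], ("S1", i, j)))
            last[j] = ("S1", i, j)
    return edges


def expected_edges(n, p, q):
    """Enumerate the four edge families for concrete parameter values."""
    edges = set()
    for i in range(n):
        for j in range(p):
            if j < q:
                edges.add((("S0", i, j), ("S1", i, j)))
            elif i + 1 < n:
                edges.add((("S0", i, j), ("S0", i + 1, j)))
        for j in range(q):
            if i + 1 < n:
                target = "S0" if j < p else "S1"
                edges.add((("S1", i, j), (target, i + 1, j)))
    return edges


print(dataflow_edges(3, 4, 2) == expected_edges(3, 4, 2))
```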
PRDG Example (Dep. Polyhedra)
  for (i=0; i<N; i++) {
    for (j=0; j<P; j++)
      S0: A[j] = foo(A[j]);
    for (j=0; j<Q; j++)
      S1: A[j] = bar(A[j]);
  }
  Nodes
    S0: 0≤i<N, 0≤j<P
    S1: 0≤i<N, 0≤j<Q
  Edges
    i’=i+1, j=j’ : j≥Q
    i’=i, j=j’ : j<Q
    i’=i+1, j=j’ : j<P
    i’=i+1, j=j’ : j’≥P
Uniform vs Affine Dependence
  Uniform dependences
    constant offset: <i,j> → <i,j> + c
    can be described with distance vectors
  Affine dependences
    any affine function: <i,j> → A.[i j]+b
    uniform when A = I
  When do we need affine dependences?
PRDG + Expressions = ???
  PRDG is an abstraction of dependences
    what each statement does is lost
  You may want the expressions in your analysis
    typically when semantic properties are useful
  Polyhedral Equational Model
Alpha Language
  Equational Language
    or PRDG + Expressions
    or Systems of Affine Recurrence Equations
    or dynamic single assignment code
  Basic structure
    declaration of the domain of equations
    affine equations that define the computation performed at each iteration point
Alpha Example
  for (i=1; i<=N; i++) {
    S0: A[i,i] = foo();
    for (j=i+1; j<=M; j++)
      S1: A[i,j] = A[i,j-1] * A[i,i];
  }
  S0 : [N,M] -> { [i] : 1<=i<=N }
  S1 : [N,M] -> { [i,j] : 1<=i<=N and i<j<=M }
  S0[i] = foo();
  S1[i,j] = case
    { : j=i+1 } : S0[i] * S0[i];
    { : j>i+1 } : S1[i,j-1] * S0[i];
  esac;
Role of Alpha in this Course
  Polyhedral Equational Model is not popular within the already niche Polyhedral Model
    I know A LOT about it because my advisor is the main guy working on it
  It is a good IR to look at both dependences and expressions
    it is also suited for teaching some of the aspects
  I sometimes use it to replace PRDG
    but keep in mind that it is a different view of it
Next Time
  Transforming polyhedral representations
  Tiling