Loop Induction Variable Canonicalization
Dec 22, 2015
Outline
• Motivation
• Background: Open64 Compilation Scheme
• Loop Induction Variable Canonicalization
• Project
• Tracing and WHIRL Specification
• Loops
• References
3/27/2008 Copyright © 2008 - Juergen Ributzka. All rights reserved.
Motivation
How to copy one array to another array?
Variant 1:
    for (int i = 0; i < SIZE; i++) {
        p[i] = q[i];
    }

Variant 2:
    int i = 0;
    while (i < SIZE) {
        p[i] = q[i];
        i = i + 1;
    }

Variant 3:
    int *last = &p[SIZE-1];   /* fixed end address; p itself moves */
    while (p <= last) {
        *p++ = *q++;
    }

Variant 4:
    int i = 1;
    if (i <= SIZE) {
        do {
            p[i-1] = q[i-1];
        } while (i++ < SIZE);
    }

One simple problem – many different solutions
Compilers prefer code that is easy to analyze:
    int i = 0;
    while (i < SIZE) {
        p[i] = q[i];
        i = i + 1;
    }

Users want high-performance code:
    int *last = &p[SIZE-1];   /* fixed end address; p itself moves */
    while (p <= last) {
        *p++ = *q++;
    }

The two forms are connected by Compiler Transformation and Compiler Optimization.
• Just one Induction Variable
  – starting at 0
  – stride of 1
• Unified Loop representation

    int iv = 0;
    while (iv <= SIZE-1) {
        p[iv] = q[iv];
        iv = iv + 1;
    }
Background: Open64 Compilation Scheme
Front End
Loop Nest Optimizer (optional)
Global Optimizer
Code Generation
IVR
Loop Induction Variable Canonicalization
• Step 1: Induction Variable Injection
• Step 2: Inserting φ’s and Identity Assignments
• Step 3: Renaming
• Step 4: Induction Variable Analysis and Processing
• Step 5: Copy Propagation and Expression Simplification
• Step 6: Dead Store Elimination
Step 1: Induction Variable Injection
• At this point we only have DO and WHILE loops
  – GOTO statements have been transformed to WHILE loops
• Loops are annotated with details of the high-level loop construct
• Inject a unit-stride induction variable into
  – Non-unit-stride DO loops
  – All WHILE loops
Before:
    p = &a[0];
    while (p <= &a[99]) {
        *p = 0;
        p = p + 1;
    }

After:
    p = &a[0];
    iv = 0;
    while (p <= &a[99]) {
        *p = 0;
        p = p + 1;
        iv = iv + 1;
    }
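The injection step can be sketched in plain C. This is only an illustration, not Open64 code: the function names zero_before and zero_after are hypothetical, and the loop bound is written as a one-past-the-end pointer (rather than the slide's p <= &a[99]) so that an empty array is handled safely.

```c
#include <stddef.h>

/* Original loop shape: zero a[0..n-1] through a moving pointer. */
void zero_before(int *a, size_t n) {
    int *p = &a[0];
    int *end = a + n;            /* one past the last element */
    while (p < end) {
        *p = 0;
        p = p + 1;
    }
}

/* Same loop after injecting a counter iv that starts at 0 and has stride 1.
 * The loop still runs on p; iv only counts iterations at this stage.
 * Returns the final value of iv, which equals the trip count. */
size_t zero_after(int *a, size_t n) {
    int *p = &a[0];
    int *end = a + n;
    size_t iv = 0;               /* injected induction variable */
    while (p < end) {
        *p = 0;
        p = p + 1;
        iv = iv + 1;             /* unit stride */
    }
    return iv;
}
```

For a 100-element array, zero_after returns 100, which is the exit value that later steps assign to iv.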
Step 2: Inserting φ’s and Identity Assignments
Before:
    entry:  p ← &a[0]; iv ← 0
    header: p ≤ &a[99] ?
    body:   *p ← 0; p ← p + 4; iv ← iv + 1
    exit:   …

After (insert φ’s):
    entry:  p ← &a[0]; iv ← 0
    header: iv ← φ(iv, iv); p ← φ(p, p); p ≤ &a[99] ?
    body:   *p ← 0; p ← p + 4; iv ← iv + 1
    exit:   iv ← iv; p ← p
Step 3: Renaming
Before:
    entry:  p ← &a[0]; iv ← 0
    header: iv ← φ(iv, iv); p ← φ(p, p); p ≤ &a[99] ?
    body:   *p ← 0; p ← p + 4; iv ← iv + 1
    exit:   iv ← iv; p ← p

After (rename variables):
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← φ(p1, p3); p2 ≤ &a[99] ?
    body:   *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
    exit:   iv4 ← iv2; p4 ← p2
Step 4: Induction Variable Analysis and Processing
• Process the φ list at the beginning of the loop
• One operand must correspond to the initial value
• The other must be defined in the loop
• Initialize the symbolic expression tree with this operand
• Recursively resolve variables in the expression tree that are not defined by a φ node; a φ-defined variable is resolved only if both φ operands are the same
• All variables in the symbolic expression tree must now be loop invariant or the result of a φ
• i2 is an induction variable if the expression tree has the form i2 ± <expr>, where i2 is a φ result
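A minimal sketch of this resolution in C, under simplifying assumptions that are not from the slides: every in-loop definition is either "variable + constant" or a φ, loop-header φ's are tagged explicitly so the recursion stops at them, and all names (def_kind, struct def, resolve, induction_step) are hypothetical.

```c
enum def_kind { INVARIANT, HEADER_PHI, JOIN_PHI, ADD };

struct def {
    enum def_kind kind;
    int a;    /* PHI: first operand; ADD: source variable          */
    int b;    /* PHI: second operand (HEADER_PHI: defined in loop) */
    long k;   /* ADD: constant added to variable a                 */
};

/* Resolved form of an expression tree: base variable plus constant. */
struct expr { int base; long offset; };

/* Follow definitions until a loop-header phi or an invariant is reached.
 * A join phi is looked through only if both operands resolve identically. */
static struct expr resolve(const struct def *defs, int v) {
    struct expr e = { v, 0 };
    const struct def *d = &defs[v];
    if (d->kind == ADD) {
        e = resolve(defs, d->a);
        e.offset += d->k;
    } else if (d->kind == JOIN_PHI) {
        struct expr x = resolve(defs, d->a);
        struct expr y = resolve(defs, d->b);
        if (x.base == y.base && x.offset == y.offset)
            e = x;
    }
    return e;
}

/* Returns 1 and stores the step if header phi `phi` has the form
 * phi + constant after resolution; returns 0 otherwise. */
int induction_step(const struct def *defs, int phi, long *step) {
    if (defs[phi].kind != HEADER_PHI)
        return 0;
    struct expr e = resolve(defs, defs[phi].b);
    if (e.base != phi)
        return 0;
    *step = e.offset;
    return 1;
}
```

With i2 ← φ(i1, i3), j3 ← i2 + 3 and i3 ← j3 + 2, the sketch reports i2 as an induction variable with step 5, while j2 ← φ(j1, j3) resolves to i2 + 3 and is rejected at this stage.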
Example:
    header: i2 ← φ(i1, i3); j2 ← φ(j1, j3); i2 ≤ 100 ?
    body:   j3 ← i2 + 3; …; i3 ← j3 + 2; …

• i1 and j1 are initial values

Expression tree for i2:
    i2 ← i3
    i2 ← j3 + 2
    i2 ← i2 + 5 (found IV)

Expression tree for j2:
    j2 ← j3
    j2 ← i2 + 3 (can’t resolve i2)
Example:
    header:   i2 ← φ(i1, i5); i2 ≤ 100 ?
    body:     i2 < x ?
    branch A: …; i3 ← i2 + 1
    branch B: …; i4 ← i2 + 1
    join:     i5 ← φ(i3, i4); …

• i1 is the initial value

Expression tree for i2:
    i2 ← i5
    i2 ← φ(i3, i4)
    i2 ← i3 (i3 and i4 are compared and resolve to the same expression)
    i2 ← i2 + 1 (found IV)
• Select Primary Induction Variable
• Compute Trip Count
• Exit Values
    s_exit ← s_init + <tripcount> × s_step
• Define Secondary Induction Variables (s) with Primary Induction Variables (p)
    s ← s_init + (p – p_init) × s_step
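The two formulas can be written directly as C helpers. The function and parameter names are hypothetical, and the second formula is taken as given on the slide, which assumes a unit-stride primary induction variable.

```c
/* Exit value of an induction variable s after the loop:
 * s_exit = s_init + tripcount * s_step */
long exit_value(long s_init, long trip_count, long s_step) {
    return s_init + trip_count * s_step;
}

/* Value of a secondary induction variable s expressed through the
 * primary induction variable p (unit stride assumed):
 * s = s_init + (p - p_init) * s_step */
long secondary_value(long s_init, long s_step, long p, long p_init) {
    return s_init + (p - p_init) * s_step;
}
```

For the running example, a pointer that starts at byte offset 0 and advances by 4 per iteration has offset secondary_value(0, 4, iv, 0) in iteration iv, and exit_value(0, 100, 4) gives the offset of &a[100].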
Before:
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← φ(p1, p3); p2 ≤ &a[99] ?
    body:   *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
    exit:   iv4 ← iv2; p4 ← p2

After (add exit values and replace φ’s):
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; p2 ≤ &a[99] ?
    body:   *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
    exit:   iv4 ← 100; p4 ← &a[100]
Step 5: Copy Propagation and Expression Simplification
• Preorder traversal of the dominator tree
• If a use of x1 is defined by an assignment of the form x1 ← <expr>, then substitute <expr> for it
• Example:
    Before:               After:
    x1 ← i1 + j1          x1 ← i1 + j1
    y2 ← x1 – y1          y2 ← i1 + j1 – y1
    x2 ← y2 + z3          x2 ← i1 + j1 – y1 + z3
Before:
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; p2 ≤ &a[99] ?
    body:   *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
    exit:   iv4 ← 100; p4 ← &a[100]

After (copy propagation):
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; (&a[0] + (iv2 - 0) × 4) ≤ &a[99] ?
    body:   *(&a[0] + (iv2 - 0) × 4) ← 0; p3 ← &a[0] + (iv2 - 0) × 4 + 4; iv3 ← iv2 + 1
    exit:   iv4 ← 100; p4 ← &a[100]
Before:
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; (&a[0] + (iv2 - 0) × 4) ≤ &a[99] ?
    body:   *(&a[0] + (iv2 - 0) × 4) ← 0; p3 ← &a[0] + (iv2 - 0) × 4 + 4; iv3 ← iv2 + 1
    exit:   iv4 ← 100; p4 ← &a[100]

After (simplification):
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← &a[iv2]; iv2 ≤ 99 ?
    body:   a[iv2] ← 0; p3 ← &a[iv2] + 4; iv3 ← iv2 + 1
    exit:   iv4 ← 100; p4 ← &a[100]
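The two rewrites used in the simplification can be checked with a small C sketch. The function names are hypothetical, and sizeof(int) stands in for the constant 4 that the slides use for the element size.

```c
#include <stddef.h>

/* Address as left by copy propagation: &a[0] + (iv - 0) * 4, in bytes. */
const char *addr_unsimplified(const int *a, size_t iv) {
    return (const char *)&a[0] + (iv - 0) * sizeof(int);
}

/* Simplified address: &a[iv]. */
const char *addr_simplified(const int *a, size_t iv) {
    return (const char *)&a[iv];
}

/* Loop test as left by copy propagation: address compared to &a[99]. */
int cond_unsimplified(const int *a, size_t iv) {
    return addr_unsimplified(a, iv) <= (const char *)&a[99];
}

/* Simplified loop test: iv <= 99. */
int cond_simplified(size_t iv) {
    return iv <= 99;
}
```

For a 100-element array and every iv from 0 to 100, both address forms and both loop tests agree, which is why the pointer comparison can be replaced by a comparison on iv.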
Step 6: Dead Store Elimination
• Mark all statements dead, except
  – I/O statements
  – return statements
  – procedure calls
  – statements with side effects (e.g. changes to memory)
• Propagate liveness to the rest of the program
  – for each variable used in a live statement, mark its defining statement alive
  – mark alive the conditional branch on which the statement depends
• Remove statements that have not been marked alive
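The marking phase can be sketched in C as follows. This is an illustration under assumptions, not the Open64 implementation: statements are in SSA form (one definition per variable), each statement uses at most two variables, control dependence is given as a single branch index, and all names (struct stmt, mark_live, NONE) are hypothetical.

```c
#define MAX_USES 2
#define NONE (-1)

struct stmt {
    int def;             /* variable id defined here, or NONE            */
    int use[MAX_USES];   /* variable ids used here, or NONE              */
    int ctrl;            /* index of the controlling branch, or NONE     */
    int side_effect;     /* nonzero: I/O, return, call, or memory store  */
    int live;            /* output of mark_live                          */
};

/* Find the statement that defines var (unique in SSA form). */
static int def_of(const struct stmt *s, int n, int var) {
    for (int i = 0; i < n; i++)
        if (s[i].def == var)
            return i;
    return NONE;
}

/* Mark statement i alive and propagate to its operands and its branch. */
static void mark(struct stmt *s, int n, int i) {
    if (i == NONE || s[i].live)
        return;
    s[i].live = 1;
    for (int u = 0; u < MAX_USES; u++)
        if (s[i].use[u] != NONE)
            mark(s, n, def_of(s, n, s[i].use[u]));
    mark(s, n, s[i].ctrl);
}

/* Start with everything dead, seed with side-effecting statements,
 * and return the number of statements that end up alive. */
int mark_live(struct stmt *s, int n) {
    int count = 0;
    for (int i = 0; i < n; i++)
        s[i].live = 0;
    for (int i = 0; i < n; i++)
        if (s[i].side_effect)
            mark(s, n, i);
    for (int i = 0; i < n; i++)
        count += s[i].live;
    return count;
}
```

Statements whose live flag is still 0 after mark_live are the ones the removal phase would delete.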
Before:
    entry:  p1 ← &a[0]; iv1 ← 0
    header: iv2 ← φ(iv1, iv3); p2 ← &a[iv2]; iv2 ≤ 99 ?
    body:   a[iv2] ← 0; p3 ← &a[iv2] + 4; iv3 ← iv2 + 1
    exit:   iv4 ← 100; p4 ← &a[100]

After (dead store elimination; p1, p2, p3, p4 and iv4 are not used by any live statement, so their definitions are removed):
    entry:  iv1 ← 0
    header: iv2 ← φ(iv1, iv3); iv2 ≤ 99 ?
    body:   a[iv2] ← 0; iv3 ← iv2 + 1
Project/Homework
• Given a loop, trace the intermediate representation (WHIRL) of the Open64 compiler as explained in the next slides. Create a CFG for each trace and explain what changed between consecutive traces. The behavior exposed by your traces will differ in some respects from the one shown in this presentation, since Open64 has evolved over time.
• Is the result optimal?
• What could be improved?
• Extra Credit: Explain how the behavior has changed.
Tracing and WHIRL Specification
• After the Front End
    opencc -c -O3 -show -keep loop1.c
    ir_b2a loop1.B > loop1.t
• After HSSA creation
    opencc -c -O3 -Wb,-tt25:0x0100 -PHASE:w=off filename.c
  (this will give you the trace before and after IVR)
• After Induction Variable Recognition
    opencc -c -O3 -Wb,-tt25:0x0100 -PHASE:w=off filename.c
  (this will give you the trace before and after IVR)
• After Copy Propagation
    opencc -c -O3 -Wb,-tt25:0x0020 -PHASE:w=off filename.c
• After Boolean Simplification
    opencc -c -O3 -Wb,-tt26:0x0004 -PHASE:w=off filename.c
• After Dead Code Elimination
    opencc -c -O3 -Wb,-tt25:0x0080 -PHASE:w=off filename.c
• After each step you will find the trace in filename.t
Example C code:
    int foo (int *p, int size) {
        int sum = 0;
        int i;
        for (i = 0; i < size; i++) {
            sum += p[i];
        }
        return sum;
    }
WHIRL:
    FUNC_ENTRY <1,20,foo>
     IDNAME 0 <2,1,p>
     IDNAME 0 <2,2,size>
    BODY
     BLOCK
     END_BLOCK
     BLOCK
     END_BLOCK
     BLOCK
     PRAGMA 0 120 <null-st> 0 (0x0) # PREAMBLE_END
     LOC 1 4 int sum = 0;
      I4INTCONST 0 (0x0)
     I4STID 0 <2,3,sum> T<4,.predef_I4,4>
     LOC 1 5 int i;
     LOC 1 6
     LOC 1 7 for (i=0; i<size; i++) {
      I4INTCONST 0 (0x0)
     I4STID 0 <2,4,i> T<4,.predef_I4,4>
     WHILE_DO
       I4I4LDID 0 <2,2,size> T<4,.predef_I4,4>
       I4I4LDID 0 <2,4,i> T<4,.predef_I4,4>
      I4I4GT
     BODY
      BLOCK
      LOC 1 8 sum += p[i];
          U8U8LDID 0 <2,1,p> T<28,anon_ptr.,8>
            I8I4LDID 0 <2,4,i> T<9,.predef_U8,8>
           U8I8CVT
           U8INTCONST 4 (0x4)
          U8MPY
         U8ADD
        I4I4ILOAD 0 T<4,.predef_I4,4> T<28,anon_ptr.,8>
        I4I4LDID 0 <2,3,sum> T<4,.predef_I4,4>
       I4ADD
      I4STID 0 <2,3,sum> T<4,.predef_I4,4>
      LOC 1 7
      LABEL L1 0
        I4I4LDID 0 <2,4,i> T<4,.predef_I4,4>
        I4INTCONST 1 (0x1)
       I4ADD
      I4STID 0 <2,4,i> T<4,.predef_I4,4>
      END_BLOCK

[Figure: expression tree of the statement, sum = sum + load(p + 4 * convert(i))]
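To make the address computation in the trace easier to follow, here is a hedged C rendering of the same operations. The function names are hypothetical, the comments map each line to the WHIRL node it mirrors, and sizeof(int) stands in for the constant 4 in the trace; this is a reading aid, not compiler output.

```c
#include <stdint.h>

/* Mirrors the operand tree of "sum += p[i]": load from p + 4 * convert(i). */
int load_like_whirl(const int *p, int i) {
    uint64_t base   = (uint64_t)(uintptr_t)p;          /* U8U8LDID p          */
    int64_t  i8     = (int64_t)i;                      /* I8I4LDID i          */
    uint64_t index  = (uint64_t)i8;                    /* U8I8CVT             */
    uint64_t offset = index * (uint64_t)sizeof(int);   /* U8MPY, U8INTCONST 4 */
    uint64_t addr   = base + offset;                   /* U8ADD               */
    return *(const int *)(uintptr_t)addr;              /* I4I4ILOAD           */
}

/* Mirrors the WHILE_DO form of foo: the test is size > i (I4I4GT). */
int foo_like_whirl(const int *p, int size) {
    int sum = 0;                              /* I4STID sum */
    int i = 0;                                /* I4STID i   */
    while (size > i) {
        sum = load_like_whirl(p, i) + sum;    /* I4ADD, then I4STID sum */
        i = i + 1;                            /* I4ADD, then I4STID i   */
    }
    return sum;
}
```

Note that the front end has already turned the for loop into a WHILE_DO whose condition is written as size > i rather than i < size.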
LOC 1 9 }
LOC 1 10
LOC 1 11 return sum;
I4I4LDID 0 <2,3,sum> T<4,.predef_I4,4>
I4RETURN_VAL
END_BLOCK
Loop 1
    int loop1 (int *p, int size) {
        int i = 0;
        while (i < size) {
            i = i + 3;
            p[i] = 0;
            i = i + 1;
        }
        return 0;
    }
Loop 2
    int loop2 (int *p, int *q, int size) {
        int i;
        for (i=0; i != size; i++) {
            *p = *q;
            p = p + 2;
            q = q + 3;
        }
        return 0;
    }
Loop 3
    int loop3 (int *p, int *q, int size) {
        int i = 0;
        while (i < size) {
            int j = i + 1;
            p[j] = 0;
            i = j + 3;
            q[i] = 1;
        }
        return 0;
    }
Loop 4
    int loop4 (int *a, int size) {
        int *p = a;
        int *q = &a[size];

        while (p != q) {
            *(++p) = 0;
        }
        return 0;
    }
Loop 5
    int loop5 (int *a, int size) {
        int i = 0;

        while (i++ < size) {
            a[i] = 0;
        }
        return 0;
    }
Loop 6
    int loop6 (int *a, int size, int t) {
        int i = 0;
        int sum = 0;

        while (i < size) {
            if (a[i] < t) {
                i = i + 1;
                continue;
            }
            sum += a[i];
            i = i + 1;
        }
        return sum;
    }
Loop 7
    int loop7 (int *a, int size) {
        int i,j;
        int sum = 0;
        int k = 0;

        for (i = 0; i < size; i++) {
            for (j = 0; j < size; j++) {
                sum += a[k];
                k = k + 1;
            }
        }
        return sum;
    }
Acknowledgments
• Dr. Fred Chow (PathScale, LLC)
• Dr. Handong Ye (CAPSL)
References
• Shin-Ming Liu, Raymond Lo and Fred Chow, “Loop Induction Variable Canonicalization in Parallelizing Compilers”
• WHIRL Intermediate Language Specification (http://www.open64.net/documentation/manuals.html)
• How to Debug Open64 (Open64/doc/HOW-TO-DEBUG-OPEN64)