DATA LOCALITY & ITS OPTIMIZATION TECHNIQUES Presented by Preethi Rajaram CSS 548 Introduction to Compilers Professor Carol Zander Fall 2012

Dec 23, 2015


Kelley McCarthy


Why?
• Processor Speed - increasing at a faster rate than the memory speed
• Computer Architectures - more levels of cache memory
• Cache - takes advantage of data locality
• Good Data Locality - good application performance
• Poor Data Locality - reduces the effectiveness of the cache


Data Locality
• The property that references to the same memory location, or to adjacent locations, recur within a short period of time
• Temporal locality: the same location is referenced again soon
• Spatial locality: nearby locations are referenced soon after one another

Fig: Program to find the squares of the differences (a) without loop fusion (b) with loop fusion

[Image from: The Dragon book 2nd edition]
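The figure is not reproduced in this transcript. A minimal sketch of what its caption describes, assuming arrays named x, y and z (names chosen here for illustration): version (a) makes two full passes over z, while version (b) fuses the loops so each z[i] is squared immediately after it is computed, when it is still likely to be in cache. Python is used only to make the loop structure explicit; the cache benefit applies to compiled code working on contiguous arrays.

```python
def squares_of_differences_unfused(x, y):
    """(a) Two loops: z is written in full, then read back in full."""
    n = len(x)
    z = [0] * n
    for i in range(n):
        z[i] = x[i] - y[i]
    for i in range(n):
        z[i] = z[i] * z[i]
    return z


def squares_of_differences_fused(x, y):
    """(b) One loop: each z[i] is reused right after it is produced."""
    n = len(x)
    z = [0] * n
    for i in range(n):
        z[i] = x[i] - y[i]
        z[i] = z[i] * z[i]
    return z
```

Both versions compute the same result; only the distance between the write and the reuse of each z[i] changes.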


Matrix Multiplication - Example

Fig: Basic Matrix Multiplication Algorithm

[Image from: The Dragon book 2nd edition]

• Poor data locality
  • $N^2$ multiply-add operations separate reuses of the same data element in matrix Y
  • N operations separate reuses of the same cache line in Y
• Solutions
  • Changing the layout of the data structures
  • Blocking
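As a rough sketch of the basic algorithm the figure refers to, assuming the usual formulation Z = X × Y on N × N matrices (the names x and z are assumptions; only Y is named in the slide):

```python
def matmul_basic(x, y):
    """Triple-loop matrix multiplication in i, j, k order."""
    n = len(x)
    z = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # y[k][j] walks down a column of y. With row-major storage,
                # consecutive k values fall in different rows, so the same
                # cache line of y is only revisited after about n operations.
                z[i][j] += x[i][k] * y[k][j]
    return z
```

Python lists of lists are not contiguous in memory, so this only illustrates the access order, not the actual cache behaviour.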


Matrix Multiplication - Example Contd…

• Changing the data structure layout
  • Store Y in column-major order
  • Improves reuse of cache lines of matrix Y
  • Limited applicability
• Blocking
  • Changes the execution order of instructions
  • Divide the matrix into submatrices or blocks
  • Order the operations such that an entire block is used over a short period of time
  • Choose B such that one block from each of the matrices fits into the cache

[Image from: The Dragon book 2nd edition]
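A minimal sketch of blocking for matrix multiplication, assuming Z = X × Y on N × N matrices and a block size b (the parameter name and the min() bounds for sizes that are not a multiple of b are choices made here, not taken from the slide):

```python
def matmul_blocked(x, y, b):
    """Blocked matrix multiplication: work proceeds one b-by-b block at a time."""
    n = len(x)
    z = [[0] * n for _ in range(n)]
    # The three outer loops pick one block from each of z, x and y.
    for ii in range(0, n, b):
        for jj in range(0, n, b):
            for kk in range(0, n, b):
                # The three inner loops use those blocks completely
                # before moving on, so their elements are reused soon.
                for i in range(ii, min(ii + b, n)):
                    for j in range(jj, min(jj + b, n)):
                        for k in range(kk, min(kk + b, n)):
                            z[i][j] += x[i][k] * y[k][j]
    return z
```

The result is identical to the unblocked algorithm for any b of at least 1; only the order of the multiply-add operations changes. In practice b would be tuned so that three b-by-b blocks fit in the cache level being targeted.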


Data Reuse
• Locality Optimization
  • Identify the set of iterations that access the same data or the same cache line
• Static access: an instruction in a program, e.g. x = z[i,j]
• Dynamic access: an execution of that instruction, repeated many times as in a loop nest
• Types of Reuse
  • Self: iterations using the same data come from the same static access
  • Group: iterations using the same data come from different static accesses
  • Temporal: the same exact location is referenced
  • Spatial: the same cache line is referenced
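To make the static/dynamic distinction concrete, the following sketch enumerates every dynamic access of one static access in a 2-deep loop nest and groups the iterations by the location they touch. The helper name iterations_by_location and the index_fn callback are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict


def iterations_by_location(n, index_fn):
    """Group the iterations (i, j) of an n-by-n loop nest by accessed location.

    index_fn(i, j) returns the array index tuple used by one static access.
    """
    groups = defaultdict(list)
    for i in range(n):
        for j in range(n):
            groups[index_fn(i, j)].append((i, j))
    return groups


# Static access z[j]: one dimension in a 2-deep nest, so every
# location is touched by n different iterations (self-temporal reuse).
column_access = iterations_by_location(4, lambda i, j: (j,))

# Static access z[i, j]: every iteration touches a distinct location,
# so there is no self-temporal reuse.
full_access = iterations_by_location(4, lambda i, j: (i, j))
```

Here column_access has 4 locations with 4 iterations each, while full_access has 16 locations with a single iteration each.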


Self Temporal Reuse
• Substantial savings in memory accesses are possible by exploiting self reuse
• Data with k dimensions accessed in a loop nest of depth d is reused $n^{d-k}$ times
  e.g. if a 3-deep loop nest accesses one column of an array, there is a potential saving of a factor of $n^2$ accesses
• Dimensionality of access: rank of the coefficient matrix in the access
• Iterations referring to the same location: null space of the matrix
• Rank of a matrix
  • Number of linearly independent rows (equivalently, columns)
• Null space of a matrix
  • A reference in a d-deep loop nest with rank r accesses $O(n^r)$ data elements in $O(n^d)$ iterations, so on average $O(n^{d-r})$ iterations must refer to the same array element

Example:
• Rank = Dimensionality = 2 (2nd row = 1st + 3rd; 4th row = 3rd - 2 × 1st)
• Loop depth = 3, Rank = 2, Nullity = 3 - 2 = 1
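The example matrix itself is an image and is not in the transcript. The sketch below uses a 4 × 3 matrix chosen to satisfy the stated row relations (2nd row = 1st + 3rd, 4th row = 3rd - 2 × 1st); its entries are therefore an assumption. Rank is computed by Gaussian elimination over exact fractions, and nullity as loop depth minus rank.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix via Gaussian elimination with exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


# Assumed example: one row per array dimension, one column per loop index.
coefficients = [
    [1, 2, 3],
    [5, 7, 9],  # 1st row + 3rd row
    [4, 5, 6],
    [2, 1, 0],  # 3rd row - 2 * 1st row
]
loop_depth = 3
access_rank = rank(coefficients)
nullity = loop_depth - access_rank
```

With these values access_rank is 2 and nullity is 1, so on average O(n) iterations refer to the same array element.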


Self Spatial Reuse
• Depends on the data layout of the matrix, e.g. row-major order
• In an array of d dimensions, array elements are taken to share a cache line if they differ only in the last dimension
  e.g. two array elements share the same cache line if and only if they share the same row in a 2-D array
• The truncated matrix is obtained by dropping the last row from the coefficient matrix
• If the resulting matrix has a rank r that is less than the loop depth d, we can assure spatial reuse

Truncated matrix: r = 1, d = 2; r < d assures spatial reuse
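A small model of why layout matters for spatial reuse, assuming row-major storage, 8-byte elements and 64-byte cache lines (all three are assumptions chosen for illustration, as are the helper names). It counts how often consecutive accesses land on a different cache line for a row-wise and a column-wise traversal of an 8 × 8 array.

```python
def cache_line(i, j, ncols, elem_bytes=8, line_bytes=64):
    """Cache line index of element (i, j) in a row-major array."""
    return (i * ncols + j) * elem_bytes // line_bytes


def line_switches(accesses, ncols):
    """Number of consecutive access pairs that fall on different lines."""
    lines = [cache_line(i, j, ncols) for i, j in accesses]
    return sum(1 for a, b in zip(lines, lines[1:]) if a != b)


n = 8
# Innermost loop varies the last coordinate: stays within a row.
row_order = [(i, j) for i in range(n) for j in range(n)]
# Innermost loop varies the first coordinate: jumps between rows.
column_order = [(i, j) for j in range(n) for i in range(n)]

row_switches = line_switches(row_order, n)
column_switches = line_switches(column_order, n)
```

Under these assumptions each row occupies exactly one line, so the row-wise traversal changes line 7 times and the column-wise traversal changes line on every one of its 63 steps.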


Group Reuse
• Group reuse is considered only among accesses in a loop that share the same coefficient matrix

Fig: 2-deep loop nest

[Image from: The Dragon book 2nd edition]

• z[i,j] and z[i-1,j] access almost the same set of array elements
• Data read by access z[i-1,j] is the same as the data written by z[i,j], except for i = 1

Coefficient matrix: Rank = 2, no self-temporal reuse

Truncated matrix: Rank = 1, self-spatial reuse
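The loop nest in the figure is not reproduced. The sketch below assumes it has the form z[i,j] = z[i-1,j] with i and j running from 1 to n, which is consistent with the two accesses named in the bullets but is still an assumption. It collects the elements touched by each access so the overlap can be inspected.

```python
def access_sets(n):
    """Elements read by z[i-1, j] and written by z[i, j] for 1 <= i, j <= n."""
    reads, writes = set(), set()
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            reads.add((i - 1, j))  # access z[i-1, j]
            writes.add((i, j))     # access z[i, j]
    return reads, writes


reads, writes = access_sets(3)
shared = reads & writes          # elements touched by both accesses
read_only = reads - writes       # elements read but never written
```

For n = 3 the only elements read but never written are those in row 0, which are exactly the reads made when i = 1; every other read is of an element the other access writes.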


Locality Optimization
• Temporal locality of data
  • Use the results as soon as they are generated

Fig: Code excerpt for a multigrid algorithm (a) before partition (b) after partition

[Image from: The Dragon book 2nd edition]


Locality Optimization Contd…
• Array Contraction
  • Reduce the dimension of the array and reduce the number of memory locations accessed

Fig: Code excerpt for a multigrid algorithm after partition and after array contraction

[Image from: The Dragon book 2nd edition]
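The multigrid code from the figure is not reproduced here, so the following is a generic sketch of array contraction rather than that excerpt. It assumes a temporary array t whose element t[i] is produced and consumed in the same iteration once the loops are fused; the temporary can then shrink from one dimension to a scalar. All names and the arithmetic are hypothetical.

```python
def with_temporary_array(x):
    """Before contraction: a full temporary array t is written, then read."""
    n = len(x)
    t = [0.0] * n
    for i in range(n):
        t[i] = x[i] * 2.0
    y = [0.0] * n
    for i in range(n):
        y[i] = t[i] + 1.0
    return y


def with_contracted_scalar(x):
    """After fusion and contraction: t[i] is replaced by the scalar t."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        t = x[i] * 2.0  # produced...
        y[i] = t + 1.0  # ...and consumed immediately
    return y
```

The contracted version touches n fewer memory locations and keeps the intermediate value in a register or in cache.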


Locality Optimization Contd…
• Instead of executing each partition one after the other, we interleave a number of the partitions so that reuses among partitions occur close together
• Interleaving inner loops in a parallel loop
• Interleaving statements in a parallel loop

Fig: The statement interleaving transformation [Image from: The Dragon book 2nd edition]

Fig: Interleaving four instances of the inner loop [Image from: The Dragon book 2nd edition]
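A sketch of interleaving inner loops, assuming a 2-deep nest whose outer loop over i is parallel (its iterations are independent, so they may be reordered). The outer loop is split into strips of four, and the strip loop is moved inside the j loop, so four instances of the inner loop advance together. The functions only record the visiting order; the names and the width parameter are illustrative.

```python
def original_order(n):
    """Iteration order of: for i: for j: S(i, j)."""
    return [(i, j) for i in range(n) for j in range(n)]


def interleaved_order(n, width=4):
    """Iteration order after interleaving `width` instances of the inner loop."""
    order = []
    for ii in range(0, n, width):                      # one strip of outer iterations
        for j in range(n):                             # former inner loop
            for i in range(ii, min(ii + width, n)):    # iterations within the strip
                order.append((i, j))
    return order
```

Both orders visit exactly the same iterations. In the interleaved order, iterations (0, j), (1, j), (2, j) and (3, j) run back to back, so data they share is reused while it is still in cache.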


References
• Wolf, Michael E., and Monica S. Lam. "A data locality optimizing algorithm." ACM SIGPLAN Notices 26.6 (1991): 30-44.
• McKinley, Kathryn S., Steve Carr, and Chau-Wen Tseng. "Improving data locality with loop transformations." ACM Transactions on Programming Languages and Systems (TOPLAS) 18.4 (1996): 424-453.
• Bodin, François, et al. "A quantitative algorithm for data locality optimization." Code Generation: Concepts, Tools, Techniques (1992): 119-145.
• Kennedy, Ken, and Kathryn S. McKinley. "Optimizing for parallelism and data locality." Proceedings of the 6th International Conference on Supercomputing. ACM, 1992.
• Aho, A., M. Lam, R. Sethi, and J. Ullman. Compilers: Principles, Techniques, and Tools (2nd edition). Addison-Wesley.


Thank You!

Questions??