Addressing Optimization for Loop Execution Targeting DSP with Auto-Increment/Decrement Architecture. Wei-Kai Cheng, Youn-Long Lin*. Computer & Communications Research Laboratories; *CS Department, NTHU, Taiwan

Dec 20, 2015


Addressing Optimization for Loop Execution Targeting DSP

with Auto-Increment/Decrement Architecture

Wei-Kai Cheng

Youn-Long Lin*

Computer & Communications Research Laboratories

*CS Department, NTHU Taiwan


Overview

Features:
– Auto-Increment/Decrement for Address Generation
– Constraints for Loop Execution

Optimization Methods:
– Multi-Phase Data Ordering
– Graph-Based Address Register Allocation
  » Block Access Graph


Auto-Increment/Decrement


New Constraints

Loop Execution
– Data Ordering Constraint
– Address Register Allocation Constraint

Architectural Constraint
– Different arrays are stored in disjoint memory space
– Multiple auto-increment/decrement ranges in the instruction set architecture
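
To make the last constraint concrete, here is a minimal Python sketch of how the auto-increment/decrement range affects the number of inserted address-update instructions for a single address register. The cost model (a step of at most `modify_range` words between consecutive accesses is free, any larger step costs one inserted instruction) and the names `inserted_updates`, `addresses`, and `modify_range` are illustrative assumptions, not taken from the slides.

```python
def inserted_updates(addresses, modify_range=1):
    """Count explicit address-update instructions for one address register.

    Assumed cost model: moving the register by at most `modify_range`
    words between consecutive accesses is covered by auto-increment or
    auto-decrement; any larger move needs one inserted instruction.
    """
    count = 0
    for prev, curr in zip(addresses, addresses[1:]):
        if abs(curr - prev) > modify_range:
            count += 1
    return count


# One register visiting addresses 0, 1, 3, 2, 5: steps are +1, +2, -1, +3.
seq = [0, 1, 3, 2, 5]
print(inserted_updates(seq, modify_range=1))  # 2 (the +2 and +3 steps)
print(inserted_updates(seq, modify_range=2))  # 1 (only the +3 step)
```

Under this model a wider range removes some inserted instructions without changing the data layout, which is why the available ranges matter when ordering data.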


Data Ordering Constraint


Address Register Allocation Constraint


Architectural Constraint


Data Lists


Approach

Split the access sequence into data lists
– Array
– Iteration Stride

Data Ordering
Address Register Allocation
– Data Lists Merging or Splitting
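
The first step above can be sketched as follows. This is a minimal Python illustration, assuming every access in the loop body has the form `array[stride * i + offset]` for loop index `i`. The `Access` tuple and `split_into_data_lists` are hypothetical names, and keying each data list by the pair (array, iteration stride) is one reading of the two bullets, not a confirmed detail of the method.

```python
from collections import namedtuple

# Hypothetical record for one access array[stride * i + offset].
Access = namedtuple("Access", ["array", "stride", "offset"])


def split_into_data_lists(access_sequence):
    """Group accesses into data lists keyed by (array, iteration stride).

    Within each list the offsets keep their order in the access sequence.
    """
    data_lists = {}
    for acc in access_sequence:
        data_lists.setdefault((acc.array, acc.stride), []).append(acc.offset)
    return data_lists


# Loop body: y[i] = a[i] + a[i + 1] + b[2 * i]
body = [
    Access("a", 1, 0),
    Access("a", 1, 1),
    Access("b", 2, 0),
    Access("y", 1, 0),
]
for key, offsets in split_into_data_lists(body).items():
    print(key, offsets)
```

In this example the two accesses to `a` share one data list, while `b` and `y` each get their own, giving three data lists to order and assign to address registers.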


Access Graph


Data Ordering


Address Register Allocation

# data lists > # address registers:
– data list merging

# data lists < # address registers:
– data list splitting
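
The rule above can be written as a small decision function. This Python sketch only captures the comparison; how lists are chosen for merging or splitting is not described in the text here. The function name `allocation_step` and the handling of the equal case (one register per data list, no change) are assumptions.

```python
def allocation_step(num_data_lists, num_address_registers):
    """Pick the adjustment needed before assigning registers to data lists."""
    if num_data_lists > num_address_registers:
        return "merge"   # too many lists: some must share a register
    if num_data_lists < num_address_registers:
        return "split"   # spare registers: a list can be divided among them
    return "none"        # assumed: one register per list, nothing to do


print(allocation_step(5, 3))  # merge
print(allocation_step(2, 4))  # split
print(allocation_step(3, 3))  # none
```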


Two-Way Data List Splitting


Block Access Graph Construction


Block Access Graph Partition


Experimental Results

[Figure: bar chart, vertical axis 0 to 10, benchmarks fir, iir, dft, fft, edge, dct, wiener. Series: #data lists, #ordering applied.]

* number of data lists and data ordering applied


Experimental Results (Cont.)

[Figure: bar chart, vertical axis 0 to 1, benchmarks fir, iir, dft, fft, edge, dct, wiener. Series: T.o+T.a, O.o+T.a, T.o+O.a, O.o+O.a.]

* ratio over TI’s compiler in terms of inserted instructions

T: TI’s compiler
O: Our algorithm

o: data ordering
a: address register allocation


Experimental Results (Cont.)

[Figure: bar chart, vertical axis 0 to 1, benchmarks fir, iir, dft, fft, edge, dct, wiener. Series: T.o+T.a, O.o+T.a, T.o+O.a, O.o+O.a.]

* ratio over TI’s compiler in terms of execution cycles

T: TI’s compiler
O: Our algorithm

o: data ordering
a: address register allocation


Conclusions

Data ordering is not so effective in loop execution

Data list splitting is more important than data list merging