FFT Accelerator Project
Rohit Prakash
Anand Silodia
Date: June 7th, 2007

Objectives

• Analysis using random input points

• Percentage improvement (over the previous implementations)

• Cache profiling

Improvements

• Fewer calls to sine/cosine

• Separate arrays for powers and some other terms
  – Fewer divisions
  – Fewer multiplications

• Error from last time corrected (FFTW floating point)

System Configuration

• Intel Pentium 4 (HT) 3.0 GHz
• RAM : 1 GB
• Cache : 1 MB L2
• O.S. : Fedora Core 3
• Compiler : icc
• Flags used : -xW, -O3, -ipo-prec-div, -static

User time : vs. FFTW (single precision)

Radix-4 runs 1.5 times slower than FFTW

Radix-8 runs 1.6 times slower than FFTW

User time : previous (double) vs. new (float)

Approximately 20% improvement

User time : previous (double) vs. new (float)

Approximately 19% improvement

Cache Organization

Cache Level    Size     Associativity    Line size (bytes)
L2             1 MB     8-way            64
I1             16 KB    4-way            64
D1             16 KB    4-way            64

Radix-4 L2 misses

Approximately 30% fewer L2 misses

Radix-4 D1 misses

Approximately 1.6% fewer D1 misses

Radix-8 L2 misses

Approximately 13.6% fewer L2 misses

Radix-8 D1 misses

Approximately 0.96% fewer D1 misses

Profiling results: using VTune

Profiling results: using gprof

Profiling results: using VTune

Profiling results: using gprof

Profiling results: using VTune

Profiling results: using gprof

Profiling results: using VTune

Profiling results: using gprof

Profiling results: using VTune

Profiling results: using VTune

Profiling results: using VTune

Profiling results: using VTune

Profiling results: using VTune

Profiling results: using VTune

Profiling results: using gprof

Profiling results: using gprof

Further Improvements : use SSE instructions

• Vectorize the loop

T = A[r]
U = w*A[r+p]
V = w*w*A[r+2*p]
W = w*w*w*A[r+3*p]
----------------------------------
Complex temp[4];
for (i = 1; i < 4; i++) {
    temp[i] = twiddle[i*p] * A[r + i*p];
}

Thank You