The FFT on a GPU Graphics Hardware 2003 July 27, 2003 Kenneth Moreland Edward Angel Sandia National Labs U. of New Mexico Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
The FFT on a GPU
Graphics Hardware 2003
July 27, 2003
Kenneth Moreland, Sandia National Labs
Edward Angel, U. of New Mexico
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
Overview
• Introduction
– Motivation, FFT review.
• FFT Techniques
– Exploitable FFT properties.
• Implementation
• Results
– Performance, applications, conclusions.
Motivation
• The Fourier transform is a principal tool for digital image processing.
– Filtering.
– Correction.
– Compression.
– Classification.
– Generation.
• As such, should not our graphics hardware support such a tool?
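The Discrete Fourier Transform
• Converts data in the spatial or temporal domain into the frequencies the data comprise.
$$\mathcal{F}\{f(x)\} = F(u) = \frac{1}{N}\sum_{x=0}^{N-1} f(x)\,W_N^{ux}$$
$$\mathcal{F}^{-1}\{F(u)\} = f(x) = \sum_{u=0}^{N-1} F(u)\,W_N^{-ux}$$
$$W_N = e^{-j 2\pi / N}$$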
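The definition on this slide can be evaluated directly in O(N²) time, which gives a reference to compare any fast version against. The following is a minimal Python sketch, assuming the 1/N factor sits on the forward transform and W_N = e^{-j2π/N}. The names `dft`, `idft`, and the sample `signal` are illustrative and do not come from the talk.

```python
import cmath

def dft(f):
    """Forward DFT by direct summation, with the 1/N factor on the forward side."""
    N = len(f)
    W = cmath.exp(-2j * cmath.pi / N)  # W_N = e^{-j 2 pi / N}
    return [sum(f[x] * W ** (u * x) for x in range(N)) / N for u in range(N)]

def idft(F):
    """Inverse DFT by direct summation (no scale factor under this convention)."""
    N = len(F)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(F[u] * W ** (-u * x) for u in range(N)) for x in range(N)]

# Hypothetical sample data: a round trip should reproduce the input.
signal = [1.0, 2.0, 0.0, -1.0]
spectrum = dft(signal)
restored = idft(spectrum)
```

Under this convention `spectrum[0]` is the mean of the input, which is a quick sanity check on the scaling.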
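The Discrete Fourier Transform
• 2D transform can be computed by applying the transform in one direction, then the other.
DFT:
$$\mathcal{F}\{f(x,y)\} = F(u,v) = \frac{1}{MN}\sum_{y=0}^{N-1}\sum_{x=0}^{M-1} f(x,y)\,W_M^{ux}\,W_N^{vy}$$
IDFT:
$$\mathcal{F}^{-1}\{F(u,v)\} = f(x,y) = \sum_{v=0}^{N-1}\sum_{u=0}^{M-1} F(u,v)\,W_M^{-ux}\,W_N^{-vy}$$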
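The separability claim can be checked numerically: transforming every row and then every column should match the double sum. The sketch below assumes an image stored as a list of N rows with M values each, and the 1/(MN) forward scaling. `dft`, `dft2`, `dft2_direct`, and the sample `image` are illustrative names, not part of the talk.

```python
import cmath

def dft(f):
    """1D forward DFT by direct summation, scaled by 1/len(f)."""
    N = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * u * x / N) for x in range(N)) / N
            for u in range(N)]

def dft2(image):
    """2D DFT computed separably: along x (rows) first, then along y (columns)."""
    rows = [dft(row) for row in image]               # rows[y][u]
    cols = [dft(list(col)) for col in zip(*rows)]    # cols[u][v]
    return [list(r) for r in zip(*cols)]             # result[v][u]

def dft2_direct(image):
    """2D DFT by the full double sum, used only as a reference."""
    N, M = len(image), len(image[0])
    out = []
    for v in range(N):
        row = []
        for u in range(M):
            total = 0j
            for y in range(N):
                for x in range(M):
                    total += (image[y][x]
                              * cmath.exp(-2j * cmath.pi * u * x / M)
                              * cmath.exp(-2j * cmath.pi * v * y / N))
            row.append(total / (M * N))
        out.append(row)
    return out

# Hypothetical 2-row by 4-column image.
image = [[1.0, 2.0, 3.0, 4.0],
         [0.0, -1.0, 0.5, 2.0]]
separable = dft2(image)
direct = dft2_direct(image)
```

The separable form costs one 1D transform per row plus one per column, instead of a full double sum for every output value.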
The Fast Fourier Transform
• Divide and Conquer Algorithm
– Input sequence is divided into subsequences consisting of values from even and odd indices, respectively.
$$F(u) = F_e(u) + W_N^{u}\,F_o(u)$$
$$f_e(x) = f(2x), \qquad f_o(x) = f(2x+1)$$
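The even/odd split can be written as a short recursive routine. This is a minimal sketch, assuming a power-of-two length, W_N = e^{-j2π/N}, and a single 1/N scaling applied after the recursion so that the combination step needs no extra factor. The function names are illustrative, not the authors' implementation.

```python
import cmath

def fft_recursive(f):
    """Unnormalised radix-2 decimation-in-time FFT; len(f) must be a power of two."""
    N = len(f)
    if N == 1:
        return [complex(f[0])]
    Fe = fft_recursive(f[0::2])   # transform of the even-indexed values
    Fo = fft_recursive(f[1::2])   # transform of the odd-indexed values
    half = N // 2
    out = []
    for u in range(N):
        W_u = cmath.exp(-2j * cmath.pi * u / N)          # W_N^u
        # Fe and Fo are periodic with period N/2, hence the modulo.
        out.append(Fe[u % half] + W_u * Fo[u % half])
    return out

def fft(f):
    """Forward transform with the 1/N factor applied once at the end."""
    N = len(f)
    return [value / N for value in fft_recursive(f)]
```

Each level does O(N) work and there are log2(N) levels, which is where the O(N log N) cost comes from.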
Index Magic
• Do not use recursion.
– Use dynamic programming: iterate over the entire array, computing all values for each recursive depth together, like mergesort.
• Indexing is non-obvious.
– Unlike mergesort, the recursive step does not divide the array into contiguous chunks.
– At any iteration, what partition does a given index belong to, and where can one find the applicable values of the sub-partitions?
Index Magic
• Common solution: rearrange data by reversing the bits of indices.
– FFT can occur with contiguous partitions.
– Requires an extra data copy.
• Our solution: determine indexing in place.
$$A_i(n) = A_{i-1}\!\left(n + u\,\frac{N}{2^i}\right) + W_{2^i}^{u}\,A_{i-1}\!\left(n + u\,\frac{N}{2^i} + \frac{N}{2^i}\right)$$
$$u = n \;\mathrm{div}\; \frac{N}{2^i}$$
Note that the paper has a typo.
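One way to read the in-place indexing idea is as an iterative FFT that never reorders the data: at pass i, output slot n holds frequency u = n div (N/2^i) of the sub-transform with offset n mod (N/2^i), and its two inputs sit at n + u·N/2^i and n + u·N/2^i + N/2^i in the previous pass. The sketch below is a CPU illustration of that reading, not the authors' GPU code. It assumes source indices wrap modulo N, uses two buffers per pass, and applies the 1/N scaling at the end; `fft_indexed` is a hypothetical name.

```python
import cmath

def fft_indexed(f):
    """Iterative radix-2 FFT with per-pass index computation, no bit reversal."""
    N = len(f)
    stages = N.bit_length() - 1
    assert N >= 1 and (1 << stages) == N, "length must be a power of two"
    A = [complex(value) for value in f]        # A_0 is the input in natural order
    for i in range(1, stages + 1):
        span = N >> i                          # N / 2^i
        size = 1 << i                          # 2^i, the sub-transform length
        B = [0j] * N                           # separate output buffer for this pass
        for n in range(N):
            u = n // span                      # frequency index within the partition
            even = (n + u * span) % N          # assumed wrap-around modulo N
            odd = (n + u * span + span) % N
            W = cmath.exp(-2j * cmath.pi * u / size)   # W_{2^i}^u
            B[n] = A[even] + W * A[odd]
        A = B
    return [value / N for value in A]          # 1/N forward scaling
```

Because every output value is computed independently from two reads of the previous pass, each pass maps naturally onto one rendering pass that reads one buffer and writes another.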
Fourier Symmetry of Real Sequences
• In general, the frequency spectrum of even a real-valued function contains imaginary values.
– Captures magnitude and phase shift of sinusoids.
• A brute-force FFT, which treats the real input as complex, doubles computation and storage costs.
• But Fourier transforms of real functions have symmetry.
– $F(u) = F^*(N-u)$ for all $u$.
– The values $F(0)$ and $F(N/2)$ are real (because each is its own conjugate).
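The symmetry is easy to confirm numerically. The following sketch computes the DFT of a real sequence by direct summation and checks that each value equals the conjugate of its mirror. The helper names and the sample sequence are illustrative assumptions.

```python
import cmath

def dft(f):
    """Forward DFT by direct summation, scaled by 1/N."""
    N = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * u * x / N) for x in range(N)) / N
            for u in range(N)]

def is_conjugate_symmetric(F, tol=1e-9):
    """True when F(u) equals the conjugate of F(N - u) for every u (indices mod N)."""
    N = len(F)
    return all(abs(F[u] - F[(N - u) % N].conjugate()) < tol for u in range(N))

# Hypothetical real-valued input of even length.
real_signal = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
F = dft(real_signal)
symmetric = is_conjugate_symmetric(F)
```

Only the values for u = 0 through N/2 need to be stored. The rest follow by conjugation.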
Fourier Transform of Real Functions
• Pick two real functions, f(x) and g(x).
• Let h(x) = f(x) + j g(x).
– Note that there is no loss of information.
• The FFT of h takes half the time of performing brute-force FFTs of f and g individually.
– Simply point to one row of the image as the real components and another as the imaginary components.
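The packing trick needs a step that separates the two spectra afterwards. That step is not shown on this slide, so the sketch below fills it in with the standard identities that follow from conjugate symmetry: F(u) = (H(u) + H*(N−u))/2 and G(u) = (H(u) − H*(N−u))/(2j). It is an assumption that the authors untangle the result this way. `dft` and `dft_two_real` are illustrative names, and a direct-summation DFT stands in for the FFT.

```python
import cmath

def dft(h):
    """Forward DFT by direct summation, scaled by 1/N (stand-in for an FFT)."""
    N = len(h)
    return [sum(h[x] * cmath.exp(-2j * cmath.pi * u * x / N) for x in range(N)) / N
            for u in range(N)]

def dft_two_real(f, g):
    """Transform two real sequences of equal length with one complex transform."""
    N = len(f)
    H = dft([complex(a, b) for a, b in zip(f, g)])   # h(x) = f(x) + j g(x)
    F, G = [], []
    for u in range(N):
        H_mirror = H[(N - u) % N].conjugate()        # H*(N - u), wrapping at u = 0
        F.append((H[u] + H_mirror) / 2)
        G.append((H[u] - H_mirror) / 2j)
    return F, G

# Hypothetical pair of image rows.
row_a = [1.0, 2.0, 0.0, -1.0]
row_b = [0.5, -3.0, 2.0, 4.0]
F, G = dft_two_real(row_a, row_b)
```

The separation costs O(N) additions on top of the single transform, so the overall saving stays close to a factor of two.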