
CALIFORNIA STATE UNIVERSITY NORTHRIDGE

Audio/Image Processing in Frequency Domain Using 2D FFT

A graduate project submitted in partial fulfillment of the requirements

For the degree of Master of Science

in Electrical Engineering

By

Ameneh Mousavi

December 2014


The graduate project of Ameneh Mousavi is approved:

__________________________________ ______________________

Dr. Xiyi Hang Date

__________________________________ ______________________

Dr. Ramin Roosta Date

__________________________________ ______________________

Dr. Shahnam Mirzaei, Chair Date

California State University, Northridge


Acknowledgement

I would never have been able to finish my master’s project without the guidance of my advisor, help from my committee members, and support from my family and husband.

I would like to express my deepest gratitude to my advisor, Dr. Shahnam Mirzaei, for his guidance, caring, patience, motivation, and enthusiasm. I would like to thank Professor Roosta, who has always believed in me and supported me throughout my master’s studies. His advice, support, and friendship have been invaluable on both academic and personal levels. I would also like to thank Dr. Hang for his support as a member of my project committee. I really appreciate his time and consideration in helping me.

I would like to thank my parents, who have always supported and encouraged me with their love and dedication. I would never have been able to get here without their support.

Last but not least, I would like to thank my husband, Roozbeh, who was always there cheering me up and stood by me through the good times and the bad.


I dedicate this graduate project to

my family, and my beloved husband, Roozbeh

for their constant support and unconditional love.

I love you all dearly.


Table of Contents

Signature page
Acknowledgement
List of Figures
Abstract
Fourier Transform, Fast Fourier Transform and their applications
Introduction to Fourier Transform
Discrete Fourier Transform (DFT) and Fast Fourier transform (FFT)
Cooley-Tukey FFT Algorithm
FFT in Image Processing
FFT and Spectrogram
FPGA design process
Introduction to FPGA
FPGA vs. ASIC
FPGA Architecture
FPGA design process
Design entry
Test development
Behavioral simulation
Design synthesis
Place and route
Timing analysis
Post-synthesis simulation (timing simulation)
FPGA programming
Hardware debug and verification
Chipscope Xilinx test and debug tool
An introduction to chipscope
Chipscope structure
ILA core
ICON core
VIO core
How to connect the chipscope cores and setup the test system
Spectrogram implementation using Matlab and FPGA
Spectrogram system implementation using Matlab
FPGA based spectrogram system
FFT Core Implementation in FPGA
Audio processing using implemented spectrogram
Spectrogram Audio result analysis
Image processing using 2D FFT Matlab and FPGA
2D FFT implementation using Matlab
2D FFT implementation on FPGA
Conclusion
References
Appendix A
Appendix B


List of Figures

Figure 1. Fourier transform of cosine function which oscillates 3 cycles per second
Figure 2. Fourier Transform of a step function
Figure 3. Splitting N point DFT to two N/2 point DFTs
Figure 4. Cooley Tukey splitting for 8 point DFT
Figure 5. FFT of an image that has all frequencies
Figure 6. FFT of an image with Vertical wide stripes
Figure 7. FFT of an image with diagonal stripes
Figure 8. Spectrograms of a Wyle’s scream call
Figure 9. Spectrogram of a Wyle’s Moan Call
Figure 10. 3D surface spectrogram of a piece of music
Figure 11. FPGA architecture
Figure 12. Programmable Interconnect details
Figure 13. A basic CLB structure
Figure 14. Logic Cell structure
Figure 15. FPGA design process
Figure 16. Chipscope debug cores connection to core under the test
Figure 17. Matlab code written to compute FFT on a combined Sine wave
Figure 18. FFT computed using Matlab function and self-implemented one with no noise at input
Figure 19. FFT computed using Matlab function and self-implemented one with noise added to input signal
Figure 20. Matlab code modification to read from a text file to load input of FFT
Figure 21. Matlab code that generates the input file for Matlab and FPGA spectrogram system
Figure 22. Spectrogram of Blueatlx Whale sound
Figure 23. Spectrogram of BluePacx Whale sound
Figure 24. Spectrogram of Eaglet bird sound
Figure 25. Spectrogram of Falcon bird sound
Figure 26. Spectrogram of Mallard Duck quacking sound
Figure 27. Block diagram of the spectrogram system implemented in FPGA
Figure 28. Timing diagram of the control signals of the FFT core
Figure 29. FFT Timing input signals simulation result, generated by the VHDL code (part 1)
Figure 30. FFT Timing input signals simulation result, generated by the VHDL code (part 2)
Figure 31. FFT Timing input signals simulation result, generated by the VHDL code (part 3)
Figure 32. FFT Timing input signals simulation result, generated by the VHDL code (part 4)
Figure 33. Pipelined streaming IO FFT core implementation in Xilinx FPGA family
Figure 34. Block diagram of the system with chipscope cores connection
Figure 35. Chipscope spectrogram result for sin(50Hz) + sin(120Hz)
Figure 36. Matlab system’s spectrogram result for sin(50Hz) + sin(120Hz)
Figure 37. Chipscope spectrogram result for Eaglet Bird sound
Figure 38. Chipscope spectrogram result for Falcon Bird sound
Figure 39. Chipscope spectrogram result for Mallard Duck quacking
Figure 40. Chipscope spectrogram result for horned Owl sound
Figure 41. Matlab module to read, process, and save the result automatically (part 1)
Figure 42. Matlab module to read, process, and save the result automatically (part 2)
Figure 43. Spectrogram result for Solo Piano (sample 1)
Figure 44. Spectrogram result for Solo Piano (sample 2)
Figure 45. Spectrogram result for Solo Guitar (sample 1)
Figure 46. Spectrogram result for Solo Guitar (sample 2)
Figure 47. Spectrogram result for Solo Saxophone (sample 1)
Figure 48. Spectrogram result for Solo Saxophone (sample 2)
Figure 49. Spectrogram result for Solo Violin (sample 1)
Figure 50. Spectrogram result for Solo Violin (sample 2)
Figure 51. Spectrogram result for Solo Drum (sample 1)
Figure 52. Spectrogram result for Solo Drum (sample 2)
Figure 53. Spectrogram result for Solo Flute (sample 1)
Figure 54. Spectrogram result for Solo Flute (sample 2)
Figure 55. Spectrogram result for Heavy Metal music (sample 1)
Figure 56. Spectrogram result for Heavy Metal music (sample 2)
Figure 57. Spectrogram result for RAP music (sample 1)
Figure 58. Spectrogram result for RAP music (sample 2)
Figure 59. Spectrogram result for Country Music (sample 1)
Figure 60. Spectrogram result for Country Music (sample 2)
Figure 61. Spectrogram result for ROCK music (sample 1)
Figure 62. Spectrogram result for ROCK music (sample 2)
Figure 63. Spectrogram result for JAZZ music (sample 1)
Figure 64. Spectrogram result for JAZZ music (sample 2)
Figure 65. Spectrogram result for Techno music (sample 1)
Figure 66. Spectrogram result for Techno music (sample 2)
Figure 67. Spectrogram result for classical music from Beethoven (sample 1)
Figure 68. Spectrogram result for classical music from Beethoven (sample 2)
Figure 69. Spectrogram result for classical music from Tchaikovsky (sample 1)
Figure 70. Spectrogram result for classical music from Tchaikovsky (sample 2)
Figure 71. Spectrogram result for classical music from Mozart (sample 1)
Figure 72. Spectrogram result for classical music from Mozart (sample 2)
Figure 73. Spectrogram result for classical music from Vivaldi (sample 1)
Figure 74. Spectrogram result for classical music from Vivaldi (sample 2)
Figure 75. Spectrogram result for classical music from Bach (sample 1)
Figure 76. Spectrogram result for classical music from Bach (sample 2)
Figure 77. Spectrogram result for classical music from Schubert (sample 1)
Figure 78. Spectrogram result for classical music from Schubert (sample 2)
Figure 79. Two different plays of Adoration of the earth
Figure 80. Two different plays of Kiss of the earth
Figure 81. Matlab module for 2D FFT on image (part 1)
Figure 82. Matlab module for 2D FFT on image (part 2)
Figure 83. Image and its 2D FFT result in Matlab
Figure 84. Image and its 2D FFT result in Matlab
Figure 85. Image and its 2D FFT result in Matlab
Figure 86. 2D FFT Matlab code (part 1)
Figure 87. 2D FFT Matlab code (part 2)
Figure 88. FFT implementation using VHDL (part 1)
Figure 89. 2D FFT implementation using VHDL (part 2)




Abstract

Audio/Image Processing in Frequency Domain Using 2D FFT

By

Ameneh Mousavi

Master of Science in Electrical Engineering

The fast Fourier transform (FFT) and similar frequency transforms have many applications in today’s technology. The FFT is a mathematical method for converting a signal from the time domain to the frequency domain. Frequency-domain transforms like the FFT are widely used in image processing and enhancement techniques. Medical devices such as MRI (magnetic resonance imaging) and CT (computerized tomography) scanners use FFT-based image processing to process images of the patient’s body. The FFT is also used in audio and speech processing. The objective of this project is to use the FFT to process audio signals and create their spectrograms in order to differentiate between different music styles and instruments.

The work includes processing different animal voices to find their frequency content and the differences between their voices in different situations. Another part is to process different music styles and instruments using the spectrogram, to see whether it can be used to distinguish between different instruments in a performance, music styles, or musicians without listening to the music itself.

The other area of concentration is using an FPGA (field-programmable gate array) to implement the spectrogram and adding Chipscope IP (intellectual property) cores to the hardware, in order to test and debug the implemented spectrogram device and to show the results on a computer screen. FPGAs were introduced mostly for testing and debugging, but nowadays, because of their short time to market and ease of use, they are frequently used to design digital systems in many applications. The hardware modules were designed in VHDL (VHSIC, Very High Speed Integrated Circuit, Hardware Description Language) and implemented on the Atlys™ Spartan-6 Xilinx FPGA evaluation board.


Fourier Transform, Fast Fourier Transform and their applications

Introduction to Fourier Transform

The Fourier transform was introduced by Joseph Fourier, building on his well-known Fourier series, to transform a signal from the time domain to the frequency domain. The inverse Fourier transform performs the transform in the reverse direction, from the frequency domain back to the time domain. Using this method we can find out which frequency components exist in the input signal [1].

As mentioned, the origin of this essential method is the Fourier series, which rewrites a complicated signal as a sum of sine and cosine components. The formula used to transform an analog signal x(t) is

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$$

The complex exponential is the main part of the transform and comes from Euler’s formula for the sine and cosine functions. Based on this formula:

$$e^{j\theta} = \cos\theta + j\sin\theta$$

Complex exponentials are periodic and form a complete set, so the Fourier transform can represent a continuous function with little error compared to the original. This property has led this transform to be one of the most useful and widely used transforms [2].
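As a small numerical illustration (not part of the original project code), the Python sketch below checks Euler's formula and shows that a cosine can be written as the average of two complex exponentials, one at +f and one at -f. This is one way to see why the transform of a pure cosine contains two peaks. The 3 Hz frequency follows the cosine of Figure 1; the function names and sample times are assumptions chosen for the example.

```python
import cmath
import math

def euler_rhs(theta):
    """Right-hand side of Euler's formula: cos(theta) + j*sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))

def cosine_from_exponentials(f, t):
    """cos(2*pi*f*t) written as the average of two complex exponentials."""
    w = 2 * math.pi * f * t
    return 0.5 * (cmath.exp(1j * w) + cmath.exp(-1j * w))

f = 3.0  # cycles per second, as in Figure 1
max_err = 0.0
for t in [0.0, 0.05, 0.1, 0.25, 0.4]:  # arbitrary sample times (assumption)
    theta = 2 * math.pi * f * t
    err_euler = abs(cmath.exp(1j * theta) - euler_rhs(theta))
    err_cos = abs(cosine_from_exponentials(f, t) - math.cos(theta))
    max_err = max(max_err, err_euler, err_cos)

# Both identities hold up to floating-point rounding.
print("largest deviation:", max_err)
```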

In addition to indicating which frequencies exist in a signal, the Fourier transform also tells us how much of each frequency component is present [4]. A plain-English metaphor answers some questions about the concept of the Fourier transform [3]:

- What does the Fourier transform do? Given a smoothie, it finds the recipe.

- How? Run the smoothie through filters to extract each ingredient.

- Why? Recipes are easier to analyze, compare, and modify than the smoothie itself.

- How do we get the smoothie back? Blend the ingredients.

In short, the Fourier transform finds the frequencies present in any function. Figure 1 [2] and Figure 2 [4] show two examples of how the Fourier transform works.

Figure 1. Fourier transform of cosine function which oscillates 3 cycles per second

Figure 2. Fourier Transform of a step function


Discrete Fourier Transform (DFT) and Fast Fourier transform (FFT)

The Fourier transform applies to continuous (analog) signals, so to use this tool in the digital world we need a discretized version of it. The DFT is that discrete version: it converts a finite sequence of signal samples into their frequency components. The input and output samples are in general both complex numbers. The DFT is one of the most important transforms in digital signal processing. The goal of a discrete transform was to perform the Fourier transform on computer data, and working with a finite number of samples makes this possible. The frequency resolution of the transform depends on the number of input samples, so with more samples we can resolve more frequency components of the input signal. The DFT of a sequence of N complex numbers x_0, x_1, …, x_{N-1} is defined as [5]:

$$X_k = \sum_{n=0}^{N-1} x_n\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1$$

Efficiency is an important factor in computer processing systems. Evaluating the formula above directly is not efficient, so a better algorithm was needed to compute the DFT of discrete data. The FFT is an algorithm that computes the DFT quickly and efficiently. It does so by factorizing the DFT matrix into a product of sparse factors, and as a result the FFT is the main transform algorithm in many applications. Direct evaluation of the DFT has complexity O(N^2), while the FFT reduces this to O(N log N), which makes a fundamental difference to digital processing speed. There are several FFT algorithms; the most common one, which is also used in this project, is the Cooley-Tukey algorithm. Other FFT algorithms include the prime-factor, Bruun’s, Rader’s and Bluestein’s algorithms [6].
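To make the O(N^2) cost of the direct DFT concrete, here is a minimal Python sketch that evaluates the DFT sum term by term and counts the complex multiplications it uses. It is an illustration only, not the Matlab code of this project; the function name `dft_direct` and the 8-sample cosine test signal are assumptions chosen for the example.

```python
import cmath
import math

def dft_direct(x):
    """Evaluate X_k = sum_n x_n * exp(-j*2*pi*k*n/N) directly.

    Returns the spectrum and the number of complex multiplications used.
    """
    N = len(x)
    X = []
    mults = 0
    for k in range(N):
        acc = 0j
        for n in range(N):
            acc += x[n] * cmath.exp(-2j * math.pi * k * n / N)
            mults += 1
        X.append(acc)
    return X, mults

# One cycle of a cosine over 8 samples: the energy appears in bins 1 and N - 1.
N = 8
x = [math.cos(2 * math.pi * n / N) for n in range(N)]
X, mults = dft_direct(x)
magnitudes = [round(abs(v), 6) for v in X]

print(magnitudes)                                # peaks of N/2 at k = 1 and k = N - 1
print(mults, "multiplications for N =", N)       # N * N
print("N * log2(N) =", int(N * math.log2(N)))    # order of the FFT cost, for comparison
```

For N = 8 the direct sum needs 64 multiplications against an N log2 N figure of 24, and the gap grows quickly with N.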


Cooley-Tukey FFT Algorithm

Cooley-Tukey is an algorithm that computes the DFT efficiently by reducing its complexity. The idea goes back to Gauss, but it was not widely recognized at the time. In 1965 Cooley and Tukey published a paper on the algorithm and explained how to perform it on a computer. Because digital computers were becoming widespread and there was a need to compute the DFT quickly, the algorithm gained recognition. It uses the butterfly method to compute the FFT [7][8][9][10].

The idea comes from splitting an N-point DFT into two N/2-point DFTs, one over the odd-indexed samples and the other over the even-indexed samples. The expressions below show how this split reduces the complexity of the computation [7][8][9][10].

$$X_k = \sum_{n=0}^{N-1} x_n\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1$$

This is the traditional DFT formula. For each k it needs N complex multiplications and N-1 complex additions, so computing all N outputs has O(N^2) complexity. With the FFT we get O(N log N) complexity, which is a big difference for large N. To start, we define $W_N = e^{-j 2\pi / N}$, split the N points into two groups of N/2 points, and calculate the DFT of each one. So we have [7][8][9][10]:

$$X_k = \sum_{n=0}^{N-1} x_n W_N^{kn} = \sum_{n\ \text{even}} x_n W_N^{kn} + \sum_{n\ \text{odd}} x_n W_N^{kn}$$

Then we replace the even indices with n = 2r and the odd ones with n = 2r + 1, r = 0, 1, …, N/2 - 1, and we have:

$$X_k = \sum_{r=0}^{N/2-1} x_{2r} W_N^{2rk} + \sum_{r=0}^{N/2-1} x_{2r+1} W_N^{(2r+1)k}$$

We factor out the terms of W that do not depend on r:

$$X_k = \sum_{r=0}^{N/2-1} x_{2r} \left(W_N^{2}\right)^{rk} + W_N^{k} \sum_{r=0}^{N/2-1} x_{2r+1} \left(W_N^{2}\right)^{rk}$$

From the properties of W we know that $W_N^{2} = e^{-j 2\pi \cdot 2 / N} = e^{-j 2\pi / (N/2)} = W_{N/2}$. Using that, we have:

$$X_k = \sum_{r=0}^{N/2-1} x_{2r} W_{N/2}^{rk} + W_N^{k} \sum_{r=0}^{N/2-1} x_{2r+1} W_{N/2}^{rk}$$

So $X_k = G_k + W_N^{k} H_k$, where $G_k$ and $H_k$ are the N/2-point DFTs of the even-indexed and odd-indexed samples. Each N/2-point DFT costs on the order of (N/2)^2 operations and combining them costs N more, so the complexity is O(N^2/2 + N). Figure 3 shows how to split the N samples into two groups and then combine them to generate the whole N-point DFT [7][8][9][10].
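The even/odd split described above can be applied recursively. The following Python sketch is a minimal illustration of that idea, assuming the input length is a power of two. Here G and H denote the N/2-point DFTs of the even-indexed and odd-indexed samples. The function names and the 8-point test vector are assumptions for the example, and a direct O(N^2) DFT is included only to check the result.

```python
import cmath
import math

def fft_radix2(x):
    """Recursive decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2 != 0:
        raise ValueError("length must be a power of two")
    G = fft_radix2(x[0::2])  # N/2-point DFT of the even-indexed samples
    H = fft_radix2(x[1::2])  # N/2-point DFT of the odd-indexed samples
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * math.pi * k / N) * H[k]  # W_N^k * H_k
        X[k] = G[k] + t            # X_k = G_k + W_N^k * H_k
        X[k + N // 2] = G[k] - t   # uses W_N^(k + N/2) = -W_N^k
    return X

def dft_direct(x):
    """Reference O(N^2) DFT, used only to check the FFT output."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]  # arbitrary 8-point input
fast = fft_radix2(x)
slow = dft_direct(x)
max_diff = max(abs(a - b) for a, b in zip(fast, slow))
print("max difference between FFT and direct DFT:", max_diff)
```

The second output line of each butterfly reuses the same product W_N^k * H_k with the opposite sign, which is where the saving over the direct sum comes from.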


Figure 3. Splitting N point DFT to two N/2 point DFTs

To get the most efficient method we keep splitting the sample points until we reach 2-point DFTs; for N a power of two this gives log2 N butterfly stages in total. Figure 4 shows the Cooley-Tukey splitting for an 8-point input. After the splitting, the input samples appear in bit-reversed order: if we write the index of an input sample in binary and reverse the bits, we get the position of that sample. For the 8-point case the indices have three bits, so index 4 (100) becomes 001, which is position 1, and index 6 (110) becomes 011, which is position 3 [7][8][9][10].
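The bit-reversed input ordering can be generated with a few lines of Python. This is a small sketch for illustration, assuming the number of points is a power of two; the helper names `bit_reverse` and `bit_reversed_order` are hypothetical and do not come from the project code.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result

def bit_reversed_order(n_points):
    """Input ordering for an n_points radix-2 FFT (n_points a power of two)."""
    bits = n_points.bit_length() - 1  # number of index bits, e.g. 3 for 8 points
    return [bit_reverse(i, bits) for i in range(n_points)]

order = bit_reversed_order(8)
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

For an 8-point transform the indices have three bits, so the samples are fed to the first butterfly stage in the order 0, 4, 2, 6, 1, 5, 3, 7.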


Figure 4. Cooley Tukey splitting for 8 point DFT

Fast Fourier Transform Applications

The FFT is a very useful algorithm with a wide range of uses in engineering, mathematics and science. Its applications include integer multiplication (where it is more efficient than the “left-shifting-and-adding” or “Russian Peasant” methods), signal processing (such as capturing a human voice near a microphone from air-pressure variations, or tracing the pattern of the stars at night), image processing (medical imaging devices such as MRI and CT scanners), and filtering (because it is fast and efficient, it plays a major role in most filtering processes and in complex matrix multiplication) [11][12]. All in all, whenever we need a fast, efficient method of processing a large amount of data, the FFT is often the solution.


FFT in Image Processing

As mentioned above, image processing is one of the most important application areas of the FFT. It is used in image analysis, image filtering, image reconstruction, image recognition (finding particular objects in a picture), image enhancement (improving or changing the image) and image compression (reducing the size of a picture so that it can be transmitted faster or stored in less space). The FFT result displayed for an image shows the frequencies whose magnitude is at least 5% of the main peak. Figures 5, 6 and 7 show some pictures and their FFT results. In these pictures, the frequencies whose magnitude is less than 1/100 of the DC value do not have a significant effect on the image [13][14].

The image in Figure 5 contains almost all frequencies, but the magnitude of each one is much smaller than the DC value, so the displayed result is almost entirely black. Figure 6 shows the Fourier transform of an image with vertical stripes 2 pixels wide. The result contains the DC value and two points corresponding to the frequency of the stripes in the original image. These two points lie on the horizontal line through the center because the intensity in the spatial domain changes only in the horizontal direction in this picture. Figure 7 shows the Fourier transform of an image with diagonal stripes. In this figure there are also two frequency points and the DC value [13][14].
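The vertical-stripe case can be reproduced on a tiny synthetic image. The Python sketch below is an illustration under stated assumptions: an 8x8 image with stripes two pixels wide, and a direct evaluation of the 2D DFT definition instead of an FFT. The name `dft2_direct` is hypothetical. It lists the frequency bins with non-negligible magnitude.

```python
import cmath
import math

def dft2_direct(img):
    """Direct 2D DFT of a small image given as a list of rows."""
    rows, cols = len(img), len(img[0])
    out = [[0j] * cols for _ in range(rows)]
    for v in range(rows):          # vertical frequency index
        for u in range(cols):      # horizontal frequency index
            acc = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * math.pi * (u * x / cols + v * y / rows)
                    acc += img[y][x] * cmath.exp(1j * angle)
            out[v][u] = acc
    return out

# 8x8 test image with vertical stripes two pixels wide (assumed example).
row = [1, 1, 0, 0, 1, 1, 0, 0]
image = [row[:] for _ in range(8)]

spectrum = dft2_direct(image)
peaks = sorted((v, u) for v in range(8) for u in range(8)
               if abs(spectrum[v][u]) > 1e-9)
print(peaks)  # [(0, 0), (0, 2), (0, 6)]: DC plus two points on the v = 0 line
```

Only the DC bin and two bins with zero vertical frequency are non-zero. Bins u = 2 and u = 6 are the positive and negative versions of the same horizontal frequency, so once the DC term is shifted to the center they appear as two points on the horizontal line through it.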


Figure 5. FFT of an image that has all frequencies

Figure 6. FFT of an image with Vertical wide stripes

Figure 7. FFT of an image with diagonal stripes


Images are processed with a two-dimensional FFT. One way to compute it is to build the two-dimensional transform out of one-dimensional FFTs, and this is the method used in this project. The picture pixels are stored in a 2D matrix; a one-dimensional FFT is applied to every row, with the result overwriting that row, and then the FFT is applied to every column. The resulting matrix is the 2D FFT of the original image.
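The row-then-column procedure can be sketched in Python as follows. This is an illustrative sketch, not the Matlab or VHDL code of this project: a direct 1D DFT stands in for the 1D FFT to keep the example short, and the function names and the 4x4 pixel values are assumptions. The result is compared against the 2D DFT computed straight from its definition.

```python
import cmath
import math

def dft_1d(x):
    """1D DFT of a sequence (a direct DFT stands in for the FFT here)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft2_row_column(img):
    """2D transform: 1D transform of every row, then of every column."""
    rows, cols = len(img), len(img[0])
    data = [dft_1d(r) for r in img]                  # transform and replace each row
    for c in range(cols):                            # then transform each column
        column = dft_1d([data[r][c] for r in range(rows)])
        for r in range(rows):
            data[r][c] = column[r]
    return data

def dft2_definition(img):
    """2D DFT straight from its definition, used only as a check."""
    rows, cols = len(img), len(img[0])
    return [[sum(img[y][x] * cmath.exp(-2j * math.pi * (u * x / cols + v * y / rows))
                 for y in range(rows) for x in range(cols))
             for u in range(cols)]
            for v in range(rows)]

image = [[1, 2, 3, 4],
         [0, 1, 0, 1],
         [5, 5, 5, 5],
         [2, 0, 2, 0]]  # arbitrary 4x4 "pixel" values (assumption)

a = fft2_row_column(image)
b = dft2_definition(image)
max_diff = max(abs(a[v][u] - b[v][u]) for v in range(4) for u in range(4))
print("max difference:", max_diff)
```

The two results agree up to rounding because the 2D DFT kernel separates into a product of a horizontal and a vertical exponential, which is what allows the rows and columns to be transformed one after the other.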

FFT and Spectrogram

A spectrogram is a representation of the spectrum of frequencies of a signal that varies in time or in some other variable. It is used to analyze audio signals such as animal calls, music, and human speech. The X-axis of the spectrogram graph shows the frequencies and the Y-axis shows the amplitude of each frequency [19]. Using a spectrogram in speech processing, we can detect every phoneme by its own unique frequency. The phonemes also combine in a detectable way to create vowels and words. This makes the spectrogram of human speech very useful, and it is widely used in the study of phonetics and in speech synthesis [15].

Animal calls such as the Wyle’s have a characteristic frequency that can be detected using a spectrogram. Spectrograms can also be used to recognize, identify and interpret bird calls. They are used in correcting speech defects and in speech training for people who are deaf, as well as in speech filtering and in the development of RF (radio frequency) and microwave systems [16]. Figure 8 shows the spectrogram of a Wyle’s scream call, Figure 9 shows a Wyle’s moan call, and Figure 10 shows a 3D surface spectrogram of part of a piece of music. As you can see, each scream starts high and drops fast, while a moan lasts longer than a scream [17].

Figure 8. Spectrograms of a Wyle’s scream call

Figure 9. Spectrogram of a Wyle’s Moan Call


Figure 10. 3D surface spectrogram of a piece of music

One way to implement a spectrogram is to use the FFT to obtain the frequency components and then calculate the amplitude of the FFT result. This amplitude is the spectrogram result, which shows how much of each frequency is present in the input signal. This FFT-based method is the one used to implement the spectrogram in this project.
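A minimal Python sketch of this approach is shown below: the signal is cut into frames, each frame is transformed, and the magnitude of the result is kept. It is a sketch under stated assumptions, not the project's Matlab or FPGA implementation. A direct DFT stands in for the FFT, and the 1000 Hz sampling rate, the 100-sample frame length and the function names are all assumptions. The test signal sin(50 Hz) + sin(120 Hz) is the one used in the project's results.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitude of the DFT of one frame (first half of the bins only)."""
    N = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N)))
            for k in range(N // 2)]

def spectrogram(signal, frame_len):
    """Split the signal into consecutive frames and transform each one."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return [magnitude_spectrum(f) for f in frames]

fs = 1000          # sampling rate in Hz (assumption)
frame_len = 100    # samples per frame, giving fs / frame_len = 10 Hz per bin
signal = [math.sin(2 * math.pi * 50 * n / fs) + math.sin(2 * math.pi * 120 * n / fs)
          for n in range(300)]

spec = spectrogram(signal, frame_len)
first = spec[0]
# Indices of the two largest magnitudes in the first frame, in ascending order.
strongest = sorted(sorted(range(len(first)), key=lambda k: first[k])[-2:])
print([k * fs // frame_len for k in strongest])  # [50, 120]
```

With these parameters each bin is 10 Hz wide, so the two sine components land exactly on bins 5 and 12 and no window function is needed; for signals that do not fit a whole number of cycles into a frame, a window is commonly applied before the transform.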


FPGA design process

Introduction to FPGA

The first kinds of programmable logic devices were PROMs (programmable read-only memories) and PLDs (programmable logic devices). Both could be programmed at the factory. Over time, companies such as Altera and Xilinx started working on programmable devices and introduced CPLDs (complex programmable logic devices) and FPGAs to the electronics market. The CPLD was the generation before the FPGA, and its complexity lies somewhere between that of PALs (programmable array logic) and FPGAs. FPGAs belong to the class of integrated circuits that are designed to be configured by customers to build their desired systems. The original purpose of programmable gate arrays was to test and debug ASICs before manufacturing them, but because of their features and capabilities they are now used on their own to implement custom digital systems [18].

One of the most important features of the FPGA, which has led to its wide use by electronics development companies, is that it is reprogrammable. Designers can design, test and debug their final product on an FPGA several times, and even after manufacturing they can add new features and modify the design just by reprogramming the FPGA used in the product. The other essential property of the FPGA is its fast time to market. In today’s electronics world, where thousands of new products are introduced to the market every day, the ability to build your product as early as possible is vital.

In addition to large logic arrays, block RAMs and I/O pins, new FPGA devices contain soft-core or hard-core processing systems. Some of them even include a hard analog-to-digital converter, which allows them to implement mixed-signal systems. For instance, Altera’s SoC FPGA devices contain a hard dual-core ARM Cortex-A9 processor, and its Cyclone II family supports the Nios II soft-core CPU. The Xilinx Zynq-7000 family also has an ARM-based processing system.

FPGA vs. ASIC

ASICs (Application-Specific Integrated Circuits) are customized ICs (Integrated Circuits) designed for a very specific use rather than for general-purpose systems. Modern ASICs include microprocessors, RAMs (Random Access Memories), ROMs (Read-Only Memories), flash memories, EEPROM (Electrically Erasable Programmable Read-Only Memory), and other large blocks. An ASIC is mostly used for a very large, specific system with a lot of logic that must consume little power and run faster than an FPGA implementation allows. ASICs are designed to be fully optimized in terms of logic, gates, power, and area [19] [20].

FPGAs and ASICs both implement complex designs at a high level of performance, and both use HDLs (Hardware Description Languages) such as VHDL and Verilog to describe the logic. Each has advantages and disadvantages compared with the other, and designers should choose between them based on their needs.

An FPGA design has no layout, masks, or other manufacturing steps, so it has a faster time to market. There is no need to worry about NRE (Non-Recurring Engineering) cost, which is the cost of development and also of manufacturing. The design cycle is simpler because the software handles most of the routing, timing, and placement. These are the steps that take most of the design and development time, and the FPGA design process largely eliminates them. Reprogrammability is another feature of FPGA design: a new bitstream can be uploaded immediately at no extra time or cost, while the same change in an ASIC design can cost $50,000 or more and take about a month. Reusability is an essential advantage as well: a prototype can be built on an FPGA and, if there is a bug, reprogrammed and retested. FPGAs are well suited to small-volume designs. On the other hand, their power consumption is higher than that of an ASIC, and the design is limited to the resources available inside the FPGA [21][22][23].

An ASIC design has a lower unit cost in very high volume production. If the volume is very high, for example a design with a logic density of more than 250 K, an ASIC costs less than developing with an FPGA. Using ASICs gives full custom-design capability. For designs in which low power and high speed are crucial, an ASIC can be a good choice. The amount of logic is not limited in the same way, so the design can be as large as needed, and the design flexibility allows more speed optimization. Another feature of ASIC design is the ability to implement analog and mixed-signal designs inside the chip, although FPGAs are also moving toward mixed-signal capability [24].
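The cost trade-off between the two technologies can be illustrated with a small break-even calculation. The sketch below is written in Python purely for illustration. It assumes a simple model in which total cost = NRE + unit cost × volume; the function name and every cost figure are hypothetical and are not taken from the cited sources.

```python
import math


def break_even_volume(nre_asic, unit_asic, unit_fpga, nre_fpga=0.0):
    """Smallest volume at which the ASIC total cost is no higher than the FPGA one.

    Assumed model: total cost = NRE + unit_cost * volume for each technology.
    """
    if unit_fpga <= unit_asic:
        raise ValueError("the ASIC only pays off if its unit cost is lower")
    extra_nre = nre_asic - nre_fpga
    return max(0, math.ceil(extra_nre / (unit_fpga - unit_asic)))


# Hypothetical figures, for illustration only.
volume = break_even_volume(nre_asic=1_000_000, unit_asic=5.0, unit_fpga=25.0)
print(volume)  # 50000
```

Below this volume the FPGA is cheaper in this model; above it, the lower unit cost of the ASIC outweighs its NRE.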

FPGA Architecture

FPGAs are programmable logic devices whose CLBs (Configurable Logic Blocks) build the designer's desired system through programmable interconnects. Some OTP (One Time Programmable) FPGAs are available, but most are SRAM (Static Random Access Memory) based and can be programmed many times during design development. This programmability lets engineers change their design easily during the design process, and modifications and new features remain possible even after the product has been manufactured, without the extra cost that such a change would have in an ASIC design. The architecture of an FPGA consists of CLBs, programmable interconnects, I/O blocks, and block RAMs. The CLB is one of the basic units of an FPGA, and each FPGA has a certain number of CLBs depending on its size [25][26][27].

A basic CLB has configurable switches and some logic cells. Each logic cell has some LUTs (look-up tables), shift registers, multiplexers, and flip-flops, which together with the configurable switches can create a combinational or sequential design. Programmable interconnects provide the routing between CLBs, between CLBs and I/O blocks, and for the clocks inside the system. I/O blocks connect the outside of the FPGA to the inside. There are different I/O banks throughout the FPGA, and each supports certain I/O standards. The number of I/Os in new FPGAs has increased considerably. Block RAMs are used to generate memory elements inside the design, so on-chip memory is available to the design. Figures 11 to 14 illustrate the FPGA architecture and some details of the interconnects, CLBs, and logic cells [25][26][27].
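How a LUT implements an arbitrary logic function can be sketched as a small truth-table model. The Python code below is only an illustration under stated assumptions: it models a generic 4-input LUT whose configuration is a 16-bit value, and the helper names make_lut and lut_read are hypothetical, not part of any vendor tool.

```python
def make_lut(func, n_inputs=4):
    """Build the configuration bits of an n-input LUT from a Boolean function.

    Bit number `address` of the result holds the output for that input
    combination, with input 0 as the least significant address bit.
    """
    init = 0
    for address in range(2 ** n_inputs):
        bits = [(address >> i) & 1 for i in range(n_inputs)]
        if func(*bits):
            init |= 1 << address
    return init


def lut_read(init, *inputs):
    """Look up the output of a configured LUT for the given input bits."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> address) & 1


and4 = make_lut(lambda a, b, c, d: a & b & c & d)
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
print(hex(and4), hex(xor4))  # 0x8000 0x6996
print(lut_read(xor4, 1, 0, 0, 0))  # 1
```

The same storage element implements either function; only the 16 configuration bits differ, which is what loading a bit file changes.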

Figure 11. FPGA architecture


Figure 12. Programmable Interconnect details

Figure 13. A basic CLB structure


Figure 14. Logic Cell structure

Using all the interconnects, CLBs, switches, and block RAMs, the logic and routing inside the FPGA can be as simple as a counter or as complicated as a processor. Whenever a bit file is loaded into the FPGA, each CLB implements a particular logic function using its LUTs, and the interconnect switches then connect the CLBs together to make the whole system. FPGAs use different programming technologies; SRAM, anti-fuse, and EPROM are some of those currently in use. In the SRAM model, the switches are controlled by SRAM bits, and the FPGA loses its configuration when the system is powered off. In the anti-fuse model, programming an anti-fuse creates a low-resistance path, and the advantage of the anti-fuse is its small size. Finally, with EPROM programming, the switch is a floating gate that can be turned off by injecting charge into it. An important feature of EPROM is that the FPGA does not lose its programmed data after power-off, so it still holds its configuration when the system is turned on again [25][27].

FPGA design process

The process of implementing a design on an FPGA can be divided into several steps: design entry, test development, behavioral simulation, design synthesis, functional simulation, place and route, timing analysis, post-synthesis simulation (timing simulation), FPGA programming, and on-hardware debug and verification. Figure 15 shows the process diagram [28].

Figure 15. FPGA design process

[Flowchart blocks: Design Entry, Test Development, Behavioral Simulation, Design Synthesis, Functional Simulation, Place, Route, Implementation, Timing Analysis, Timing Simulation, FPGA Programming, Hardware Debug, Done. The decision points are labeled Yes/No.]

Design entry

The first step in FPGA design is design entry. In this step, designers convert the design ideas into a state machine, HDL code, or a schematic. The designer chooses one of these methods based on the needs of the design. Hardware description languages such as VHDL and Verilog are the most commonly used. They can describe anything from a very simple system to a very complex one. Because they describe hardware, they are inherently parallel and are not like regular software programs. HDLs are one of the best methods of design entry because they make it easy to port a design to other environments, whereas schematic entry is much less flexible and makes it hard to port the design to other platforms [29][30].

Test development

After the design entry is ready and the design exists as HDL code or a schematic, it is time to test whether it works as intended. To verify the functionality of the design, we need test cases that check and cover the different parts of the system. The test cases should be comprehensive, so that most of the system's problems are found and resolved before the hardware test. The TCL (Tool Command Language) scripting language or similar tools can be used to write test cases for the design.
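The idea of covering different parts of a design with separate test cases can be sketched as follows. The example is written in Python purely for illustration, rather than in TCL or an HDL, and the 4-bit counter it exercises is a hypothetical design that does not appear in this project. Each test case pairs a stimulus (reset, enable per clock cycle) with the expected output sequence.

```python
class Counter4:
    """Cycle-level behavioral model of a 4-bit counter (no gate delays)."""

    def __init__(self):
        self.count = 0

    def clock(self, reset=0, enable=1):
        """Advance one clock cycle and return the new count."""
        if reset:
            self.count = 0
        elif enable:
            self.count = (self.count + 1) % 16
        return self.count


def run_test(stimulus):
    """Apply (reset, enable) pairs cycle by cycle and collect the outputs."""
    dut = Counter4()
    return [dut.clock(reset, enable) for reset, enable in stimulus]


# Each case targets a different behavior: counting, holding, wrapping.
TEST_CASES = {
    "counts up after reset": ([(1, 0), (0, 1), (0, 1), (0, 1)], [0, 1, 2, 3]),
    "holds when disabled": ([(1, 0), (0, 1), (0, 0), (0, 0)], [0, 1, 1, 1]),
    "wraps after 15": ([(1, 0)] + [(0, 1)] * 16, list(range(16)) + [0]),
}

results = {name: run_test(stimulus) == expected
           for name, (stimulus, expected) in TEST_CASES.items()}
print(results)
```

Keeping stimulus and expected output together makes each case self-checking, so a failing case points directly at the behavior that broke.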


Behavioral simulation

When the test cases are ready, we can use simulation tools such as ModelSim, Aldec Riviera, and similar tools to simulate the behavior of the design and check that it works properly. At this stage we perform RTL (Register Transfer Level) simulation; there are several levels of simulation in the FPGA design flow, and this is the first one. Behavioral simulation is a high-level simulation and does not take any actual gate delays into account. With behavioral simulation we can find as many bugs as possible, going back to the code and modifying it until the simulation passes all test cases. When we are confident that the design works, we continue to the synthesis step [29].

Design synthesis

Synthesis is one of the main steps in FPGA design. In this step, the synthesis tool converts our high-level behavioral HDL code into a netlist of real logic primitives offered by the vendor tool. The synthesis tool uses different optimization methods to make the netlist as efficient as possible [80]. The HDL code may contain unsynthesizable constructs that cause the tool to report errors, in which case the designer has to go back to the code and modify it until synthesis completes successfully.

Place and route

After synthesis is done, we use a tool to do the place and route. Most current tools combine the synthesizer and the place-and-route tool in the same software. In this stage, the tool takes the netlist generated by the synthesis tool together with a constraint file and tries to fit the design into the target FPGA device while meeting the constraints. It interconnects all the primitives so that the timing requirements are met. The most important constraints are speed and delay. If the tool cannot reach the required performance, it reports constraint violations, and the designers need to go back to the design and make it more efficient so that it passes place and route without errors [29][30].

Timing analysis

The timing analysis tool checks the design after place and route to verify that all timing requirements are met. It reports any paths that do not meet the timing constraints, and we then have to go back to the synthesizer, or even to the code, and modify the design until it passes timing analysis.
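The check that a timing analyzer performs on each register-to-register path can be summarized by the usual setup-slack relation: slack = clock period - (clock-to-output delay + logic delay + routing delay + setup time). The Python sketch below only illustrates that arithmetic; the function name and all delay values are hypothetical and do not come from this project's timing reports.

```python
def setup_slack_ns(clock_period_ns, clk_to_q_ns, logic_delay_ns,
                   routing_delay_ns, setup_ns):
    """Setup slack of one path; a negative value means a timing violation."""
    data_arrival = clk_to_q_ns + logic_delay_ns + routing_delay_ns
    return clock_period_ns - (data_arrival + setup_ns)


# Hypothetical path on a 100 MHz clock (10 ns period).
passing = setup_slack_ns(10.0, clk_to_q_ns=0.5, logic_delay_ns=4.0,
                         routing_delay_ns=3.0, setup_ns=0.5)
failing = setup_slack_ns(10.0, clk_to_q_ns=0.5, logic_delay_ns=7.0,
                         routing_delay_ns=3.0, setup_ns=0.5)
print(passing, failing)  # 2.0 -1.0
```

A path with negative slack is the kind of result that sends the designer back to the constraints or to the code.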

Post-synthesis simulation (timing simulation)

After timing analysis is done, we have a netlist that consists of primitives with their real timing specifications, along with all the delays and paths in the system. To verify that the timing does not cause any functional issue, we use this netlist to run a post-synthesis simulation (timing simulation) that takes all of the timing into account. If the simulation does not pass, the designers should go back to the code and modify the system to resolve the issue [29].


FPGA programming

After the timing simulation, the last step is to test the real hardware to make sure no issues remain. To debug on the hardware, we first use the programmer tool to load the generated bit file onto the FPGA. Modern FPGAs have a JTAG (Joint Test Action Group) port that can be used to program and test the FPGA.

Hardware debug and verification

This is the last step in the FPGA design process. After programming the FPGA through the JTAG port, we can use tools such as a logic analyzer to debug and test the real hardware while it is running. Some software packages, such as Quartus II and ISE, have their own internal logic analyzer, which designers can use to tap internal signals and check their values while the hardware is running. Debugging on hardware is hard and time-consuming because of the memory and pin limitations of the logic analyzer, so the best approach is to do most of the testing in simulation first, where all signals are available and tracing a problem to its root is much easier than on hardware.


Chipscope Xilinx test and debug tool

An introduction to chipscope

As mentioned in the previous section, synthesis tools offer an option to add an internal logic analyzer to an FPGA design and test the hardware with it. ChipScope is the internal logic analyzer of the Xilinx tools. When you use ChipScope, it adds logic analyzer, system analyzer, and virtual I/O cores to the design, allowing you to see internal signals. Signals are captured using the system clock and displayed in the tool. The key features of this tool include [31][32]:

- Fast and easy setup
- Uses JTAG to interface with the hardware, so no other pins are required
- Debug ports can be added directly in the HDL code
- Analyzes all internal signals, including signals of an embedded processor
- ChipScope core insertion is part of the tool flow
- Provides full internal visibility
- Minimizes the number of external pins required for debugging

ChipScope also has limitations, mainly because it is embedded inside the FPGA. These limitations are [33]:

- Limited sample memory. The tool is embedded in the design and uses the logic and memory left over in the FPGA to capture signals, so the resources available to ChipScope depend on the size of the design and of the FPGA itself. In a design that uses most of the memory, there might not be enough memory left to add ChipScope.
- Limited sampling rate. ChipScope cannot sample as fast as a real logic analyzer, because its sampling rate is limited to the design clock rate, so it cannot show glitches in the design.

Chipscope structure

To add ChipScope to a design, three different cores are needed: the ILA (Integrated Logic Analyzer), ICON (Integrated Controller), and VIO (Virtual Input/Output) cores.

ILA core

The first and main core to add is the ILA core, which links the project to ChipScope. We have to add the ILA beside the design and connect all trigger signals (the signals that we want to monitor) to it. The ILA is the capture core: it captures signal values and sends them to be displayed. It is a customizable logic analyzer core that monitors the signals within the design, and it has many features of a real logic analyzer, such as sample storage and trigger conditions. The user can select the trigger width, the storage depth, and the data to capture. The core has multiple trigger ports, and it stores samples only when the trigger condition is met [34][35].
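The combination of a trigger condition and a limited sample depth can be pictured with a very small software model. The Python sketch below is an assumption-laden simplification: ila_capture is a hypothetical helper, not a ChipScope API, and a real ILA core supports richer trigger conditions than a single predicate.

```python
def ila_capture(samples, trigger, depth):
    """Return up to `depth` samples, starting at the first one that meets `trigger`.

    Samples before the trigger are discarded and samples beyond `depth`
    are lost, which mimics the limited capture memory of an embedded analyzer.
    """
    for index, sample in enumerate(samples):
        if trigger(sample):
            return samples[index:index + depth]
    return []  # the trigger condition never occurred


# One (cycle, busy) pair per clock cycle of a hypothetical signal.
busy = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
trace = list(enumerate(busy))
captured = ila_capture(trace, trigger=lambda s: s[1] == 1, depth=3)
print(captured)  # [(3, 1), (4, 1), (5, 1)]
```

With a depth of 3, the falling edge of the signal at cycle 7 is never seen, which is why the sample memory size matters when debugging long events.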


ICON core

The ICON core is the interface between the JTAG port of the FPGA and the other ChipScope cores, such as ILA and VIO. It provides a communication path over JTAG between the ChipScope software and the ILA and VIO cores, and it supports up to 15 core connections [36].

VIO core

VIO is the other core that we need to connect to make ChipScope ready for testing the design. It is a customizable core that can both monitor and drive FPGA signals, and it has detectors for the rising and falling edges of the sampled signals. It provides virtual LEDs and other indicators through its input ports, and virtual buttons and controls through its output ports [37].

How to connect the chipscope cores and setup the test system

The first step before using ChipScope is to have a compiled project ready in the ISE Design Suite. Then we instantiate the ILA core beside the top module so that we can connect triggers to it. Triggers are the signals that we would like to monitor using ChipScope. We might need to modify the top module to bring out the internal signals that we need to monitor, because every monitored signal has to come out of the top module. Based on the number of signals to monitor, we generate an ILA debug core with exactly that number of trigger ports, and each trigger port must have the same width as its corresponding signal [34][38][39].

After that, we generate the VIO and ICON cores and connect them together. Figure 16 shows the connection between these three debug cores and the main top design core. When all the connections are in place, we have to compile, synthesize, and implement the whole project again. After the compilation is done and the new bit file is ready, we reprogram the FPGA with it and then use the option “analyze design using chipscope” to bring up the ChipScope window. On the trigger page, the signal tab lists all existing triggers. We can rename them to match the signal names and then add trigger conditions on the trigger setup page. After adding the trigger conditions that we want to check, we run ChipScope, which triggers and shows the signal values when the condition is met [34][38][39].

Figure 16. Chipscope debug cores connection to core under the test


Another feature of ChipScope is the ability to show signals in analog form. For instance, if an input or output is a sine wave, we can go to the bus plot tab and see the signal as an analog waveform instead of a list of digital values that is hard to interpret. To make the signal look like a real waveform, we change the bus radix to decimal and then run ChipScope. As mentioned before, when ChipScope is triggered the captured data goes to its sample memory; we can increase the size of this memory to capture a longer time range of triggered data [33][34].


Spectrogram implementation using Matlab and FPGA

Spectrogram system implementation using Matlab

As mentioned in previous chapters, one way to implement a spectrogram is to use the FFT. The base core is an FFT module, and to build a spectrogram from it we only have to calculate the amplitude of the FFT result. As the first step, I used Matlab to build a working reference system against which the FPGA results could be compared, to make sure that the system implemented on the FPGA works as intended. Matlab has its own FFT function, but to get more familiar with the FFT method I also wrote Matlab code that implements the FFT from its formula. In the end there were two Matlab spectrogram systems: one that uses the Matlab FFT function, and one that first implements the FFT and then uses it to produce the spectrogram output.

Because audio signals do not contain very high frequencies, a 1024-point FFT core was chosen. Figure 17 shows the Matlab spectrogram system using both the Matlab FFT function and the self-implemented FFT function, and how their results are compared to make sure that the implemented formula behaves exactly like the Matlab function.

The sum of a 50 Hz and a 120 Hz sinusoid was given to both of them as input to check their functionality. Given this input, I expected exactly two frequency peaks in the result and nothing else. Figure 18 shows the results from both modules; as you can see, they are almost the same. As another test, I added noise to the same input and obtained the result shown in figure 19.
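The comparison described above, a transform computed directly from its formula checked against a fast implementation, can be sketched as follows. The code is written in Python rather than Matlab and is not the code of figure 17; the recursive radix-2 fft stands in for the built-in Matlab function, and the sample rate of 1024 Hz is an assumption chosen so that each FFT bin is exactly 1 Hz wide.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of X[k] = sum over t of x[t] * exp(-2*pi*j*k*t/N)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


N = 1024      # FFT size used in the project
FS = 1024.0   # assumed sample rate in Hz, so bin k corresponds to k Hz
signal = [math.sin(2 * math.pi * 50 * i / FS) + math.sin(2 * math.pi * 120 * i / FS)
          for i in range(N)]

amp_formula = [abs(v) for v in dft(signal)]
amp_fft = [abs(v) for v in fft(signal)]
max_diff = max(abs(a - b) for a, b in zip(amp_formula, amp_fft))
peaks = [k for k in range(N // 2) if amp_fft[k] > N / 4]
print(peaks, max_diff < 1e-6)
```

Under these assumptions the only bins above the threshold are 50 and 120, and the two amplitude spectra agree to within floating-point error, which is the kind of agreement figure 18 reports for the Matlab versions.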


Figure 17. Matlab code written to compute FFT on a combined Sine wave


Figure 18. FFT computed using Matlab function and self-implemented one no noise at input


Figure 19. FFT computed using Matlab function and self-implemented with noise added to input signal


Because the final plan was to run different audio files through the spectrogram and compare the results, I needed to make the Matlab code flexible enough to accept different inputs easily. To do so, I added reading from a text file to the input processing part of the code, so that different input sample files can be supplied easily. Figure 20 shows the modification to the code for this purpose.
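Reading the FFT input from a text file can be sketched as below. This is a Python illustration, not the Matlab code of figure 20; it assumes one sample per line, and the zero-padding of short inputs to the frame size is an assumption of this sketch. The function accepts any iterable of lines, so an open file object works as well as the in-memory text used here.

```python
import io


def read_samples(lines, frame_size=1024):
    """Parse one numeric sample per line and return exactly `frame_size` values.

    Blank lines are skipped, extra samples are dropped, and short inputs
    are zero-padded (an assumption of this sketch).
    """
    samples = [float(line) for line in lines if line.strip()]
    samples = samples[:frame_size]
    samples += [0.0] * (frame_size - len(samples))
    return samples


# In-memory stand-in for a sample file; with a real file use open(path).
text = io.StringIO("1\n2.5\n\n-3\n")
frame = read_samples(text, frame_size=4)
print(frame)  # [1.0, 2.5, -3.0, 0.0]
```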

Figure 20. Matlab code modification to read from a text file to load input of FFT


The plan is to process different music files, so one of the things needed is a module that reads music files and converts them to binary for processing by the Matlab spectrogram and the FPGA. For the FPGA, it has to generate the file in COE format, which is the RAM initialization file format. So another Matlab script was written to read a WAV or AU file, convert it to binary, and write it to a text file. Figure 21 shows the Matlab code for this module.
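A minimal sketch of the conversion to a COE file follows. It is written in Python and is not the Matlab script of figure 21. The 16-bit word width, the scaling of samples from the range -1.0 to 1.0, and the helper names are assumptions; the two keywords memory_initialization_radix and memory_initialization_vector are the ones used by the Xilinx COE format.

```python
def to_twos_complement(value, width=16):
    """Quantize a sample in [-1.0, 1.0] to a two's-complement bit string."""
    full_scale = 2 ** (width - 1) - 1
    scaled = int(round(value * full_scale))
    scaled = max(-(2 ** (width - 1)), min(full_scale, scaled))  # clamp
    return format(scaled & (2 ** width - 1), "0{}b".format(width))


def make_coe(samples, width=16):
    """Return the text of a binary-radix COE file holding the samples."""
    words = [to_twos_complement(s, width) for s in samples]
    return ("memory_initialization_radix=2;\n"
            "memory_initialization_vector=\n"
            + ",\n".join(words) + ";\n")


print(make_coe([0.0, 1.0, -1.0], width=8))
```

The function returns the file content as a string so that it can be inspected before being written to disk with an ordinary file write.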

Figure 21. Matlab code that generates the input file for Matlab and FPGA spectrogram system


According to research, animals produce sounds at different frequencies in different situations. For example, the frequency of a whale's scream differs from that of its moan call, so a spectrogram can be used to detect these different sounds. For one part of my experiment, I fed different animal and bird sounds to the system and obtained their spectrograms. Figures 22 and 23 show the spectrogram results for two different whales. As you can see, they contain frequencies only in certain regions, not across the whole spectrum. Figures 24, 25, and 26 show the spectrograms of the sounds of some different birds. Now that we have the reference system, the next step is to implement the FPGA system and see how closely it matches the Matlab code.

Figure 22. Spectrogram of Blueatlx Whale sound

Figure 23. Spectrogram of BluePacx Whale sound

Figure 24. Spectrogram of Eaglet bird sound


Figure 25. Spectrogram of Falcon bird sound

Figure 26. Spectrogram of Mallard Duck quacking sound


FPGA based spectrogram system

The target FPGA is a Xilinx Spartan-6 XC6SLX45 on the Digilent ATLYS evaluation board, so the software used to compile and implement the system is Xilinx ISE. The system is written in VHDL at the RTL level. Figure 27 shows the structure of the system to be implemented.

Figure 27. Block diagram of the spectrogram system implemented in FPGA

The system contains a 1024-point FFT core that gets its real and imaginary inputs from two ROMs; once its output is ready, an amplitude calculator generates the final output. Several input and control signals are needed to make the FFT core work; figure 28 illustrates the timing of the control signals in relation to the real and imaginary inputs [40]. All the control signals are generated in the main top module.
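The amplitude of a complex FFT output sample is the square root of the sum of the squares of its real and imaginary parts. The Python sketch below illustrates this with integer arithmetic. How the project's amplitude calculator computes the square root in hardware is not described here, so the use of an integer square root is an assumption of the sketch.

```python
import math


def amplitude(real, imag):
    """Integer amplitude of a complex sample: floor(sqrt(real^2 + imag^2))."""
    return math.isqrt(real * real + imag * imag)


print(amplitude(3, 4))    # 5
print(amplitude(-6, 8))   # 10
print(amplitude(1, 1))    # 1
```

Squaring removes the sign of each part, so the amplitude depends only on the magnitude of the FFT output and not on its phase.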


Figure 28. Timing diagram of the control signals of the FFTcore

The start control signal must be pulsed before the first input sample is valid. After the pulse on start, we send all 1024 input samples one by one, one at each rising edge of the clock. After start, the FFT core counts the input index until all 1024 samples have been received. Then the busy signal goes high, which shows that the core is working and not yet ready to receive new input. When the FFT is done, the out_valid signal goes high and output_index shows the index of each output sample until all 1024 output samples have been delivered. After out_valid goes low, the transform is finished, and we can start another one by sending another start pulse to the core. Figures 29, 30, 31, and 32 show the simulation results for the system: how the timing signals were generated, and how the system ran and produced the output signals. They also show the output control signals generated by the simulated FFT in response to the input control signals. These simulations only demonstrate that the timing signals are generated correctly; to check the outputs, we will look at some waveforms captured by ChipScope later.


Figure 29. FFT Timing input signals simulation result, generated by the VHDL code (part 1)






FFT Core Implementation in FPGA

There are different implementation options for the FFT core on Xilinx FPGAs. Pipelined Streaming I/O is the one used in our design. This structure offers continuous processing by using several butterfly processing engines. Each butterfly engine has its own memory to store the input and intermediate data. Because of this structure, the core can load the input data for the next frame, process the current frame, and unload the results of the previous frame at the same time. Users can load data continuously and, after the output latency, receive data from the core continuously. This is one of the advantages of a pipelined processing system. Alternatively, the core can process the data frame by frame, with a gap between data frames. Figure 33 shows the structure of this FFT core. This architecture covers FFT point sizes up to 65536 [40].

Figure 33. pipelined streaming IO FFT core implementation in Xilinx FPGA family


Now that the timing signals are generated correctly, we need to check whether the result matches Matlab, our reference system. To check the FPGA system, we set up ChipScope so that we can see the output amplitude in analog form. The ILA, ICON, and VIO cores have all been added and connected to the main core. For the ILA core, we need to decide how many trigger signals to monitor. Trigger signals are the signals that we want to debug or monitor to see whether they work correctly. When we add a trigger to the ILA, we also need to define its width. I added about 10 trigger signals, such as start, out_valid, busy, input_real, input_imaginary, output_amplitude, out_index, and input_index. Figure 34 shows the block diagram of the FPGA system with the ChipScope cores added for debugging.

Now we can give the FPGA the same input samples that we gave to the Matlab systems and compare the results. As mentioned, ROMs store the input samples and send them to the FFT core. To initialize a ROM, we need a COE file that contains all the samples, and we use the output of the Matlab script described earlier for this. The first input given to the FPGA was the combined sine wave that we first gave to Matlab. Figures 35 and 36 show the results from the FPGA (captured with ChipScope) and from the Matlab code, respectively. This is the FFT of the sum of the 50 Hz and 120 Hz sinusoids used before. Figures 37, 38, 39, and 40 show some other examples. After running several different input samples, we can say that the FPGA system works as well as our Matlab systems.


Figure 34. Block diagram of the system with chipscope cores connection


Figure 35. chipscope spectrogram result for sin(50Hz) + sin(120Hz)

Figure 36. Matlab systems spectrogram result for sin(50Hz) + sin(120Hz)


Figure 37. chipscope spectrogram result for Eaglet Bird sound

Figure 38. chipscope spectrogram result for Falcon Bird sound


Figure 39. chipscope spectrogram result for Mallard Duck quacking

Figure 40. chipscope spectrogram result for horned Owl sound

The experiment that we plan to do with the spectrogram is to take audio samples from different music styles and instruments and compare their results, to see whether there is an obvious difference between them that would let us use the spectrogram to pick out a particular music type or instrument without listening to it. Another experiment is to take two different performances of the same piece to see whether there is any frequency difference between them. One more experiment is to take samples in the same music style but from two different musicians, to see whether the spectrogram can distinguish between them without listening to them.


Audio processing using implemented spectrogram

The purpose of implementing the spectrogram is to process music files in different categories and find out whether we can differentiate between them by looking at the spectrogram result without listening to them. Because processing all the files with the FPGA-based system would take a lot of time, we used the Matlab version for the experiment.

The issue is that there are many audio files to process, and for each one we have to run the code 7 to 8 times, because we apply the spectrogram to at least 7 to 8 parts of the file to get a more precise result. Even though Matlab runs fast, it takes time to change the sample numbers, change the file name, and run and save the plot each time. So another script was written that reads all the files automatically, applies the process to different parts of each file, plots the results, and saves them in the workspace folder, so that we can compare them when all of them are done. Figures 41 and 42 show the Matlab module written for this purpose.
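Choosing the parts of each file to process can be sketched as below. This is a Python illustration, not the Matlab module of figures 41 and 42. How the original script selects its 7 to 8 parts is not stated, so the evenly spaced 1024-sample frames used here are an assumption, and segment_starts is a hypothetical helper.

```python
def segment_starts(total_samples, n_segments=8, frame_size=1024):
    """Start indices of `n_segments` evenly spaced frames inside a recording."""
    if total_samples < frame_size:
        raise ValueError("recording is shorter than one frame")
    if n_segments == 1:
        return [0]
    last_start = total_samples - frame_size
    return [i * last_start // (n_segments - 1) for i in range(n_segments)]


# A hypothetical recording that is exactly eight frames long.
starts = segment_starts(8 * 1024, n_segments=8, frame_size=1024)
print(starts)  # [0, 1024, 2048, 3072, 4096, 5120, 6144, 7168]
```

Each start index marks one 1024-sample frame to pass to the FFT, so a loop over files and start indices replaces the manual editing of sample numbers and file names.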


Figure 41. Matlab module to read, process, and save the result automatically (part1)


Figure 42. Matlab module to read, process, and save the result automatically (part2)


The music styles that were tried include Rock, Jazz, Heavy Metal, Rap, Country, Techno, and Classical. The second category is solo music, such as Solo Drum, Solo Guitar, Solo Piano, and Solo Violin. By processing this category we wanted to see whether there is any obvious difference between instruments. The final category consists of different performances of the same piece, to see whether there is any difference between two performances. Figures 43-78 show some samples of the spectrogram results for each music style or instrument.


Figure 43. Spectrogram result for Solo Piano (sample 1)

Figure 44. Spectrogram result for Solo Piano (sample 2)


Figure 45. Spectrogram result for Solo Guitar (sample 1)

Figure 46. Spectrogram result for Solo Guitar (sample 2)


Figure 47. Spectrogram result for Solo Saxophone (sample 1)

Figure 48. Spectrogram result for Solo Saxophone (sample 2)


Figure 49. Spectrogram result for Solo Violin (sample 1)

Figure 50. Spectrogram result for Solo Violin (sample 2)


Figure 51. Spectrogram result for Solo Drum (sample 1)

Figure 52. Spectrogram result for Solo Drum (sample 2)


Figure 53. Spectrogram result for Solo Flute (sample 1)

Figure 54. Spectrogram result for Solo Flute (sample 2)


Figure 55. Spectrogram result for Heavy Metal music (sample 1)

Figure 56. Spectrogram result for Heavy Metal music (sample 2)

Figure 57. Spectrogram result for RAP music (sample 1)

Figure 58. Spectrogram result for RAP music (sample 2)


Figure 59. Spectrogram result for Country Music (sample 1)

Figure 60. Spectrogram result for Country Music (sample 2)


Figure 61. Spectrogram result for ROCK music (sample 1)

Figure 62. Spectrogram result for ROCK music (sample 2)


Figure 63. Spectrogram result for JAZZ music (sample 1)

Figure 64. Spectrogram result for JAZZ music (sample 2)


Figure 65. Spectrogram result for Techno music (sample 1)

Figure 66. Spectrogram result for Techno music (sample 2)


Figure 67. Spectrogram result for classical music from Beethoven (sample 1)

Figure 68. Spectrogram result for classical music from Beethoven (sample 2)


Figure 69. Spectrogram result for classical music from Tchaikovsky (sample 1)

Figure 70. Spectrogram result for classical music from Tchaikovsky (sample 2)


Figure 71. Spectrogram result for classical music from Mozart (sample 1)

Figure 72. Spectrogram result for classical music from Mozart (sample 2)


Figure 73. Spectrogram result for classical music from Vivaldi (sample 1)

Figure 74. Spectrogram result for classical music from Vivaldi (sample 2)


Figure 75. Spectrogram result for classical music from Bach (sample 1)

Figure 76. Spectrogram result for classical music from Bach (sample 2)


Figure 77. Spectrogram result for classical music from Schubert (sample 1)

Figure 78. Spectrogram result for classical music from Schubert (sample 2)


Spectrogram Audio result analysis

Based on the experiments, it seems that each instrument produces only certain frequencies, and in mixed music the frequency content depends on which instruments are being played at that moment. Looking at the spectrogram results for solo music, it is clear that the drum is one of the instruments that cover a wider range of frequencies, so in Jazz, Rock, Heavy Metal, Country, and other music types that use drums we see this wide range of frequencies in the spectrogram output. The piano is one of the instruments with a low range of frequencies, while the violin covers a wider range than the piano or other classical instruments. So in a classical mixed piece, if the spectrogram shows a wider range of frequencies, we can conclude that a violin is also being played.

Among the classical instruments, the frequency coverage from lowest to highest is
Piano, Saxophone, Flute, and Violin. Even the violin's coverage is much lower than the
drum's, which can be used to differentiate classical music from other music types. In
every music style we can approximately tell which kinds of instruments have been used
from the spectrogram output. Because the classical instruments are low-frequency ones,
the spectrogram of classical music does not contain the high frequencies that appear in
Rock, Jazz, or other music types that use drums.
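The comparison of frequency coverage described above can be sketched in a few lines of Matlab. This is only an illustration under stated assumptions: the two signals are synthetic sums of tones standing in for a narrow-range and a wide-range recording (they are not the recordings used in this project), and the 10% threshold, the sampling rate, and all variable names are hypothetical choices.

```matlab
% Sketch: estimate the highest significant frequency of two signals.
fs = 44100;                 % assumed sampling rate (Hz)
N  = 4410;                  % 0.1 s of samples, so FFT bins fall on multiples of 10 Hz
t  = (0:N-1)/fs;

% Hypothetical stand-ins for two recordings (synthetic tones, not real instruments)
narrow = sin(2*pi*220*t) + 0.5*sin(2*pi*440*t) + 0.25*sin(2*pi*880*t);
wide   = narrow + 0.5*sin(2*pi*5000*t) + 0.5*sin(2*pi*9000*t);

f = (0:floor(N/2))*fs/N;    % frequency axis of the one-sided spectrum
signals  = {narrow, wide};
top_freq = zeros(1, 2);
for s = 1:2
    amp = abs(fft(signals{s}));
    amp = amp(1:floor(N/2)+1);              % keep the one-sided spectrum
    significant = amp >= 0.1*max(amp);      % assumed threshold: 10% of the largest peak
    top_freq(s) = f(find(significant, 1, 'last'));
end
disp(top_freq);             % highest significant frequency of each signal, in Hz
```

With real recordings the threshold would need tuning, since noise and spectral leakage produce many small components that a fixed 10% cut may or may not reject.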

Another experiment compares two different performances of the same piece to see whether
the performance makes any difference. Figure 79 shows the result for Adoration of the
Earth and Figure 80 shows the result for Kiss of the Earth, both in Matlab. The two
results look almost the same, and the parts that differ have amplitudes small enough to
be ignored in comparison with the other frequencies. The conclusion is that different
performances do not change the result unless they use different instruments.

Figure 79. Two different plays of Adoration of the earth

Figure 80. Two different plays of Kiss of the earth

Image processing using 2D FFT Matlab and FPGA

2D FFT implementation using Matlab

The other part of the project implements a 2D FFT so that we can process images and see
the effect of the FFT on them. The base core is the one-dimensional FFT that we have
already implemented in both Matlab and VHDL, and the goal is to build the 2D FFT from
this core. The algorithm first applies the one-dimensional FFT to every row of the image
and writes the results into a matrix, then applies the FFT to every column of that
matrix. The matrix produced by the second pass is the FFT of the whole image. Figures 81
and 82 show the Matlab code written for our 2D FFT; it also generates the initialization
file for the RAM in the FPGA implementation. Figures 83, 84, and 85 show the result of
our 2D FFT on some sample pictures.
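The row-then-column procedure described above can be sketched as follows. This is a minimal illustration rather than the code of Figures 81 and 82: the 8 x 8 test matrix and the variable names are assumptions, and Matlab's built-in fft2 is used only as a reference to check the result.

```matlab
% Sketch: build a 2D FFT from a 1D FFT (rows first, then columns).
img = magic(8);                       % hypothetical 8 x 8 test image
[rows, cols] = size(img);

row_pass = zeros(rows, cols);
for r = 1:rows
    row_pass(r, :) = fft(img(r, :));  % 1D FFT of every row
end

result = zeros(rows, cols);
for c = 1:cols
    result(:, c) = fft(row_pass(:, c));  % 1D FFT of every column of the row results
end

% Compare with the built-in 2D FFT; the difference should be rounding error only
reference = fft2(img);
max_err = max(abs(result(:) - reference(:)));
disp(max_err);
```

The order of the two passes does not matter mathematically, but processing rows first matches the order the control unit on the FPGA uses.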

Figure 81. Matlab module for 2D FFT on image (part 1)

Figure 82. Matlab module for 2D FFT on image (part 2)

Figure 83. Image and its 2D FFT result in Matlab

Figure 84. Image and its 2D FFT result in Matlab

Figure 85. Image and its 2D FFT result in Matlab

As Figure 83 shows, the result has just two white spots on an otherwise black page. The
reason is that the DC component of this picture is much larger than its other frequency
components. The number of frequency samples in the result corresponds to the number of
pixels in the image.
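The two-spot result can be reproduced with a small synthetic image. The sketch below makes a simplifying assumption: the stripes vary sinusoidally, so the spectrum contains only the DC term and one pair of symmetric peaks. Hard-edged black and white lines would add weaker harmonics. The image size, stripe count, threshold, and names are hypothetical.

```matlab
% Sketch: 2D FFT of vertical stripes gives a DC term plus two symmetric peaks.
N = 64;                                   % assumed image size (N x N)
cycles = 4;                               % assumed number of stripe periods across the image
stripe_row = 0.5 + 0.5*cos(2*pi*cycles*(0:N-1)/N);
img = repmat(stripe_row, N, 1);           % every row is identical: vertical stripes

F = abs(fft2(img));
dc = F(1, 1);                             % DC component (sum of all pixels)
F(1, 1) = 0;                              % remove DC to look at the remaining frequencies
[peak_r, peak_c] = find(F > 0.01*dc);     % assumed threshold: 1% of the DC magnitude
disp([peak_r, peak_c]);                   % positions of the remaining significant points
```

Only two points survive the threshold, both in the first row of the spectrum and each weaker than the DC term, which agrees with the two white spots described for Figure 83.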

2D FFT implementation on FPGA

With the Matlab code in place as the reference system, we can design the same system on
the FPGA. One important change on the FPGA is to replace our ROMs with RAMs. Because we
need to run separate FFTs on the rows and on the columns, we must be able to overwrite
the stored values so that the final result ends up in the same RAMs that held the
inputs. FPGA resources are limited, so we cannot use two additional RAMs for the output.
Reusing the same RAM components is more efficient, and efficiency is one of the major
factors to consider when working with FPGAs. We also have to consider the width of each
RAM word. Our input width is 12 bits, but the FFT result is wider, so we have to
instantiate the RAMs with a larger word width so that the output fits.
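How much wider the result becomes can be estimated with a simple rule of thumb. The sketch below assumes an unscaled FFT whose output grows by one bit per radix-2 stage plus one extra bit; this is an assumption, not a statement about the core's internals, but it agrees with the port widths in the Appendix B listings (12-bit input and 23-bit output for the 1024-point core, 10-bit input and 17-bit output for the 64-point core).

```matlab
% Sketch: estimated output word width of an unscaled N-point FFT.
in_width  = [12 10];        % input widths taken from the xfft ports in Appendix B
n_points  = [1024 64];      % transform sizes of the audio and image designs
out_width = in_width + log2(n_points) + 1;   % assumed growth rule
disp(out_width);
```

The RAM word used for the intermediate results therefore has to be at least as wide as the FFT output, not the original input.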

Because the FFT has to be applied to all rows and then to all columns of the image, a
control unit is needed to generate the RAM addresses and the FFT control signals and to
write the results to the correct RAM locations until the final result is ready. First we
need a binary file that holds all the image information. To produce it, a Matlab file
has been written that reads a picture and generates the image binary file for Matlab and
the RAM initialization file for the FPGA. Figures 86 and 87 show the Matlab code.
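As a rough illustration of the RAM initialization step, the sketch below writes a 64 x 64 image to a file in row-major order. It is not the code of Figures 86 and 87: the checkerboard test image, the temporary file name, and the variable names are assumptions, and the two header lines simply follow the pattern used for the ROM file in Appendix A.

```matlab
% Sketch: write a 64 x 64 image as a memory initialization file, row by row.
N = 64;
% Hypothetical checkerboard test image: pixel (i,j) = (i + j) mod 2, zero-based
img = mod((0:N-1)' * ones(1, N) + ones(N, 1) * (0:N-1), 2);

pixels = reshape(img', 1, []);          % row-major order, so address = i*64 + j

coe_name = [tempname() '.coe'];         % assumed output location
fid = fopen(coe_name, 'w');
fprintf(fid, 'memory_initialization_radix=10;\n');
fprintf(fid, 'memory_initialization_vector=\n');
fprintf(fid, '%d,\n', pixels(1:end-1)); % one value per line, comma separated
fprintf(fid, '%d;\n', pixels(end));     % the last value ends with a semicolon
fclose(fid);
```

Transposing before reshape matters because Matlab stores matrices column by column, while the RAM layout used here stores the image row by row.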

Figure 86. 2D FFT Matlab code (part 1)

Figure 87. 2D FFT Matlab code (part 2)

All image information is stored in two RAMs. Because a RAM is not organized like a
matrix, we have to store the image serially and use a formula for the row and column
addresses. The relation between each row or column and its RAM addresses is:

row address = (i*64) + j (row i = 0 to 63; for each i, j runs from 0 to 63)

column address = (j*64) + i (column i = 0 to 63; for each i, j runs from 0 to 63)

For row 0 the components are at addresses 0 to 63, and for row 1 they are at addresses
64 to 127. For column 0 the components are at addresses 0, 64, 128, 192, 256, ..., 4032,
and for column 1 they are at addresses 1, 65, 129, 193, 257, ..., 4033.

The main job of the control unit is to generate the start signal, generate the RAM read
addresses for the current row, and read the whole row from the RAM. It then waits until
the FFT on that row is done and generates the same addresses again to overwrite the row
with the result. It repeats this for the next row until the last row. When the rows are
done, it generates the addresses for each column and writes the results back to the same
columns, and when the columns are done it generates a signal that indicates the end of
the conversion.

Because of the limited resources on the FPGA used, 64 x 64 images have been used, so the
control unit has to run the 64-point transform 128 times (once for each of the 64 rows
and once for each of the 64 columns) until the result is ready. Figures 88, 89, 90, and
91 show some details of the simulation results for the image processing on our
FPGA-based system and the way the system carries out the process. After running the
simulation, two text files are generated, which we pass to Matlab to display as an image
so that we can check the result.
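The two address formulas can be checked quickly in Matlab. The sketch below is only an illustration of the addressing scheme, not of the VHDL control unit; the function handles and variable names are hypothetical.

```matlab
% Sketch: RAM addresses of the elements of one row or one column of a 64 x 64 image.
N = 64;
j = 0:N-1;                        % position inside the row or column
row_addr = @(i) i*N + j;          % addresses of row i (i = 0 to 63)
col_addr = @(i) j*N + i;          % addresses of column i (i = 0 to 63)

r1 = row_addr(1);                 % 64, 65, ..., 127
c0 = col_addr(0);                 % 0, 64, 128, ..., 4032
c1 = col_addr(1);                 % 1, 65, 129, ..., 4033

% Every RAM address should be visited exactly once by the 64 row transforms
all_rows = zeros(N, N);
for i = 0:N-1
    all_rows(i+1, :) = row_addr(i);
end
covers_ram = isequal(sort(all_rows(:))', 0:N*N-1);
```

Because 64 is a power of two, the multiplication by 64 needs no multiplier in hardware: it amounts to appending six zero bits to the index, which is how the VHDL in Appendix B forms the column addresses.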

Figure 88. FFT implementation using VHDL (part 1)

Figure 89. 2D FFT implementation using VHDL (part 2)

Conclusion

Because of its efficiency, the FFT is one of the most important transform algorithms. It
has many applications, such as image processing, noise cancellation, image quality
improvement, and audio processing. The FFT of an image shows the frequencies that are
significant in comparison with the DC component of the image. The number of frequencies
in the result depends on the number of pixels in the picture. For instance, an image of
just parallel white and black lines gives two white dots on an otherwise black
background. The reason is that the result has only two significant frequency points, and
their amplitude is much lower than the DC component of the image, so almost all points
are black except those two frequencies.

Another application of the FFT is audio processing. The spectrogram is a tool that uses
the FFT as its main core to compute the amplitudes of the frequencies present in the
input waveform. By applying the spectrogram to the voices of different animals, we can
differentiate between their calls, such as a scream or a moan, because each call has its
own frequency pattern.

The spectrogram has also been applied to audio files of solo musical instruments such as
Guitar, Piano, Violin, and Drum. Comparing the results shows that each instrument has a
characteristic frequency range that can be used to detect that instrument without
listening to the audio file. A further benefit is that, by taking the spectrogram of
different parts of a mixed audio file, we can detect some of the instruments that have
been played.

The drum is one of the instruments with a really high frequency range. The violin has
the highest frequency range among the classical instruments, but its range is much lower
than the drum's. So whenever a mixed signal shows a really wide range of frequencies, we
can say that a drum is played there, and if the result falls only into the lower
frequency ranges, we can attribute it to the piano. Another use is as a detector for
classical music: because classical music does not use the drum, its spectrogram result
is always lower than the others, even when a violin (one of the high-frequency
instruments of classical music) is used.

Appendix A

Matlab code for reading a WAV audio file and generating the ROM initialization file and
the Matlab binary input file used to calculate the FFT

% gives us a double array of 1024 * 2 for left and right channel samples
[x, fs, nbits] = wavread('Rudy_rooster_crowing-Shelley-1948282641.wav', [16385, 17408]);

% get the integer value of samples
x2 = round(x * 2^11); %( x * 2^nbits/2) + (2^nbits/2);

left_x = x(:,1);    % save left channel samples in double
left_x2 = x2(:,1);  % save left channel samples in integer

% make the COE file to initialize the ROM
fileID = fopen('rom_input_Rudy_rooster_crowing-Shelley-1948282641.txt', 'w');
i = 1;

fprintf(fileID,'memory_initialization_radix= 10; \n');
fprintf(fileID,'memory_initialization_vector= \n');

while i <= 1023
    fprintf(fileID,'%d,\n',left_x2(i));
    i = i + 1;
end
fprintf(fileID,'%d;\n',left_x2(1024));
fclose(fileID);

% make the input text file to get used by matlab to compare the result with FPGA
fileID1 = fopen('matlab_input_Rudy_rooster_crowing-Shelley-1948282641.txt', 'w');
i = 1;

fprintf(fileID1,'%d \n', fs);    % sampling frequency
fprintf(fileID1,'%d \n', 1024);  % number of sample inputs

while i <= 1024
    fprintf(fileID1,'%d \n',left_x2(i));
    i = i + 1;
end
fclose(fileID1);

Matlab code to apply FFT on the combined sinusoid waves

Fs = 1000;                 % Sampling frequency
T = 1/Fs;                  % Sample time .001
L = 1000;                  % Length of signal
L1 = 1024;                 % Length of signal
t = (0:L-1)*T;             % Time vector
t1 = (0:L1-1)*1;           % Time vector
% Sum of a 50 Hz sinusoid and a 120 Hz sinusoid
x = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
y = x + 2*randn(size(t));  % Sinusoids plus noise
infile = fopen('input_sig.txt', 'w');

% Read the FFT result files generated by FPGA
real_file = fopen('real_out.txt');
img_file = fopen('img_out.txt');
A = fscanf(real_file,'%d');
B = fscanf(img_file,'%d');
A = A / 2048;
B = B / 2048;
d = A + 1i * B;
c = sqrt(A.^2 + B.^2);

% write the noisy input samples to a text file
i = 1;
while (i <= 1000)
    fprintf(infile, '%f \n', y(i));
    i = i + 1;
end
fclose(infile);

NFFT = 2^nextpow2(L);      % Next power of 2 from length of y
Y = fft(y,NFFT)/L;         % fft calculation

% direct DFT; (k-1) and (l-1) convert Matlab's 1-based indices to DFT indices
k = 1;
while (k <= 1000)
    l = 1;
    fft_r(k) = 0;
    while (l <= 1000)
        e(l) = y(l) * exp((-2*1i*pi*(k-1)*(l-1))/1000);
        fft_r(k) = fft_r(k) + e(l);
        l = l + 1;
    end
    k = k + 1;
end
Y_fft_r = fft_r(1:1000)/L; % normalize by the signal length

f = Fs/2*linspace(0,1,NFFT/2+1);

% compare all three
plot(f,2*abs(Y(1:NFFT/2+1)));        % FFT result using FFT matlab function
hold on;
plot(f,2*abs(Y_fft_r(1:NFFT/2+1)));  % self generated FFT matlab code
hold on;
plot(f,c(1:NFFT/2+1));               % FFT result from FPGA

title('Single-Sided Amplitude Spectrum of y(t)')
xlabel('Frequency (Hz)')
ylabel('|Y(f)|')

Matlab code for reading the binary input files and plotting the result for Matlab and FPGA

clear;
matlab_in = fopen('matlab_input_mallard_duck_quacking.txt');
Fs = fscanf(matlab_in,'%d', 1);      % Sampling frequency
T = 1/Fs;                            % Sample time
Length = fscanf(matlab_in,'%d', 1);  % Length of signal
t = (0:Length-1)*T;                  % Time vector
% reading the waveform input samples
y = fscanf(matlab_in,'%d');

NFFT = 2^nextpow2(Length);           % Next power of 2 from length of y
Y = fft(y,NFFT)/Length;              % fft calculation

% direct DFT; (k-1) and (l-1) convert Matlab's 1-based indices to DFT indices
k = 1;
while (k <= Length)
    l = 1;
    fft_r(k) = 0;
    while (l <= Length)
        e(l) = y(l) * exp((-2*1i*pi*(k-1)*(l-1))/Length);
        fft_r(k) = fft_r(k) + e(l);
        l = l + 1;
    end
    k = k + 1;
end
Y_fft_r = fft_r(1:Length)/Length;    % normalize by the signal length
f = Fs/2*linspace(0,1,NFFT/2+1);

% read the rtl sim result and plot to compare with matlab
real_file = fopen('real_out.txt');
img_file = fopen('img_out.txt');

A = fscanf(real_file,'%d');
B = fscanf(img_file,'%d');
A = A / 2048;
B = B / 2048;
d = A + 1i * B;
c = sqrt(A.^2 + B.^2);
c1 = A + 1i * B;

plot(f,2*abs(c1(1:NFFT/2+1)), 'b')
hold on
plot(f,2*abs(Y_fft_r(1:NFFT/2+1)), 'r')
hold on
plot(f,2*abs(Y(1:NFFT/2+1)), 'g')

title('Single-Sided Amplitude Spectrum of y(t)')
xlabel('Frequency (Hz)')
ylabel('|Y(f)|')

Matlab code for automatically reading the wave files, processing them, and saving the result

% number of existing files
numfiles = 8;

for k = 1:numfiles
    myfilename = sprintf('ROCK_SOLO_%d.wav', k);
    n1 = 1;
    n2 = 100000;
    for i = 1:7
        [x, fs, nbits] = wavread(myfilename, [n1,n2]);
        x1 = x(:,1);
        y = fft(x1);
        y1 = abs(y);
        f = fs/2*linspace(0,1,100000/2+1);
        h = figure(((k-1)*7)+i);
        amplitude = abs(y(1:100000/2+1));

        % average of the amplitudes above 10, used as the threshold
        avg = 0;
        n = 50001;
        for j = 1:50001
            if amplitude(j) <= 10
                n = n - 1;
            else
                avg = avg + amplitude(j);
            end
        end
        avg = avg / n;
        threshold = avg;

        % zero out every component below the threshold
        for j = 1:50001
            if amplitude(j) < threshold
                amplitude1(j) = 0;
            else
                amplitude1(j) = amplitude(j);
            end
        end
        plot(f, amplitude1(1:100000/2+1));

        % will create ROCK_SOLO_RESULT_1_1.png
        saveas(h, sprintf('ROCK_SOLO_RESULT_%d_%d.png', k, i));

        % will create ROCK_SOLO_RESULT_1_1.fig
        saveas(h, sprintf('ROCK_SOLO_RESULT_%d_%d.fig', k, i));

        % move to the next block of 100000 samples
        n1 = n1 + 100000;
        n2 = n2 + 100000;
    end
end

Matlab code for 2D FFT on image

% read the gray scale image
I = imread('1-1.jpg');

% getting an approximate of the threshold for the image
level = graythresh(I);
BW = im2bw(I,level);

% clear the result matrices
i = 1;
while (i <= 64)
    j = 1;
    while (j <= 64)
        Y(i,j) = 0;
        x1(i,j) = 0;
        j = j + 1;
    end
    i = i + 1;
end

% first pass: FFT of every row
i = 1;
while (i <= 64)
    j = 1;
    while (j <= 64)
        x(i,j) = double(BW(i,j));
        j = j + 1;
    end
    y = fft(x(i, 1:64));
    j = 1;
    while (j <= 64)
        Y(i,j) = y(1,j);
        j = j + 1;
    end
    i = i + 1;
end

% second pass: FFT of every column of the row results
% (the whole matrix is copied first so that each column is complete)
x1 = Y;
i = 1;
while (i <= 64)
    y = fft(x1(1:64, i));
    Y1(1:64,i) = y(1:64,1);
    i = i + 1;
end

% display the magnitude, since the FFT result is complex
grayImage = uint8(abs(Y1));
imshow(grayImage);

Appendix B

FFT VHDL code

LIBRARY IEEE;

USE IEEE.STD_LOGIC_1164.ALL;


USE IEEE.NUMERIC_STD.all;


use IEEE.std_logic_unsigned.all;

LIBRARY UNISIM;

USE UNISIM.VComponents.ALL;

ENTITY spectogram IS

PORT (

clk : IN STD_LOGIC ;

rst : IN STD_LOGIC;

rfd : OUT STD_LOGIC;

busy : OUT STD_LOGIC;

edone : OUT STD_LOGIC;

done : OUT STD_LOGIC;

dv : OUT STD_LOGIC;

xn_index : OUT STD_LOGIC_VECTOR ( 9 DOWNTO 0 );

xk_index : OUT STD_LOGIC_VECTOR ( 9 DOWNTO 0 );

xk_re : OUT STD_LOGIC_VECTOR ( 22 DOWNTO 0 );

xk_im : OUT STD_LOGIC_VECTOR ( 22 DOWNTO 0 );

amplitude: OUT STD_LOGIC_VECTOR (22 DOWNTO 0)

);

END spectogram;

ARCHITECTURE spectogram_arch OF spectogram IS

COMPONENT xfft

port (

clk : in STD_LOGIC := 'X';

start : in STD_LOGIC := 'X';

fwd_inv : in STD_LOGIC := 'X';

fwd_inv_we : in STD_LOGIC := 'X';

rfd : out STD_LOGIC;

busy : out STD_LOGIC;

edone : out STD_LOGIC;

done : out STD_LOGIC;

dv : out STD_LOGIC;

xn_re : in STD_LOGIC_VECTOR ( 11 downto 0 );

xn_im : in STD_LOGIC_VECTOR ( 11 downto 0 );

xn_index : out STD_LOGIC_VECTOR ( 9 downto 0 );

xk_index : out STD_LOGIC_VECTOR ( 9 downto 0 );

xk_re : out STD_LOGIC_VECTOR ( 22 downto 0 );

xk_im : out STD_LOGIC_VECTOR ( 22 downto 0 )

);

END COMPONENT;

COMPONENT ILA_CORE

PORT (

CONTROL: INOUT STD_LOGIC_VECTOR(35 DOWNTO 0);

CLK: IN STD_LOGIC;

TRIG0: IN STD_LOGIC_VECTOR(11 DOWNTO 0);


TRIG2: IN STD_LOGIC_VECTOR(0 TO 0);







TRIG9: IN STD_LOGIC_VECTOR(0 TO 0));

END COMPONENT;

COMPONENT ICON_CORE

PORT (

CONTROL0: INOUT std_logic_vector(35 DOWNTO 0);

CONTROL1: inout std_logic_vector(35 downto 0));

END COMPONENT;

COMPONENT VIO_core

port (

CONTROL: inout std_logic_vector(35 downto 0);

CLK: in std_logic;

SYNC_IN: in std_logic_vector(7 downto 0);

SYNC_OUT: out std_logic_vector(7 downto 0));

END COMPONENT;

COMPONENT r2p_corproc

GENERIC(DATA_WIDTH : INTEGER := 27;

PIPE_DEPTH : INTEGER := 15;

PRECISION : INTEGER := 27);

PORT( clk : IN STD_LOGIC;

ce: IN STD_LOGIC;

Xin: IN SIGNED(DATA_WIDTH-1 DOWNTO 0);

Yin: IN SIGNED(DATA_WIDTH-1 DOWNTO 0);

Rout : OUT unsigned(DATA_WIDTH-1 DOWNTO 0));

END COMPONENT;

COMPONENT mem

PORT (

clka : IN STD_LOGIC;

addra : IN STD_LOGIC_VECTOR(9 DOWNTO 0);

douta : OUT STD_LOGIC_VECTOR(11 DOWNTO 0)

);

END COMPONENT;

COMPONENT clock_divider_DCM

PORT

(-- Clock in ports

CLK_IN1 : in std_logic;

-- Clock out ports

CLK_OUT1 : out std_logic;






-- Status and control signals

RESET : in std_logic;

LOCKED : out std_logic

);

END COMPONENT;

SIGNAL control_word : STD_LOGIC_VECTOR(35 DOWNTO 0);

SIGNAL contro2_word : STD_LOGIC_VECTOR(35 DOWNTO 0);

SIGNAL dv_temp : STD_LOGIC;

SIGNAL xk_re_temp : STD_LOGIC_VECTOR(22 DOWNTO 0);

SIGNAL xk_im_temp : STD_LOGIC_VECTOR(22 DOWNTO 0);

SIGNAL dv_temp_vetor : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL amplitude_out : unsigned ( 22 DOWNTO 0);

SIGNAL start : STD_LOGIC := '0';

SIGNAL adrs : STD_LOGIC_VECTOR (9 DOWNTO 0):=

(others => '0');

SIGNAL xn_re : STD_LOGIC_VECTOR ( 11 DOWNTO 0 );

SIGNAL xn_im : STD_LOGIC_VECTOR ( 11 DOWNTO 0 );

SIGNAL start_sent : STD_LOGIC := '0';

SIGNAL counter : INTEGER RANGE 0 to 15;

SIGNAL fwd_inv : STD_LOGIC;

SIGNAL fwd_inv_we : STD_LOGIC;

SIGNAL locked : STD_LOGIC;

SIGNAL clk100 : STD_LOGIC;



SIGNAL clk12_5 : STD_LOGIC;



SIGNAL clk_fft : STD_LOGIC;

SIGNAL DCM_reset : STD_LOGIC;

SIGNAL fft_reset : STD_LOGIC;

SIGNAL busy_vector : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL rst_vector : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL clk_fft_vector: STD_LOGIC_VECTOR (0 TO 0);

SIGNAL busy_temp : STD_LOGIC;

SIGNAL clk_in : STD_LOGIC;

SIGNAL done_temp : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL start_vector : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL xn_index_tmp : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );

SIGNAL xk_index_tmp : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );

begin

DCM_reset <= rst;

fft_reset <= rst;

clk_fft <= clk12_5;

clk_in <= clk;

DCM_inst: clock_divider_DCM

PORT MAP

(-- Clock in ports

CLK_IN1 => clk_in,

-- Clock out ports

CLK_OUT1 => clk100,

CLK_OUT2 => clk50,

CLK_OUT3 => clk25,

CLK_OUT4 => clk12_5,




RESET => '0',--DCM_reset,

LOCKED => locked

);

-- Stimulus process

PROCESS (fft_reset, clk_fft)

BEGIN

if (fft_reset = '1') then

fwd_inv <= '1';

fwd_inv_we <= '1';

counter <= 0;

elsif (rising_edge (clk_fft)) then

if ( counter = 1) then

fwd_inv_we <= '1';

end if;


fwd_inv_we <= '0';

end if;

if ( counter <10 ) then

counter <= counter + 1;

end if;

end if;

END PROCESS;

start_vector(0) <= start;

process (clk_fft, fft_reset)

begin

if ( fft_reset = '1') then

start <= '0';

adrs <= (others => '0');

start_sent <= '0';

elsif ( rising_edge (clk_fft) ) then

if ( start /= '1' and start_sent /= '1') then

start_sent <= '1';

start <= '1';

else

start <= '0';

end if;

if ( adrs < 1023) and (start_sent = '1') then

adrs <= adrs + 1;

end if;

end if;

end process;

ROM: mem

PORT MAP(

clka => clk_fft,

addra => adrs,

douta => xn_re

);

xn_im <= (others => '0');

fft_inst: xfft

PORT MAP (

clk => clk_fft,

start => start,

fwd_inv => fwd_inv,

fwd_inv_we => fwd_inv_we,

rfd => rfd,

busy => busy_temp,

edone => edone,

done => done_temp(0),

dv => dv_temp,

xn_re => xn_re,

xn_im => xn_im,

xn_index => xn_index,

xk_index => xk_index,

xk_re => xk_re_temp,

xk_im => xk_im_temp

);

amplitude_inst: r2p_corproc

GENERIC MAP(

DATA_WIDTH => 23,

PIPE_DEPTH => 15,

PRECISION => 23)

PORT MAP(

clk => clk_fft,

ce => '1',

Xin => SIGNED(xk_re_temp),

Yin => SIGNED(xk_im_temp),

Rout => amplitude_out

);

ILA_inst: ILA_CORE

port map(

CONTROL => control_word,

CLK => clk100,

TRIG0 => xn_re,

TRIG1 => xn_im,

TRIG2 => dv_temp_vetor,

TRIG3 => xk_re_temp,

TRIG4 => xk_im_temp,

TRIG5 => STD_LOGIC_VECTOR(amplitude_out),

TRIG6 => rst_vector,

TRIG7 => busy_vector,

TRIG8 => done_temp,

TRIG9 => start_vector );

amplitude <= STD_LOGIC_VECTOR(amplitude_out);

ICON_inst: ICON_CORE

port map(

CONTROL0 => control_word,

CONTROL1 => contro2_word

);

VIO_inst: VIO_core

port map(

CONTROL => contro2_word,

CLK => clk100,

SYNC_IN => ("0000000"&dv_temp),

SYNC_OUT => open);

dv_temp_vetor(0) <= dv_temp;

busy_vector(0) <= busy_temp;

rst_vector(0) <= rst;

clk_fft_vector(0) <= clk_fft;

busy <= busy_temp;

dv <= dv_temp;

xk_re <= xk_re_temp;

xk_im <= xk_im_temp;

done <= done_temp(0);

end spectogram_arch;

2D FFT VHDL code for image processing

LIBRARY IEEE;





use IEEE.std_logic_unsigned.all;

LIBRARY UNISIM;

USE UNISIM.VComponents.ALL;

ENTITY spectogram IS

PORT (

clk : IN STD_LOGIC ;

rst : IN STD_LOGIC;

--start : IN STD_LOGIC := 'X';

rfd : OUT STD_LOGIC;

busy : OUT STD_LOGIC;

edone : OUT STD_LOGIC;

done : OUT STD_LOGIC;

dv : OUT STD_LOGIC;

--xn_re : IN STD_LOGIC_VECTOR ( 11 DOWNTO 0 );

--xn_im : IN STD_LOGIC_VECTOR ( 11 DOWNTO 0 );

xn_index : OUT STD_LOGIC_VECTOR ( 5 DOWNTO 0 );

xk_index : OUT STD_LOGIC_VECTOR ( 5 DOWNTO 0 );

xk_re : OUT STD_LOGIC_VECTOR ( 16 DOWNTO 0 );

xk_im : OUT STD_LOGIC_VECTOR ( 16 DOWNTO 0 );

amplitude: OUT STD_LOGIC_VECTOR (16 DOWNTO 0)

);

END spectogram;

ARCHITECTURE spectogram_arch OF spectogram IS

COMPONENT xfft

port (

clk : in STD_LOGIC := 'X';

start : in STD_LOGIC := 'X';

fwd_inv : in STD_LOGIC := 'X';

fwd_inv_we : in STD_LOGIC := 'X';

rfd : out STD_LOGIC;

busy : out STD_LOGIC;

edone : out STD_LOGIC;

done : out STD_LOGIC;

dv : out STD_LOGIC;

xn_re : in STD_LOGIC_VECTOR ( 9 downto 0 );

xn_im : in STD_LOGIC_VECTOR ( 9 downto 0 );

xn_index : out STD_LOGIC_VECTOR ( 5 downto 0 );

xk_index : out STD_LOGIC_VECTOR ( 5 downto 0 );

xk_re : out STD_LOGIC_VECTOR ( 16 downto 0 );

xk_im : out STD_LOGIC_VECTOR ( 16 downto 0 )

);

END COMPONENT;

COMPONENT ILA_CORE

PORT (

CONTROL: INOUT STD_LOGIC_VECTOR(35 DOWNTO 0);

CLK: IN STD_LOGIC;










TRIG9: IN STD_LOGIC_VECTOR(0 TO 0));

END COMPONENT;

COMPONENT ICON_CORE

PORT (

CONTROL0: INOUT std_logic_vector(35 DOWNTO 0);

CONTROL1: inout std_logic_vector(35 downto 0));

END COMPONENT;

COMPONENT VIO_core

port (

CONTROL: inout std_logic_vector(35 downto 0);

CLK: in std_logic;

SYNC_IN: in std_logic_vector(7 downto 0);

SYNC_OUT: out std_logic_vector(7 downto 0));

END COMPONENT;

COMPONENT r2p_corproc

GENERIC(DATA_WIDTH : INTEGER := 27;

PIPE_DEPTH : INTEGER := 15;

PRECISION : INTEGER := 27);

PORT( clk : IN STD_LOGIC;

ce : IN STD_LOGIC;

Xin : IN SIGNED(DATA_WIDTH-1 DOWNTO 0);

Yin : IN SIGNED(DATA_WIDTH-1 DOWNTO 0);

Rout : OUT unsigned(DATA_WIDTH-1 DOWNTO 0));

END COMPONENT;

COMPONENT mem

PORT (

clka : IN STD_LOGIC;

wea : IN STD_LOGIC_VECTOR(0 DOWNTO 0);

addra : IN STD_LOGIC_VECTOR(11 DOWNTO 0);

dina : IN STD_LOGIC_VECTOR(16 DOWNTO 0);

douta : OUT STD_LOGIC_VECTOR(16 DOWNTO 0)

);

END COMPONENT;

COMPONENT clock_divider_DCM

PORT

(-- Clock in ports

CLK_IN1 : in std_logic;

-- Clock out ports








RESET : in std_logic;

LOCKED : out std_logic

);

END COMPONENT;

SIGNAL control_word : STD_LOGIC_VECTOR(35 DOWNTO 0);

SIGNAL contro2_word : STD_LOGIC_VECTOR(35 DOWNTO 0);

SIGNAL dv_temp : STD_LOGIC;

SIGNAL xk_re_temp : STD_LOGIC_VECTOR(16 DOWNTO 0);

SIGNAL xk_im_temp : STD_LOGIC_VECTOR(16 DOWNTO 0);

SIGNAL xn_re : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );

SIGNAL xn_im : STD_LOGIC_VECTOR ( 9 DOWNTO 0 );

SIGNAL adrs : STD_LOGIC_VECTOR (11 DOWNTO 0):= (others => '0');

SIGNAL start_sent : STD_LOGIC := '0';

SIGNAL start_trans : STD_LOGIC;

SIGNAL wr_en_vec : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL re_mem_out : STD_LOGIC_VECTOR(16 DOWNTO 0);

SIGNAL im_mem_out : STD_LOGIC_VECTOR(16 DOWNTO 0);

SIGNAL first_trans : STD_LOGIC;

SIGNAL row_offset : STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');

SIGNAL col_offset : STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');

SIGNAL adrs_plus_1: STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');

SIGNAL fft_on_row : STD_LOGIC;

SIGNAL adrs_plus_1_by_64: STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others =>

'0');

SIGNAL ram_adrs : STD_LOGIC_VECTOR( 11 DOWNTO 0):= (others => '0');

SIGNAL first_transaction: STD_LOGIC;

SIGNAL image_fft_done : STD_LOGIC;

SIGNAL width_counter : INTEGER RANGE 0 to 100;

SIGNAL dv_temp_vetor : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL amplitude_out : unsigned ( 16 DOWNTO 0);

SIGNAL counter : INTEGER RANGE 0 to 15;

SIGNAL fwd_inv : STD_LOGIC;

SIGNAL fwd_inv_we : STD_LOGIC;

SIGNAL locked : STD_LOGIC;







SIGNAL clk_fft : STD_LOGIC;

SIGNAL DCM_reset : STD_LOGIC;

SIGNAL fft_reset : STD_LOGIC;

SIGNAL busy_vector : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL rst_vector : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL clk_fft_vector: STD_LOGIC_VECTOR (0 TO 0);

SIGNAL busy_temp : STD_LOGIC;

SIGNAL clk_in : STD_LOGIC;

SIGNAL done_temp : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL start_vector : STD_LOGIC_VECTOR (0 TO 0);

SIGNAL after_rst : STD_LOGIC;

SIGNAL dv_NE : STD_LOGIC;

SIGNAL start_sent_RE : STD_LOGIC;

SIGNAL start_sent_q : STD_LOGIC;

SIGNAL dv_temp_q : STD_LOGIC;

SIGNAL start_sent_q_q : STD_LOGIC;

begin

DCM_reset <= rst;

fft_reset <= rst;

clk_fft <= clk12_5;

clk_in <= clk;

wr_en_vec(0) <= dv_temp;

DCM_inst: clock_divider_DCM

PORT MAP

(-- Clock in ports

CLK_IN1 => clk_in,

-- Clock out ports

CLK_OUT1 => clk100,

CLK_OUT2 => clk50,

CLK_OUT3 => clk25,





RESET => '0',--DCM_reset,

LOCKED => locked

);

-- Stimulus process

PROCESS (fft_reset, clk_fft)

BEGIN

if (fft_reset = '1') then

fwd_inv <= '1';

fwd_inv_we <= '1';

counter <= 0;

elsif (rising_edge (clk_fft)) then


fwd_inv_we <= '1';

end if;


fwd_inv_we <= '0';

end if;

if ( counter <10 ) then

counter <= counter + 1;

end if;

end if;

END PROCESS;

start_vector(0) <= start_trans;

start_trans <= start_sent_RE;

first_transaction <= first_trans and start_sent_q_q;

process (clk_fft, rst)

begin

if ( rst = '1') then


start_sent <= '0';

first_trans <= '1';

row_offset <= (others => '0');

col_offset <= (others => '0');

adrs_plus_1 <= "000000000001";

fft_on_row <= '1';

image_fft_done <= '0';

after_rst <= '1';

width_counter <= 0;

dv_NE <= '0';

start_sent_RE <= '0';

elsif ( rising_edge (clk_fft)) then

if (image_fft_done = '0') then

after_rst <= '0';

start_sent_q <= start_sent;

start_sent_q_q <= start_sent_q;

-- rising edge of start_sent starts the next transform
if (start_sent = '1' and start_sent_q = '0') then

start_sent_RE <= '1';

else

start_sent_RE <= '0';

end if;

dv_temp_q <= dv_temp;

if ( dv_temp = '1') then

first_trans <= '0';

end if;

if (dv_temp = '0' and dv_temp_q = '1') then

dv_NE <= '1';

else

dv_NE <= '0';

end if;

if ( after_rst = '1' or dv_NE = '1' ) then

start_sent <= '1';

elsif (width_counter = 63) then

start_sent <= '0';

end if;

if (start_sent = '1') then

width_counter <= width_counter + 1;

else

width_counter <= 0;

end if;

if (dv_NE = '1') then

if (fft_on_row = '1' ) then

-- go to next row

row_offset <= row_offset + 64;

else

col_offset <= col_offset + 1;

end if;

end if;

if ( fft_on_row = '1') then

if ( adrs <= 63 and start_sent = '1') then

adrs <= adrs + 1;

adrs_plus_1 <= adrs_plus_1 + 1;

ram_adrs <= adrs + row_offset ;

end if;

if (adrs <= 63 and dv_temp = '1') then

adrs <= adrs + 1;

adrs_plus_1 <= adrs_plus_1 + 1;

ram_adrs <= adrs_plus_1 + row_offset ;

end if;

else

if ( adrs <= 63 and start_sent = '1') then

adrs <= adrs + 1;

adrs_plus_1 <= adrs_plus_1 + 1;

adrs_plus_1_by_64 <= adrs_plus_1 (5 downto 0) & "000000";

-- offset + adrs * 64

ram_adrs <= col_offset + ((adrs (5 downto 0) & "000000"));

end if;

if ( adrs <= 63 and dv_temp = '1')then

adrs <= adrs + 1;

adrs_plus_1 <= adrs_plus_1 + 1;

adrs_plus_1_by_64 <= adrs (5 downto 0) & "000000";

-- offset + ( adrs + 1) * 64

ram_adrs <= col_offset + ((adrs_plus_1 (5 downto 0) & "000000"));

end if;

end if;

if (row_offset = 4032 and dv_NE = '1') then

fft_on_row <= '0'; -- start fft on the column

end if;

if (col_offset = 63 and ram_adrs = 4095) then

image_fft_done <= '1';

end if;

if (adrs = 64 ) then

adrs <= (others => '0');

adrs_plus_1 <= "000000000001";

if (fft_on_row = '1' ) then

ram_adrs <= row_offset;

else

ram_adrs <= col_offset;

end if;

end if;

end if;

end if;

end process;

real_part_mem: mem

PORT MAP(

clka => clk_fft,

wea => wr_en_vec,

addra => ram_adrs,

dina => xk_re_temp,

douta => re_mem_out

);

xn_re <= re_mem_out(9 downto 0) when (first_transaction = '1')

else re_mem_out(16 downto 7);

imaginary_part_mem: mem

PORT MAP(

clka => clk_fft,

wea => wr_en_vec,

addra => ram_adrs,

dina => xk_im_temp,

douta => im_mem_out

);

xn_im <= (others => '0') when (first_transaction = '1')

else im_mem_out(16 downto 7);

fft_inst: xfft

PORT MAP (

clk => clk_fft,

start => start_trans,--start,

fwd_inv => fwd_inv,

fwd_inv_we => fwd_inv_we,

rfd => rfd,

busy => busy_temp,

edone => edone,

done => done_temp(0),

dv => dv_temp,

xn_re => xn_re,

xn_im => xn_im,

xn_index => xn_index,

xk_index => xk_index,

xk_re => xk_re_temp,

xk_im => xk_im_temp);

amplitude_inst: r2p_corproc

GENERIC MAP(

DATA_WIDTH => 17,

PIPE_DEPTH => 15,

PRECISION => 17)

PORT MAP(

clk => clk_fft,

ce => '1',

Xin => SIGNED(xk_re_temp),

Yin => SIGNED(xk_im_temp),

Rout => amplitude_out);

ILA_inst: ILA_CORE

port map(

CONTROL => control_word,

CLK => clk100,

TRIG0 => xn_re,

TRIG1 => xn_im,

TRIG2 => dv_temp_vetor,

TRIG3 => xk_re_temp,

TRIG4 => xk_im_temp,

TRIG5 => STD_LOGIC_VECTOR(amplitude_out),

TRIG6 => rst_vector,

TRIG7 => busy_vector,

TRIG8 => done_temp,

TRIG9 => start_vector );

amplitude <= STD_LOGIC_VECTOR(amplitude_out);

ICON_inst: ICON_CORE

port map(

CONTROL0 => control_word,

CONTROL1 => contro2_word);

VIO_inst: VIO_core

port map(

CONTROL => contro2_word,

CLK => clk100,

SYNC_IN => ("0000000"&dv_temp),

SYNC_OUT => open);

dv_temp_vetor(0) <= dv_temp;

busy_vector(0) <= busy_temp;

rst_vector(0) <= rst;

clk_fft_vector(0) <= clk_fft;

busy <= busy_temp;

dv <= dv_temp;

xk_re <= xk_re_temp;

xk_im <= xk_im_temp;

done <= done_temp(0);

end spectogram_arch;
